# Changelog

All notable changes to the Skraak project are documented here.

## [2026-05-05] Restore `--size` flag to `calls classify`

Re-added the `--size` CLI flag to `skraak calls classify` for controlling
spectrogram display size. Removed earlier as a complexity reduction, but
needed back because the default 448px is too small on Retina displays.

The flag overrides `img_dims` from config when specified (224-896px range).
No complexity increase — reuses existing `requireIntRange` and a trivial
`classifyImageSize` helper (complexity 2).

## [2026-05-07] Remove utils → db package dependency

Three changes to eliminate the `utils → db` import edge, restoring clean layering
(`cmd → tools → utils` and `cmd → db`, with no upward dependency from utils):

- **Moved `Placeholders()` from `db/` to `utils/`**: It's a pure string function
  (generates `?, ?, ?` for SQL IN clauses) with no database dependency. `db.Placeholders`
  now delegates to `utils.Placeholders` for backward compatibility.

- `cmd/calls_classify.go` — Added `--size` flag to `classifyArgs`, parse switch,
  and `classifyImageSize()` override helper
- No changes to `tools/calls_classify.go` or `tui/classify.go` (already wired)

- **Moved `GainLevel` type and constants from `db/` to `utils/`**: It's a string enum
  that naturally belongs next to `AudioMothData` in the audiomoth parser. `db.GainLevel`
  is now a type alias (`type GainLevel = utils.GainLevel`) and the constants are
  re-exported for backward compatibility.

- **Refactored `ImportCluster` to accept `*sql.Tx` instead of managing its own
  transaction**: The caller (tools/) now opens the `LoggedTx`, passes the underlying
  `*sql.Tx` via `UnderlyingTx()`, and manages commit/rollback. This removes
  `db.BeginLoggedTx`, `db.LoggedTx`, and `db.LoggedStmt` from utils entirely.
  Added `UnderlyingTx()` method to `db.LoggedTx` to expose the raw `*sql.Tx`.

- **Extended `DB` interface** with `Exec()`: `EnsureClusterPath` needed `Exec` which
  was missing from the original `DB` interface. Both `*sql.DB` and `*sql.Tx` satisfy
  the updated interface.

Result: `utils/` no longer imports `skraak/db`. The dependency now flows correctly:
`db → utils` (for Placeholders and GainLevel re-exports).

## [2026-05-05] tools/ refactoring: WithWriteTx, CallSource interface, hierarchy primitives, strconv

Four refactoring changes in tools/ and db/:

- **Added `db.WithWriteTx` and `db.WithReadDB` helpers** (db/db.go): Extracted the
  open-DB→begin-tx→defer-rollback→commit→close-DB boilerplate that was repeated across
  14+ tool entry points. `WithWriteTx(ctx, dbPath, name, fn)` opens a writeable DB, begins
  a logged transaction, calls fn, and commits on success / rollbacks on error. `WithReadDB`
  does the same for read-only connections. Applied to all create/update functions in
  cluster, dataset, location, and pattern, plus import_unstructured, import_files (validation),
  sql, and bulk_file_import (validation). Eliminates inconsistent rollback handling
  (some used `defer tx.Rollback()`, others `defer func() { if err != nil { tx.Rollback() } }()`)
  and removes ~100 lines of boilerplate.

- **Introduced `CallSource` interface** (tools/calls_from_common.go): Extracted shared
  scaffolding from `calls_from_raven.go` and `calls_from_birda.go` into a `CallSource`
  interface with `Name()`, `FindFiles()`, and `ProcessFile()` methods. Both files now
  implement the interface and delegate to `callsFromSource()`, which handles the
  sequential/parallel dispatch, DirCache management, worker pool, and result aggregation.
  Public API (`CallsFromRaven`, `CallsFromBirda`) unchanged. ~100 lines saved.

- **Extracted hierarchy validation primitives** (db/validation.go): Added `Querier`
  interface, `DatasetExistsAndActive()`, `LocationBelongsToDataset()`, and
  `ClusterBelongsToLocation()` to db/validation.go. Replaced three near-duplicate
  functions: `validateSegmentHierarchy` (import_segments.go, 14→7 cyclomatic),
  `validateHierarchyIDs` (import_files.go, 14→7 cyclomatic), and
  `verifyClusterParentRefs` plus its helpers `verifyDatasetForCluster` and
  `verifyLocationForCluster` (cluster.go, deleted ~50 lines). Also replaced inline
  dataset-exists-and-active checks in bulk_file_import.go and import_unstructured.go.

- **`fmt.Sscanf("%f", ...)→ strconv.ParseFloat` and deleted `extractFilename`**: Replaced
  7 `fmt.Sscanf` calls in calls_from_raven.go and calls_from_birda.go with
  `strconv.ParseFloat` (faster, idiomatic). Deleted `extractFilename()` from
  calls_from_preds.go — it was a one-line wrapper over `filepath.Base`.

Functions at cyclomatic >13 reduced from 9 to 6.

## [2026-05-05] Utils refactoring: dedup types, fix error wrapping, consolidate WAV parsers, typed ImportStage

Four focused refactoring changes in utils/ and tools/:

- **Merged `fileData` into `FileProcessingResult`**: The two identical structs
  (8 fields, same order, same types) in `cluster_import.go` and `file_import.go`
  have been unified. `cluster_import.go` now uses the exported
  `FileProcessingResult` everywhere. Removes a type, reduces mental overhead.

- **Fixed `%v``%w` in `insertSingleFile`**: Five `fmt.Errorf` calls in
  `cluster_import.go::insertSingleFile` used `%v` instead of `%w`, breaking
  `errors.Is`/`errors.As` chains. Now consistent with the rest of the file.

- **Consolidated `ParseWAVHeader*` boilerplate**: Extracted `readAndParseHeader`
  helper that handles open→stat→read→parse→set-modtime. `ParseWAVHeader` and
  `ParseWAVHeaderMinimal` are now thin wrappers (2-4 lines each).
  `ParseWAVHeaderWithHash` keeps its own open+hash logic (needs the file handle
  for streaming). Removed now-unused `parseWAVMinimal`.

- **Typed `ImportStage` constants**: Defined `ImportStage` type with constants
  (`StageScan`, `StageHash`, `StageParse`, `StageProcess`, `StageValidation`,
  `StageInsert`, `StageImport`) in `file_import.go`. Both `FileImportError`
  and `ImportSegmentError` now use `ImportStage` instead of `string`.
  All usages in `cluster_import.go`, `import_segments.go`, and
  `import_unstructured.go` updated. Eliminates typos and clarifies the stage
  set in one place.

## [2026-05-05] Reduce cyclomatic complexity across codebase, add cyclop linter gate

Added `cyclop` linter to `.golangci.yml` with `max-complexity: 15` (CI gate)
and `package-average: 8.0`. Test files, `main()`, and `RunCalls()` dispatch
switches are excluded (cyclop overcounts trivial switch dispatches).

Refactored 11 functions from 15-19 complexity down to ≤10:

- **parseClipArgs** (19→~3): Replaced manual switch/case arg parser with
  `flag.FlagSet`, consistent with all other `cmd/` functions.
- **callsFromBirdaParallel** (17→~8): Extracted `aggregateResults()` and
  `sortCallsByFileAndTime()` into `tools/parallel_aggregate.go`.
- **callsFromRavenParallel** (17→~8): Same pattern, shared via `parallelResult`
  interface on both `birdaResult` and `ravenResult`.
- **BulkFileImport** (17→~8): Extracted `bulkCreateClusters()`,
  `bulkImportAllFiles()`, `bulkValidateLocations()`, and `failOutput()` helper.
- **saveClip** (16→~7): Extracted `buildClipPaths()`, `generateClipSpectrogram()`,
  and `writeClipPNG()`.
- **RunLocationUpdate** (15→~8): Extracted `parseLocationUpdateInput()`.
- **DetectAnomalies** (15→~8): Extracted `validateAnomalyInput()`.
- **LoadDataFiles** (15→~6): Extracted `parseAndSortDataFiles()`,
  `filterDataFiles()`, and `buildClassifyState()`.
- **updateCluster** (17→~10): Extracted `validateClusterUpdateInput()` and
  `validateClusterCyclicPattern()`.
- **IsNight** (18→~10): Extracted `populateSunTimes()`.
- **createPattern/updatePattern** (16-18→~10): Extracted
  `validateCreatePatternInput()`, `validateUpdatePatternInput()`, and
  `findExistingPattern()`.
- **SegmentMatchesFilters** (16→~6): Extracted `labelMatchesFilters()`.

Also removed the now-unnecessary `clipArgParser` type and its three methods.

## [2026-05-05] Replace --lat/--lng/--timezone with --location across CLI; remove --wav-only

**Breaking CLI change.** The `--lat`, `--lng`, `--timezone` flags are replaced
by a single `--location "lat,lng[,timezone]"` flag in three commands:

- `skraak calls clip`
- `skraak calls classify`
- `skraak calls push-certainty`

Timezone is optional (defaults to UTC; not needed for AudioMoth).
This makes invalid states unrepresentable (you can't pass lat without lng).

Before:
```bash
skraak calls clip --folder ./data --output ./clips --prefix kiwi \
  --species Kiwi --night --lat -40.85 --lng 172.81 --timezone Pacific/Auckland
```
After:
```bash
skraak calls clip --folder ./data --output ./clips --prefix kiwi \
  --species Kiwi --night --location "-40.85,172.81,Pacific/Auckland"
```

Additionally, `--wav-only` was removed from `skraak calls clip`. This flag
skipped spectrogram PNG generation; removed as unnecessary since the tool's
primary purpose is generating training data (PNG+WAV pairs).

A shared `utils.ParseLocation` helper was added to avoid duplicating the
`lat,lng[,tz]` parse logic across commands.

**Files changed:**
- `utils/location.go` — new `ParseLocation` helper
- `cmd/calls_clip.go``--location` flag, removed `--wav-only`
- `cmd/calls_classify.go``--location` flag
- `cmd/calls_push_certainty.go``--location` flag
- `tools/calls_clip.go``CallsClipInput` struct, removed `wavOnly`/`parseLocation`
- `tools/calls_clip_bench_test.go` — removed `BenchmarkFullPipelineWavOnly`

## [2026-05-04] Reduce cyclomatic complexity of 8 more functions over gocyclo 25

Refactored 8 functions that exceeded cyclomatic complexity of 25 by extracting
helper functions with clear responsibilities:

1. **`processBirdaFileCached` (30→~10)**: Extracted `parseBirdaCSVHeader`,
   `readBirdaDetections`, `resolveBirdaWAVPath` for CSV parsing and WAV path
   resolution.

2. **`RunCallsModify` (30→~5)**: Extracted `modifyArgs` struct,
   `parseModifyArgs`, `requireFlagValue`, `validateModifyArgs` for CLI argument
   parsing and validation.

3. **`CallsModify` (29→~10)**: Extracted `validateModifyInput`, `resolveSpecies`,
   `hasModifyChanges`, `applyLabelChanges`, `findLabelByFilter` for input
   validation, species resolution, and label update logic.

4. **`CallsClipLabels` (29→~15)**: Extracted `parsedClipFile` type,
   `validateClipLabelsInput`, `parseClipLabelsDataFiles`,
   `dedupClipLabelsRows` for parameter validation, file parsing, and
   deduplication.

5. **`LoadDataFiles` (28→~10)**: Extracted `findDataFilePaths`,
   `filterDataFileSegments` for file discovery and segment/day-night filtering.

6. **`ReadWAVSegmentSamples` (27→~5)**: Extracted `wavChunkInfo` type,
   `parseWAVChunks`, `calcWAVReadRange` for WAV chunk parsing and read range
   calculation. Eliminated `goto` statement.

7. **`importSegmentsIntoDB` (27→~10)**: Extracted `importLabelResult` type,
   `importSingleLabel`, `importCalltype`, `importSegment` for per-label and
   per-segment DB insertion.

8. **`updateCluster` (27→~17)**: Extracted `validateClusterActive`,
   `validateCyclicPattern`, `buildClusterUpdateQuery` for cluster validation
   and dynamic query building.

## [2026-05-04] Reduce cyclomatic complexity of 8 functions over gocyclo 30

Refactored 8 functions that exceeded cyclomatic complexity of 30 by extracting
helper functions with clear responsibilities:

1. **`CallsPropagate` (39→6)**: Extracted `validatePropagateInput`, `hasBothFilters`,
   `collectPropagateSources`, `propagateTargets`, `findUpdatableTargetLabel`,
   `findOverlappingSources`, `resolveCallType`, `buildConflictRecord`, `applyPropagation`.

2. **`CallsSummarise` (38→5)**: Extracted `summariseFiles`, `trackMeta`, `filterLabels`,
   `buildLabelSummaries`, `updateStatsFromLabels`, `updateFilterStats`,
   `updateReviewStatus`, `finaliseSummary`.

3. **`runCallsPushCertainty` (35→7)**: Extracted `parsePushCertaintyArgs`,
   `requireValue`, `requireFloat`, `validatePushCertaintyFlags`.

4. **`RunCallsClip` (35→2)**: Extracted `parseClipArgs`, `validateClipFlags`,
   `nextUniqueValue` on clipArgParser.

5. **`createCluster` (34→19)**: Extracted `validateCreateClusterFields`,
   `validateCreateClusterIDs`, `verifyDatasetForCluster`, `verifyLocationForCluster`,
   `verifyPatternExists`, `findExistingClusterInLocation`, `fetchClusterByID`.

6. **`ValidateMappingAgainstDB` (32→5)**: Extracted `collectMappedLabels`,
   `validateMappedSpecies`, `validateMappedCalltypes`.

7. **`CallsFromPreds` (32→8)**: Extracted `readPredCSV`, `findPredCSVColumns`,
   `readPredCSVRows`, `addDetectionsFromRow`, `clusterDetections`.

8. **`processRavenFileCached` (31→10)**: Extracted `parseRavenHeader`,
   `parseRavenSelections`, `parseRavenRow`, `deriveWAVBaseName`, `resolveWAVPath`.

## [2026-05-04] Reduce cyclomatic complexity of marshalParam, RunCallsClip, TestParseWAVHeader

Three functions exceeded gocyclo threshold of 40:

1. **`marshalParam` (db/tx_logger.go) — 50 → ~8**: Replaced 25 repetitive pointer-type
   cases (`*int`, `*string`, `*float64`, etc.) with the existing reflection-based pointer
   handling that was already in the `default` branch. All value types kept as a single
   multi-type switch case.

2. **`RunCallsClip` (cmd/calls_clip.go) — 50 → 35**: Extracted `clipArgParser` struct
   with `nextValue()`, `nextInt()`, `nextFloat()` helpers that encapsulate the
   "check bounds + advance + error" pattern repeated 13 times in the arg loop.

3. **`TestParseWAVHeader` (utils/wav_metadata_test.go) — 44 → ~5 each**: Split the
   monolithic test with 14 `t.Run` sub-tests into 14 top-level `TestParseWAVHeader_*`
   functions. Each now has minimal complexity.

## [2026-05-04] Reduce cyclomatic complexity of handleKey (51 → 6)

Refactored `Model.handleKey` in `tui/classify.go` from a single 51-complexity
monolith into 6 focused methods plus 2 utility helpers:

- `handleKey` (6) — dispatcher routing to mode-specific handlers
- `handleSecondaryWait` (6) — awaiting-secondary-calltype logic
- `handleSpecialKey` (9) — Enter/Esc/Space/Ctrl+S
- `handleSwitchKey` (14) — switch-based key dispatch
- `handleBindingKey` (9) — single-char species/calltype bindings
- `stopPlayer` — extracted repeated nil-check + Stop pattern
- `advanceOrQuit` — extracted repeated next-segment-or-quit pattern

No behavioral changes.

## [2026-05-02] Add --species Species+_ to filter for no calltype

Added support for `_` as a sentinel calltype in `--species` to match only
labels with an empty calltype. For example, `--species Kiwi+_` returns only
Kiwi calls that have no calltype assigned, while `--species Kiwi` continues
to return all Kiwi calls regardless of calltype.

**Changes:**
- `utils/data_file.go`: Added `CallTypeNone` sentinel constant; updated
  `SegmentMatchesFilters` to check for empty calltype when `CallTypeNone`
  is used; updated `ParseSpeciesCallType` doc comment
- `utils/data_file_test.go`: Added test case for `Kiwi+_` parsing
- `tools/calls_classify_filter_test.go`: Added `TestCallTypeNoneFiltering`
- `cmd/calls_classify.go`: Updated help text to document `+_` syntax

## [2026-05-01] Add integration tests for cluster_import and mapping validation

Added shell script integration tests covering previously untested code paths
in `utils/cluster_import.go` and `utils/mapping.go`.

**New files:**
- `shell_scripts/test_cluster_import.sh` — 11 tests exercising ImportCluster,
  GetLocationData, EnsureClusterPath, batchProcessFiles, insertClusterFiles
- `shell_scripts/test_mapping_validation.sh` — 6 tests exercising
  ValidateMappingAgainstDB via `import segments` CLI

**Infrastructure:**
- Added `generate_wav()` helper to `test_lib.sh` (uses python3 to create
  valid minimal WAV files for test fixtures)

**Bug fix:**
- `ValidateMappingAgainstDB` now skips `__NEGATIVE__` and `__IGNORE__`
  sentinels when building the set of species to validate against the DB.
  Previously these sentinels were incorrectly queried in the species table,
  causing a false validation error.

**Skipped:**
- `utils/audio_player.go` — thin `oto` wrapper requiring audio hardware;
  no testable logic independent of the external package. Documented in
  CLAUDE.md as intentional skip.

## [2026-04-28] Remove MCP server support

**Breaking change:** Removed the MCP (Model Context Protocol) server entirely.
All functionality remains available via CLI commands.

- Deleted `cmd/mcp.go` (MCP server + adapters)
- Deleted `cmd/mcp_surface_test.go` (MCP integration tests)
- Deleted `resources/` package (only served MCP schema resource)
- Removed `case "mcp"` from `main.go` dispatch
- Removed `jsonschema` struct tags from all `tools/*.go` (126 tags across 24 files)
- Removed `github.com/modelcontextprotocol/go-sdk` dependency and transitive deps
- Fixed stale "Map to MCP output format" comment in `tools/import_files.go`

Rationale: CLI provides full access to all tools with JSON output for Unix
composability. The MCP server was a parallel access path with no unique
capabilities.

## [2026-04-27] Performance: DirCache + worker pool for `from-raven` and `from-birda`

`calls from-raven` and `calls from-birda` were extremely slow on large
folders (57k files ≈ 2 hours). Root cause: `findWAVFile()` performed
`os.ReadDir()` on every file — O(N²) directory scans. Fix:

1. **DirCache**: Scan directory once, build `map[string]string` for
   O(1) WAV lookup. Eliminates the dominant bottleneck (57k × 57k = 3.25B
   comparisons → 1 scan + 57k map lookups).

2. **Worker pool**: 8 parallel goroutines for I/O-bound processing
   (WAV header reads, .data writes). Same pattern as `from-preds`.

3. Both commands auto-select sequential (< 10 files) vs parallel path.

Expected improvement: 2 hours → 2–5 minutes on 57k files.

`DirCache` is also available for `from-preds` but not yet wired in
(that command already uses a worker pool and typically processes fewer
unique directories).

## [2026-04-27] Add `calls clip-labels` subcommand

New `skraak calls clip-labels` exports a CSV in OpenSoundScape's
`clip_labels` format directly from `.data` files — same row layout as
`BoxedAnnotations.clip_labels()`, byte-identical CSVs — but in Go, fast,
and without round-tripping through Raven `selections.txt`.

For every `.data` file in `--folder`, generate clip windows over
`[0, Duration]` using a Go port of OPSO's `generate_clip_times_df`
(`utils/clip_times.go`, supports `final_clip ∈ {full, remainder, extend,
none}`). Every window is emitted as a row. For each output class column,
the value is `True` when at least one certainty=100 annotation of that
class overlaps the window by ≥ `--min-label-overlap` seconds, else
`False`. Gaps emit all-`False` rows. Booleans capitalized to match
pandas' default; times rendered with at least one decimal place.

Only certainty=100 labels participate (cert<100 is ignored).
`mapping.json` (from the `/data-mapping` skill) translates `.data`
species names to canonical class names. Two sentinels with distinct
semantics:
- `__NEGATIVE__` — clip emitted, all class columns False; overrides any
  positives in the same clip. Requires certainty=100. For confirmed-negative
  training examples (rain, wind, silence, helicopter, etc.).
- `__IGNORE__` — the **entire file** is dropped from output. Any segment
  whose species maps to `__IGNORE__` triggers the drop, regardless of
  certainty or filter. For files whose annotation set is incomplete (e.g.
  `Don't Know` regions): emitting any clip from them as confirmed-False
  would poison the training set with possibly-wrong negatives.

`--filter F` restricts which ML filter's labels count
(`opensoundscape-multi-1.0`, `BirdNET`, `Raven`, …); the mapping coverage
check also restricts to that filter.

Fail-fast: any `.data` parse error, missing `Duration`, missing mapping
entry, or duplicate `(file, start_time, end_time)` row aborts the run
before the CSV is written. Existing output files are appended; column-set
mismatch hard-errors.

Adds `MappingNegative`/`MappingIgnore` sentinels, `Classify`,
`ValidateCoversSpecies`, and `Classes` to `utils/mapping.go`. Adds
`utils/clip_times.go` with the OPSO clip-times port and unit tests
covering all four `final_clip` modes. Verified against an OPSO reference
output on a 100-file Raven test folder: byte-identical CSVs.

## [2026-04-26] Drop `schema://table/{name}` resource

Keeps `schema://full` and removes the per-table schema resource template,
along with its line-based extractor (paren counting, view-vs-table branching,
manual index/ALTER append) and the table-name allowlist. The full schema is
241 lines — small enough that splitting it adds parsing surface for no real
benefit, and clients can also introspect via DuckDB
(`information_schema.columns`, `DESCRIBE`, etc.) through `execute_sql`.

Updates `shell_scripts/test_resources.sh` to drop per-table tests and the
resource-template list call.

## [2026-04-26] Remove `prompts` package

Deletes `prompts/examples.go` and the six MCP prompts it registered
(`query_active_datasets`, `explore_database_schema`,
`explore_location_hierarchy`, `query_location_data`, `analyze_cluster_files`,
`system_status_check`). Drops the `skraak/prompts` import and `AddPrompt` calls
from `cmd/mcp.go`.

Motivation: the prompts were never invoked in practice. Models write SQL
fluently from the `schema://*` resources alone, so the canned templates added
maintenance surface without earning their keep. The `system_status_check`
prompt was self-referential (its body listed the prompts being removed) and
duplicated coverage already in `cmd/mcp_surface_test.go`.

Also drops `shell_scripts/test_prompts.sh` and the prompt references in
`shell_scripts/README.md` and `shell_scripts/TESTING.md`.

## [2026-04-22] `calls summarise`: Add --filter flag to restrict output to a single filter

Adds `--filter <name>` to `skraak calls summarise`. When specified, only labels
matching that filter are included in stats, segments, and review counts.
Segments with no matching labels are omitted entirely. Empty filter (default)
behaves as before (all filters included).

Motivation: a folder of .data files may contain multiple ML model filters;
summarising all of them makes it hard to inspect one. `--filter` scopes the
output the same way `classify --filter` scopes the TUI.

## [2026-04-22] `calls classify`: Shift+primary secondary keybindings for calltype editing

Adds a per-species secondary-binding layer to the classify TUI. Primary flow is
unchanged (keypress → label → save → advance). When a primary key has
`secondary_bindings` configured, pressing **Shift+primary-key** labels the
species with an empty calltype, skips the auto-advance, and enters a one-shot
wait state; the next keypress is looked up in the secondary map and sets the
calltype before advancing. Esc exits the wait state without advancing. Any
non-matching key falls through to normal handling.

Motivation: species like common chaffinch have multiple calltypes (alarm,
contact, song) that couldn't be assigned without burning extra keybindings on
every species. Secondary bindings are per-species (not global) to avoid
accidental mislabels, and deliberately unlisted in the help bar — users know
their own config.

Example config:
```json
"classify": {
  "bindings": { "c": "comcha" },
  "secondary_bindings": {
    "c": { "a": "alarm", "s": "song", "n": "contact" }
  }
}
```

Shift+primary on a key with no `secondary_bindings` entry falls back to normal
primary behavior, so existing configs are unaffected.

**Files changed:**
- `utils/config.go` — new `SecondaryBindings` field on `ClassifyFileConfig`.
- `cmd/calls_classify.go` — validation (outer key must exist in bindings,
  inner keys single-char non-reserved, values non-empty) and passthrough to
  `ClassifyConfig`.
- `tools/calls_classify.go``SecondaryBindings` field on `ClassifyConfig`,
  new `ApplyCallTypeOnly` and `HasSecondary` methods.
- `tui/classify.go``awaitingSecondaryFor` model field, wait-mode intercept
  at top of `handleKey`, Shift+letter detection in the default branch, ``
  indicator on the segment info line while waiting.

## [2026-04-18] `--day` redefined as civil dawn → solar sunset (includes dawn chorus)

`--day` previously filtered to solar day (sunrise → sunset), excluding the dawn chorus.
Changed to civil dawn → solar sunset so diurnal species active at dawn are included.

`--night` (solar night) is unchanged. The dawn-chorus window (civil dawn → solar sunrise)
is now covered by **both** flags — a recording at that time is `solar_night=true` and
`diurnal_active=true`. Correct: kiwi and diurnal bird-song both overlap at dawn.

`IsNightOutput` gains a new `diurnal_active` field (bool, present in JSON output of
`skraak isnight`) computed as `midpoint >= civil_dawn && midpoint <= solar_sunset`.

**Files changed:** `tools/isnight.go`, `tools/calls_clip.go`, `tools/calls_classify.go`

## [2026-04-18] `calls classify --night` / `--day`: filter TUI to solar-night or solar-day recordings

Adds `--night`, `--day`, `--lat`, `--lng`, and `--timezone` flags to `skraak calls classify`.
Filtering happens at load time (before the TUI launches) inside `LoadDataFiles`, after the
existing segment filter — so `IsNight` is only called for files that have matching segments.
Skipped file count is reported to stderr before the TUI starts.

Same `--timezone` caveat as `calls clip`: required for non-AudioMoth recorders (e.g. DOC AR4)
that embed local time in filenames. AudioMoth files don't need it.

```bash
skraak calls classify --folder F09/2026-04-06/ --species "Don't Know" \
  --night --lat -45.50603 --lng 167.47371
```

**Files changed:**
- `tools/calls_classify.go``ClassifyConfig` (Night/Day/Lat/Lng/Timezone fields),
  `ClassifyState` (TimeFilteredCount), `LoadDataFiles` (day/night filter block).
- `cmd/calls_classify.go` — flag parsing, mutual-exclusivity + lat/lng validation,
  config construction, skipped-count summary line, updated usage text.

## [2026-04-18] `calls clip --night`: filter to solar-night recordings only

Adds `--night`, `--lat`, `--lng`, and `--timezone` flags to `skraak calls clip`.
When `--night` is set, each recording is checked against solar sunrise/sunset at
the given coordinates before its audio is loaded — daytime files are skipped
entirely, saving the cost of reading WAV audio for files that would produce no
useful clips.

`--timezone` is not needed for AudioMoth recorders (timestamp comes from the WAV
comment in UTC). It is required for recorders that embed **local time** in the
filename (e.g. DOC AR4) — without it the filename is parsed as UTC and
`solar_night` will be wrong. Pass `--timezone Pacific/Auckland` or the
appropriate IANA zone.

The JSON output gains a `night_skipped` field (omitted when 0) counting how many
files were filtered out. Skipped filenames are logged to stderr.

```bash
skraak calls clip --folder ./data --output ./clips --prefix kiwi \
  --species Kiwi --night --lat -40.85 --lng 172.81

# Non-AudioMoth (DOC AR4, filename in local time):
skraak calls clip --folder ./data --output ./clips --prefix kiwi \
  --species Kiwi --night --lat -40.85 --lng 172.81 --timezone Pacific/Auckland
```

**Files changed:**
- `tools/calls_clip.go``CallsClipInput` (Night/Lat/Lng/Timezone fields),
  `CallsClipOutput` (NightSkipped field), `processFile` night-filter block.
- `cmd/calls_clip.go` — flag parsing, `--night` requires lat/lng validation,
  updated usage/help text.

## [2026-04-18] `calls classify` reviewer, bindings, and display flags moved to config file

**Breaking CLI change.** `skraak calls classify` no longer accepts `--reviewer`,
`--bind`, `--color`, `--sixel`, `--iterm`, or `--img-dims`. These values are now
loaded from `~/.skraak/config.json`.

Rationale: users (e.g. David) were typing the same ~25 `--bind` flags on every
invocation. Moving stable, personal defaults into a config file eliminates that
repetition. Per-invocation flags (`--folder`, `--file`, `--filter`, `--species`,
`--certainty`, `--goto`) stay on the CLI.

Path works cross-platform via `os.UserHomeDir()` — resolves to
`~/.skraak/config.json` on Linux/macOS and `C:\Users\<name>\.skraak\config.json`
on Windows.

Config shape:
```json
{
  "classify": {
    "reviewer": "David",
    "color": true,
    "sixel": false,
    "iterm": false,
    "img_dims": 0,
    "bindings": {
      "k": "Kiwi",
      "1": "Kiwi+Duet",
      "x": "Noise",
      "z": "Don't Know"
    }
  }
}
```

`bindings` values use the same `Species` or `Species+CallType` grammar the old
`--bind key=value` flag accepted — parsing is shared (`cmd/calls_classify.go:parseBind`).

Config-load rejects bindings that collide with keys the TUI reserves for its own
commands (`,` previous segment, `.` next segment, `0` confirm at certainty 100,
space opens the comment dialog). Previously these were silently shadowed by the
TUI hotkey and the user's binding did nothing.

**Files added:**
- `utils/config.go``Config`, `ClassifyFileConfig`, `LoadConfig`, `ConfigPath`.
  Named `LoadConfig` (not `LoadClassifyConfig`) so future subcommands can add
  their own sections to the same file.

**Files changed:**
- `cmd/calls_classify.go` — Removed six flag cases, added config load after arg
  parsing (so `--help` still works without a config), added `--help`/`-h` case,
  added single-character validation on binding keys.

## [2026-04-17] New `skraak isnight` CLI command

Adds a standalone CLI command to check if a WAV file was recorded at night,
without needing a database connection.

```
skraak isnight --file recording.wav --lat -36.85 --lng 174.76
```

Determines the recording timestamp from WAV metadata (AudioMoth comment →
filename pattern → file modification time), then calculates sunrise/sunset
at the given GPS coordinates using the recording midpoint. Returns JSON with
` solar_night`, `civil_night`, `moon_phase`, and sun event times.

Optional `--timezone` flag (default UTC) is used for filename-based timestamps;
AudioMoth comments embed their own timezone. Use `--brief` for batch/agent
use to return only `file_path` and `solar_night` (compact JSON, saves tokens).

**Files added:**
- `tools/isnight.go` — IsNight tool (MCP-free core logic)
- `cmd/isnight.go` — CLI command (flags → tool → JSON output)

**Files changed:**
- `main.go` — Register `isnight` command and usage text

## [2026-04-17] Numpad-friendly keybinds in classify TUI

Two keyboard tweaks to make the TUI easier to drive from the numeric keypad
while labeling kiwi calls:

- **Numpad Enter plays audio.** The Enter-key handler in `tui/classify.go` now
  matches both `tea.KeyEnter` and `tea.KeyKpEnter`, so the keypad's Enter key
  plays the current segment like the main Enter (and still respects Shift for
  half-speed playback). Previously, terminals that disambiguate keypad keys
  (e.g. via Kitty keyboard protocol) delivered numpad Enter as `KeyKpEnter`,
  which fell through the handler and did nothing.
- **Arrow keys navigate segments.** Left arrow now does prev-segment (same as
  `,`) and right arrow does next-segment (same as `.`), so the user can
  navigate without moving their hand off the numpad.

**Files changed:**
- `tui/classify.go` — Enter branch matches `KeyKpEnter`; `,`/`.` switch cases
  also match `"left"`/`"right"`

## [2026-04-05] Simplify calls classify TUI

**Static segment list:** Filtered segments are now computed once at startup and cached.
Reclassifying a segment no longer removes it from the navigation list mid-session.
This fixes instability/crashes when working fast with `--species` or other filters.

**Replace goto dialog with `--goto` flag:**
- Removed ctrl+g goto dialog from TUI (and all supporting code)
- Added `--goto <filename>` CLI flag that opens on the first matching segment in the named file
- Removed `GotoFile()` and `TotalFiles()` methods from `ClassifyState`

**Internal:** Added `NewClassifyState()` constructor for tests. All `getFilteredSegments()` calls
replaced with pre-computed `filteredSegs` cache parallel to `DataFiles`.

**Files changed:**
- `tools/calls_classify.go` — cached segments, `--goto` support, removed dynamic filtering
- `tui/classify.go` — removed goto dialog (model fields, handler, renderer, keybind)
- `cmd/calls_classify.go` — added `--goto` flag parsing
- `tools/calls_classify_*_test.go` — updated to use `NewClassifyState()`

## [2026-04-04] New `prepend` command

Rename WAV files, their .data files, and log.txt by prepending a location prefix.

**Usage:**
```bash
skraak prepend --folder <path> --prefix <string> [--recursive] [--dry-run]
```

**Target files:**
- `*.wav`, `*.WAV` — Only if starting with datestring `YYYYMMDD_HHMMSS`
- `*.wav.data`, `*.WAV.data` — Only if starting with datestring `YYYYMMDD_HHMMSS`
- `log.txt` — Always renamed (exact name match)

**Flags:**
- `--folder <path>` — Target folder (required)
- `--prefix <string>` — String to prepend (required)
- `--recursive` — Include 1 level of subfolders
- `--dry-run` — Show what would be renamed without doing it

**Behavior:**
- Files already starting with `<prefix>_` are skipped with reason "already prefixed"
- WAV files without datestring prefix are skipped with reason "no datestring prefix"
- Non-target files are silently ignored
- Idempotent: running twice is safe

**Examples:**
```bash
# Rename files in a folder
skraak prepend --folder ./recordings --prefix LOC001

# Include subfolders (1 level deep)
skraak prepend --folder ./data --prefix SITE_A --recursive

# Preview changes
skraak prepend --folder ./test --prefix TEST --dry-run
```

**Changes:**
- `tools/prepend.go` — Core logic (datestring detection, file renaming)
- `tools/prepend_test.go` — Unit tests
- `cmd/prepend.go` — CLI command with flag parsing
- `main.go` — Added to command dispatcher

## [2026-04-03] Added `--bookmark` and `--comment` flags to `calls modify`

Allow agents and users to bookmark segments and add comments for information preservation in .data files.

**New flags:**
- `--bookmark` — Mark segment as bookmarked for navigation (boolean flag, sets `bookmark=true`)
- `--comment <text>` — Add user comment (max 140 chars, ASCII only)

**Usage:**
```bash
# Bookmark a segment for later review
skraak calls modify --file recording.data --reviewer GLM-5 \
  --filter mymodel --segment 12-15 --certainty 100 --bookmark

# Add a comment to a segment
skraak calls modify --file recording.data --reviewer GLM-5 \
  --filter mymodel --segment 12-15 --certainty 100 --comment "Good example of duet"
```

**Behavior:**
- `--bookmark` sets `bookmark=true` on the label
- `--comment` stores text in the label's comment field
- Comment validation: max 140 characters, ASCII only
- If all specified values match current values, no modification made (error)

**Changes:**
- `tools/calls_modify.go` — Added `Bookmark` and `Comment` fields to input/output structs, validation logic
- `cmd/calls_modify.go` — Added `--bookmark` and `--comment` flag parsing

## [2026-04-02] New `calls modify` command

Modify a label in a .data file from the command line.

**Usage:**
```bash
skraak calls modify --file recording.data --reviewer GLM-5 \
  --filter mymodel --segment 12-15 --certainty 100 --species Kiwi+Male
```

**Required flags:**
- `--file <path>` — Path to .data file
- `--reviewer <name>` — Reviewer name (always set on file metadata)
- `--filter <name>` — Filter name to match labels
- `--segment <start>-<end>` — Segment time range (integer seconds, e.g., `12-15`)
- `--certainty <int>` — Certainty value (0-100)

**Optional flags:**
- `--species <name>` — Species to set (e.g., `Kiwi`, `Kiwi+Male`, `Noise`)

**Segment matching:**
- Segments matched by `floor(start_time)` and `ceil(end_time)`
- A segment from 12.3s to 14.5s matches `--segment 12-15`

**Behavior:**
- Always updates reviewer on file metadata
- If `--species` provided: sets species and calltype (or clears calltype if not specified)
- If species+calltype AND certainty match current values, no modification made (error)
- Error if no matching segment or label found (no-op on error)

**Use cases:**
- Correct classification: `--certainty 100` only (confirms existing species)
- Incorrect classification: `--species NewSpecies --certainty 100` (changes both)

**Changes:**
- `tools/calls_modify.go` — New file, core logic
- `cmd/calls_modify.go` — New file, CLI parsing
- `cmd/calls.go` — Added `modify` subcommand

## [2026-04-02] Clip feature in `calls classify` TUI

Added `ctrl+s` keybinding to save a clip of the current segment directly from
the classification TUI.

**Keybinding:** `ctrl+s` → type prefix → `enter` to save, `esc` to cancel

**Output files:**
- `<prefix>_<basename>_<start>_<end>.png` — 224x224 color spectrogram (L4 colormap)
- `<prefix>_<basename>_<start>_<end>.wav` — audio clip (16kHz if downsampled)

Files are saved to the current working directory where `skraak` was launched.
Error if files already exist (no overwrite).

**Changes:**
- `tui/classify.go` — Added `clipMode` state, `handleClipKey()`, `renderClipDialog()`,
  and `saveClip()` function; added `ctrl+s` keybinding; updated help line

## [2026-04-02] New `calls clip` command

Generate audio clips and spectrogram images from .data file segments.
Useful for extracting training data or creating datasets for ML.

**Usage:**
```bash
skraak calls clip --file recording.data --output ./clips --prefix train
skraak calls clip --folder ./data --output ./clips --prefix kiwi \
  --filter opensoundscape-kiwi-1.2 --species Kiwi --size 448 --color
```

**Output files:**
- `<prefix>_<basename>_<start>_<end>.png` — spectrogram image (224-896px)
- `<prefix>_<basename>_<start>_<end>.wav` — audio clip (16kHz if downsampled)

 where `basename` is the WAV filename without `.wav` extension.

**Features:**
- Single file (`--file`) or batch folder (`--folder`) processing
- Filter by ML model (`--filter`) and/or species (`--species`)
- Species can include calltype: `Kiwi+Duet`
- `--size <int>` — spectrogram image size (224-896px, default 224)
- `--color` — apply L4 colormap (default: grayscale)
- Error if output files already exist (no overwrite)
- WAV files downsampled to 16kHz if input > 16kHz

**New utilities:**
- `utils.WriteWAVFile(path, samples, sampleRate)` — write mono 16-bit PCM WAV
- `utils.WritePNG(img, writer)` — write image as PNG

**Changes:**
- `utils/wav_writer.go` — New file, WAV writer implementation
- `utils/terminal_image.go` — Added `WritePNG()` function
- `tools/calls_clip.go` — New file, core clip logic
- `cmd/calls_clip.go` — New file, CLI parsing
- `cmd/calls.go` — Added `clip` subcommand

## [2026-04-02] Shared spectrogram generation for show-images and classify

Refactored spectrogram image generation into a shared utility function, reducing
duplication between `calls show-images` and `calls classify` TUI.

**New utility:**
- `utils.GenerateSegmentSpectrogram(dataFilePath, startTime, endTime, color, imgSize)` - 
  generates a spectrogram image from a segment, handling WAV loading, downsampling,
  and image creation in one call.

**Changes:**
- `utils/spectrogram.go` — Added `GenerateSegmentSpectrogram()` function
- `tools/calls_show_images.go` — Now uses `utils.ParseDataFile()` (includes labels) and
  `GenerateSegmentSpectrogram()`; removed local `Segment` struct and `parseDataFile()`;
  segment info now shows labels when present
- `tui/classify.go``generateSpectrogramImage()` now delegates to shared function

**Future:** show-images now has access to segment labels, enabling future filtering
by filter/ml model and species+calltype.

## [2026-03-29] Goto file feature for `calls classify` TUI

Added `ctrl+g` keybinding to jump directly to any file by number. The dialog accepts
a file number (1-based) and jumps to the first segment of that file.

**Keybinding:** `ctrl+g` → type number → `enter` to jump, `esc` to cancel

**Changes:**
- `tools/calls_classify.go` — Added `TotalFiles()` and `GotoFile()` methods to `ClassifyState`
- `tui/classify.go` — Added `gotoMode` and `gotoInput` state; `ctrl+g` keybinding;
  `handleGotoKey()` for digit/backspace/enter/esc handling; `renderGotoDialog()` for UI display

## [2026-03-29] Clarify segment counts in TUI

Updated progress display to explicitly label the segment count.

**Changes:**
- `tui/classify.go` — Changed title line from `file [progress] 1/40826` to `file [progress] 1/40826 Segments`
- `cmd/calls_classify.go` — Updated startup message to clarify filtered counts
- `tools/calls_classify.go` — Added tests to verify filtering behavior
- Confirmed `TotalSegments()` and `CurrentSegmentNumber()` correctly use `getFilteredSegments()`
- Files with no matching segments are pruned during load (existing behavior)

## [2026-03-29] `--species` flag for `calls classify`

Added `--species` flag to scope classification to a single species (and optionally calltype).
Composable with `--filter` for focused review of specific detections within an ML model's output.

**Examples:**
```bash
# Review only Kiwi Duet calls from a specific filter
skraak calls classify --folder ./data --reviewer dave --bind k=Kiwi \
  --filter opensoundscape-kiwi-1.2 --species Kiwi+Duet

# Review all Kiwi calls (any calltype)
skraak calls classify --folder ./data --reviewer dave --bind k=Kiwi --species Kiwi
```

**Changes:**
- `tools/calls_classify.go` — Added `Species` and `CallType` fields to `ClassifyConfig`;
  extended `getFilteredSegments()` with `segmentMatchesFilters()` for AND-composable
  filter+species+calltype matching; prune data files with no matching segments on load
- `cmd/calls_classify.go` — Parse `--species` flag (rejects duplicates), zero-segment
  guard before TUI launch, comprehensive `printClassifyUsage()`

## [2026-03-29] Codebase consistency improvements

**Changes:**
- `tools/import_file.go` — Single DB connection per `ImportFile()` call (was 3), uses
  `validateHierarchyIDs()`, passes `ctx` and `*sql.DB` to helpers
- `tools/import_files.go` — Extracted `validateHierarchyIDs()` for reuse
- `tools/bulk_file_import.go``bulkCreateCluster` uses `db.BeginLoggedTx()` for
  transaction audit logging
- `cmd/common.go` — Extracted `initEventLog()` helper, replacing 14 instances of
  6-line event log boilerplate across 7 cmd files
- `tools/export.go` — Documented why `fmt.Sprintf` for table names is safe (hardcoded manifest)
- `tools/location.go` — Fixed `Exec``ExecContext` for context propagation consistency
- `utils/cluster_import.go` — Exported `LocationData` and `GetLocationData` for cross-package use
- Removed duplicate godoc comments on several tool functions

## [2026-03-19] NOT NULL Constraint Validation in Bulk Import

Added empty-string validation for CSV fields in `bulkReadCSV()` (`tools/bulk_file_import.go`).

Audited all INSERT/UPDATE paths for NOT NULL constraint enforcement. Found one gap:
`record[3]` (DateRange → cluster name) was not validated for empty strings. Also added
validation for `record[0]` (location_name) and `record[2]` (directory_path) which would
cause downstream failures if empty.

**Changes:**
- `tools/bulk_file_import.go` — validate `location_name`, `directory_path`, and `date_range`
  CSV fields are non-empty (with TrimSpace) before building `bulkLocationData` structs

## [2026-03-14] Remove import_ml_selections (Deprecated)

**Breaking Change:** Removed deprecated `import selections` CLI command and `import_ml_selections` MCP tool.

The `import segments` command is the replacement, offering:
- AviaNZ .data file import (industry standard)
- Species/calltype mapping file validation
- Transactional imports with proper error handling
- Simpler, more maintainable codebase

**Removed:**
- `tools/import_ml_selections.go` (1134 lines)
- `cmd/mcp.go``import_ml_selections` MCP tool registration
- `cmd/import.go``selections` CLI subcommand

**Changes:**
- `utils/mapping.go` — Exported `Placeholders()` function for reuse

## [2026-03-14] Import Segments - Fix Orphaned Segments

**Fix:** Segments with no valid labels are now deleted from the database.

When a segment's labels all fail validation (e.g., missing species in mapping), the segment
was previously left orphaned in the database with no labels. Now the segment is deleted within
the same transaction, maintaining data integrity.

**Changes:**
- `tools/import_segments.go` — Delete orphaned segments when all labels fail validation
- `utils/mapping_test.go` — Unit tests for mapping file loading and validation
- `tools/import_segments_test.go` — Unit tests for input validation and segment counting
- `utils/data_file_test.go` — Added tests for skraak_hash and skraak_label_id round-trip

## [2026-03-14] Import Segments Command

**Feature:** New `skraak import segments` command to import AviaNZ .data segments into the database.

**Changes:**
- `utils/mapping.go` — New utilities for loading and validating species/calltype mapping files
- `tools/import_segments.go` — New tool with `ImportSegments()` function
- `cmd/import.go` — Added `segments` subcommand

**Usage:**
```bash
skraak import segments \
  --db ./db/skraak.duckdb \
  --dataset gljgxDbfasva \
  --location ZEVWGbXzB1bl \
  --cluster q7w-iQgyZOYV \
  --folder /path/to/data \
  --mapping mapping.json
```

**Mapping file format** (`mapping.json`):
```json
{
  "Don't Know": {
    "species": "Don't Know"
  },
  "GSK": {
    "species": "Roroa",
    "calltypes": {
      "Male": "Male - Solo",
      "Female": "Female - Solo"
    }
  }
}
```

**Output structure:**
```json
{
  "summary": {
    "data_files_found": 42,
    "data_files_processed": 42,
    "total_segments": 342,
    "imported_segments": 342,
    "imported_labels": 356,
    "imported_subtypes": 280,
    "processing_time_ms": 1234
  },
  "segments": [...],
  "errors": []
}
```

**Invariants enforced:**
- All file hashes must already exist in database for the cluster
- All files must have no existing labels (fresh imports only)
- All filters, species, and calltypes must exist in database
- Segments with `bookmark: true` labels are skipped
- Mapping must cover all species found in .data files

**Database writes:**
- `segment` table: id, file_id, dataset_id, start_time, end_time, freq_low, freq_high
- `label` table: id, segment_id, species_id, filter_id, certainty
- `label_metadata` table: `{"comment": "..."}` (only if comment present)
- `label_subtype` table: id, label_id, calltype_id, filter_id, certainty (if calltype present)

**Data file updates:**
- `skraak_hash` written to metadata section (first element of .data array)
- `skraak_label_id` written to each label object

**Rationale:**
AviaNZ .data files contain segment annotations from both manual review and ML filters. This command imports those segments into the skraak database with proper species/calltype mapping, enabling integrated analysis across all annotation sources.

## [2026-03-13] Calls Summarise Command

**Feature:** New `skraak calls summarise` command to analyse .data files after classification.

**Changes:**
- `tools/calls_summarise.go` — New tool with `CallsSummarise()` function
- `cmd/calls.go` — Added `summarise` subcommand

**Usage:**
```bash
skraak calls summarise --folder ./recordings > summary.json
skraak calls summarise --folder ./recordings | jq 'del(.segments)'  # summary only
```

**Output structure:**
```json
{
  "segments": [...],
  "data_files_read": 27,
  "data_files_skipped": [],
  "total_segments": 47,
  "filters": {
    "opensoundscape-kiwi-1.2": {
      "segments": 20,
      "species": {"Kiwi": 15, "Don't Know": 5},
      "calltypes": {"Kiwi": {"Male": 10, "Duet": 5}}
    }
  },
  "review_status": {
    "unreviewed": 30,
    "confirmed": 10,
    "dont_know": 5,
    "with_calltype": 8,
    "with_comments": 3,
    "bookmarked": 2
  },
  "operators": ["Auto"],
  "reviewers": ["David", "None"]
}
```

**Review status definitions:**
- `unreviewed`: certainty < 100 (default from detection)
- `confirmed`: certainty = 100 (user pressed bind key)
- `dont_know`: certainty = 0

**Calltypes:** Only appears in filters when species have calltypes set, showing per-species calltype counts.

**Rationale:**
After running `skraak classify` on .data files, it's difficult to understand the state of classifications. This command provides a comprehensive summary with both detailed segments array and aggregated statistics.

## [2026-03-10] Spectrogram Sample Rate Limiting

**Feature:** Spectrograms now automatically downsample high sample rate audio to 16kHz.

**Changes:**
- `utils/spectrogram.go` — Added `DefaultMaxSampleRate = 16000` constant
- `utils/resample.go` — Added `ResampleRate()` function for sample rate conversion
- `tools/calls_show_images.go` — Downsample segments before spectrogram generation
- `tui/classify.go` — Downsample segments before spectrogram generation

**Rationale:**
- High sample rates (e.g., 250kHz bat detectors) produce very tall spectrograms
- Birds are typically in 0-8kHz range; 16kHz sample rate (Nyquist = 8kHz) is sufficient
- Audio playback unchanged — plays at original sample rate

**Behavior:**
| Original Rate | Spectrogram Rate | Playback Rate |
|---------------|------------------|---------------|
| 8000 Hz | 8000 Hz | 8000 Hz |
| 16000 Hz | 16000 Hz | 16000 Hz |
| 44100 Hz | 16000 Hz | 44100 Hz |
| 250000 Hz | 16000 Hz | 250000 Hz |

## [2026-03-09] Case-Preserving WAV File Finding

**Fix:** WAV files with lowercase `.wav` extension now produce correct `.wav.data` files.

**Changes:**
- `tools/calls_from_preds.go` — Added `findWAVFile()` helper function
- `tools/calls_from_birda.go` — Updated to use `findWAVFile()`
- `tools/calls_from_raven.go` — Updated to use `findWAVFile()`

**Problem:** Previous code hardcoded `.WAV` extension, causing issues on case-sensitive filesystems:
- `abc.wav` would fail to be found
- Or produce `abc.WAV.data` instead of `abc.wav.data`

**Solution:** `findWAVFile(dir, baseName)` searches for:
1. `.WAV` (most common for main recordings)
2. `.wav` (common for clips)
3. `.Wav` (edge case)
4. Case-insensitive glob fallback

**Result:**
| WAV File | .data File |
|----------|------------|
| `abc.WAV` | `abc.WAV.data` |
| `abc.wav` | `abc.wav.data` |
| `abc.Wav` | `abc.Wav.data` |

## [2026-03-09] Bookmark Navigation in TUI

**New feature:** Bookmark segments for later review.

**Changes:**
- `utils/data_file.go` — Added `Bookmark bool` to Label struct
- `tools/calls_classify.go` — Added bookmark methods
- `tui/classify.go` — Added key handlers and display
- `tui/classify.go` — Header lines now wrap at 80 characters

**Format** (stored in label):
```json
[0, 3, 0, 16000, [{"species": "Kiwi", "certainty": 90, "filter": "BirdNET", "bookmark": true}]]
```

**Key bindings:**
| Key | Action |
|-----|--------|
| `Ctrl+D` | Toggle bookmark on current segment |
| `Ctrl+,` | Previous bookmark (wraps around) |
| `Ctrl+.` | Next bookmark (wraps around) |

**Behavior:**
- Bookmark lives on the filter-matching label
- `--filter BirdNET` shows bookmarks on BirdNET labels only
- No filter shows all bookmarks
- Wrap-around navigation with loop detection
- `[BOOKMARKED]` indicator shown in segment info

## [2026-03-09] Comment Dialog Editing in TUI

**Enhancement:** Full cursor editing support in the comment dialog.

**Changes:**
- `tui/classify.go` — Added cursor position tracking and navigation

**New features:**
| Key | Action |
|-----|--------|
| `` / `` | Move cursor left/right |
| `Space` | Insert space at cursor |
| `Backspace` | Delete character before cursor |
| `Delete` | Delete character at cursor |
| `Ctrl+A` | Move cursor to start |
| `Ctrl+E` | Move cursor to end |

**Fixed:**
- Space bar now works in comment dialog
- Backspace deletes at cursor position, not just at end

## [2026-03-09] New Commands: calls from-birda and calls from-raven

**New feature:** Import BirdNET and Raven annotation files to .data files.

**Added:**
- `tools/calls_from_birda.go` — BirdNET results file parser
- `tools/calls_from_raven.go` — Raven selections file parser
- `cmd/calls.go` — New subcommands `from-birda` and `from-raven`
- `tools/calls_from_birda_raven_test.go` — 10 test cases

**Commands:**
```bash
# BirdNET (filter always "BirdNET")
./skraak calls from-birda --folder /path/to/recordings
./skraak calls from-birda --file recording.BirdNET.results.csv [--delete]

# Raven (filter always "Raven")
./skraak calls from-raven --folder /path/to/recordings
./skraak calls from-raven --file recording.Table.1.selections.txt [--delete]
```

**File formats:**
- BirdNET: `*.BirdNET.results.csv` (CSV with BOM, columns: Start, End, Scientific name, Common name, Confidence, File)
- Raven: `*.selections.txt` (Tab-separated, columns: Begin Time, End Time, Low Freq, High Freq, Species)

**Behavior (same as from-preds):**
- Filter is always parsed from filename (no `--filter` option)
- No clobber: if filter already exists, error
- Merge: if different filter exists, append segments
- Confidence (BirdNET) converted from 0.0-1.0 to 0-100
- Frequency range preserved from Raven selections
- `--delete` option removes source files after successful import

**Tests:** 10 new tests covering:
- New .data file creation
- Same filter rejection (no clobber)
- Different filter merge
- Delete option
- Folder mode (BirdNET only)
- Multiple selections (Raven only)

## [2026-03-09] Safe .data File Writing in calls-from-preds

**Breaking change:** Filter must now be non-empty. Previously empty filter was allowed.

**Problem:** `calls-from-preds --write-dot-data` would silently clobber existing `.data` files, potentially destroying manual annotations.

**Solution:** Implemented safe write logic that protects existing data:

1. **No existing file** → Write new file (unchanged behavior)
2. **Existing file, same filter** → Error: "file already contains filter 'X' (refusing to clobber)"
3. **Existing file, different filter** → Merge segments (append new, sort by time)
4. **Existing file, parse error** → Error: "cannot parse existing file (refusing to clobber)"

**Changes:**
- `tools/calls_from_preds.go` — Added `writeDotDataFileSafe()` for safe write/merge logic
- `tools/calls_from_preds.go` — Added filter validation: empty filter now returns error
- `tools/calls_from_preds.go` — Filter defaults to CSV filename parsing if `--filter` not specified
- `tools/calls_from_preds.go` — Added `convertAviaNZSegment()` and `buildAviaNZMetaAndSegments()` helpers

**Filter logic:**
- If `--filter "name"` specified → use that filter
- If `--filter` not specified → parse from CSV filename (e.g., `predsST_opensoundscape-kiwi-1.2_2025-11-12.csv``opensoundscape-kiwi-1.2`)
- If filter is empty string → error

**Error handling:** First error stops batch processing (existing behavior preserved).

**Tests added:** `tools/calls_from_preds_test.go` with 7 test cases:
- Empty filter returns error
- New .data file created when none exists
- Existing file with same filter returns error (refuses to clobber)
- Existing file with different filter merges segments
- Existing file with parse error returns error (refuses to clobber)
- Explicit filter via `--filter` flag
- Non-parsable filename without filter returns error

## [2026-03-07] JSON Schema for AviaNZ .data Files

**New feature:** Added JSON Schema (Draft 2020-12) for validating AviaNZ .data annotation files.

**Added:**
- `db/avianz_data_schema.json` — Comprehensive schema for .data file format

**Schema coverage:**
- Root array with metadata object first, then segment arrays
- Meta object with `Operator`, `Reviewer`, `Duration` (optional, allows extra fields)
- Segment array: 5-element tuple `[starttime, endtime, freq_low, freq_high, labels]`
- Label object with required `species` and `certainty` (0-100)
- Optional fields: `filter`, `calltype`, `comment` (max 140 chars)
- Additional properties allowed on all objects (extensibility)
- Pattern constraint: `species` must not contain `>` separator

**Validation tests:**
- Missing required fields caught
- Certainty range (0-100) enforced
- Comment length (max 140) enforced
- Minimal valid files accepted

## [2026-03-07] Comment Feature in Classify TUI

**New feature:** Press spacebar in the classify TUI to add/edit comments on labels.

**Changes:**
- `utils/data_file.go` — Added `Comment` field to `Label` struct, parse/write handling
- `tools/calls_classify.go` — Added `SetComment()` and `GetCurrentComment()` methods, `Comment` field in `BindingResult`
- `tui/classify.go` — Added `commentMode`/`commentText` state, spacebar opens dialog, text input handling, dialog rendering

**AviaNZ spec compliance:** The spec allows "any additional attributes defined for this call" as key-value pairs. Comments are stored as `"comment": "text"` in the label object.

**Usage:**
- `[space]` — Open comment dialog (pre-fills existing comment)
- Type comment (max 140 chars, ASCII only)
- `[enter]` — Save comment
- `[esc]` — Cancel (discard changes)
- `[backspace]` — Delete last character
- `[ctrl+u]` — Clear all

**Help text:** `[esc]quit [,]prev [.]next [space]comment [enter]play [shift+enter]½speed`

## [2026-03-04] Half-Speed Audio Playback in Classify TUI

**New feature:** Press Shift+Enter in the classify TUI to play audio at half speed.

**Changes:**
- `utils/resample.go`**NEW** Linear interpolation resampling for speed changes
- `utils/audio_player.go` — Added `PlayAtSpeed(samples, sampleRate, speed)` method
- `tools/calls_classify.go` — Added `PlaybackSpeed` field to `ClassifyState`
- `tui/classify.go` — Detect Shift+Enter modifier, display "▶ Playing 0.5x..." in status
- `tui/classify.go` — Changed quit key from `q` to `Escape` (frees `q` for bindings)

**Usage:** `[esc]quit  [enter]play  [shift+enter]½speed`

## [2026-03-04] Performance Optimizations for calls-from-preds

**Problem:** Processing 7617 WAV files took 16 minutes due to excessive I/O and sequential processing.

**Changes:**
- `utils/wav_metadata.go` — Added `ParseWAVHeaderMinimal()` that reads only 4KB instead of 200KB per file (50× less I/O). Added separate buffer pool for minimal headers.
- `tools/calls_from_preds.go` — Added parallel processing with 8 workers for .data file generation. Small batches (<10 files) use sequential processing to avoid goroutine overhead.
- `tools/calls_from_preds.go` — Added `ProgressHandler` callback type for progress reporting during long operations.
- `cmd/calls.go` — Added progress indicator showing "Processing WAV files: X/Y (Z%)" during .data file writing.

**Expected improvement:** ~8× faster on multi-core systems due to parallel processing + reduced I/O overhead.

## [2026-03-04] Add iTerm2 Inline Image Protocol Support

**New feature:** Added `--iterm` flag for terminals supporting the iTerm2 Inline Image Protocol (WezTerm, iTerm2, VS Code terminal).

- `utils/terminal_image.go` — Added `ProtocolITerm` enum value and `WriteITermImage()` using charm's `x/ansi/iterm2` package; PNG-encodes then base64-encodes for the iTerm2 escape sequence
- `tools/calls_show_images.go` — Added `ITerm` field to `CallsShowImagesInput`, checked before `Sixel` in protocol selection
- `tools/calls_classify.go` — Added `ITerm` field to `ClassifyConfig`
- `cmd/calls.go` — Added `--iterm` flag to `show-images` subcommand
- `cmd/calls_classify.go` — Added `--iterm` flag to `classify` subcommand
- `tui/classify.go` — Renamed `sixelImageCmd` to `inlineImageCmd` with protocol parameter; changed conditionals from `== ProtocolSixel` to `!= ProtocolKitty` so both sixel and iTerm2 use the same inline rendering path
- `utils/terminal_image_test.go` — Tests for `WriteITermImage`, `WriteImage` routing, and `ClearImages` no-op

## [2026-02-28] Fix Kitty Image Rendering at 448px in Classify TUI

**Bug fix:** Spectrogram display upgraded from 224x224 to 448x448 pixels. Old image artifacts persisted between segment navigations at the larger size.

- `utils/kitty_image.go` — Chunked Kitty protocol transmission (4096-byte chunks) per spec; small images still sent as single payload
- `tui/classify.go` — Return `tea.ClearScreen` on navigation keys (`,`, `.`, bindings) to force full redraw and reliable image clearing
- `tui/classify.go``ResizeImage` call updated from 224x224 to 448x448
- `utils/kitty_image_test.go` — Tests for single-chunk, multi-chunk, and clear behavior

## [2026-02-28] Audio Playback in Classify TUI

**New feature:** Press Enter to play the current segment's audio during classification.

- Added `utils/audio_player.go` — wraps ebitengine/oto v3 for PCM playback
- Oto context created lazily on first play, reused across segments
- Converts `[]float64` samples → signed int16 LE for oto
- Playback stops automatically on navigation (`,`/`.`), binding keys, and quit
- "▶ Playing..." indicator shown in segment info line
- New dependency: `github.com/ebitengine/oto/v3` (requires `libasound2-dev` on Linux)

## [2026-02-22] New CLI Command: calls-from-preds

**New feature:** Extract clustered bird calls from ML predictions CSV files.

**Usage:**
```bash
./skraak calls-from-preds --csv predictions.csv > calls.json
```

**How it works:**
1. Reads prediction CSV (file, start_time, end_time, ebird_code columns with 1/0 values)
2. Auto-detects clip duration from first row
3. Groups detections by (file, ebird_code) and sorts by start_time
4. Clusters consecutive detections where gap ≤ 3 × clip_duration
5. Filters out single detections (configurable via constant)

**Constants (easily changeable):**
```go
CLUSTER_GAP_MULTIPLIER     = 3  // Gap threshold = 3 × clip_duration
MIN_DETECTIONS_PER_CLUSTER = 1  // Filter single detections
```

**Performance:** 400k+ rows processed in ~0.67 seconds

**Output example:**
```json
{
  "calls": [
    {"file": "path.WAV", "start_time": 0, "end_time": 32, "ebird_code": "tomtit1", "detections": 11}
  ],
  "total_calls": 62593,
  "species_count": {"tomtit1": 12636, ...},
  "files_count": 14017
}
```

**Files:**
- `tools/calls_from_preds.go` — Core clustering logic
- `cmd/calls_from_preds.go` — CLI handler

---

## [2026-02-21] Remove import_audio_file MCP Tool

**Breaking change:** Removed `import_audio_file` MCP tool. Use CLI command `skraak import file` for single file imports.

**Rationale:** The MCP tool was redundant since:
1. Single file imports are better suited for CLI use (requires file path on local machine)
2. `import_audio_files` handles batch imports efficiently via MCP
3. Reduces MCP tool count from 11 to 10

**Changes:**
- **`cmd/mcp.go`** — Removed `import_audio_file` tool registration and adapter
- **`tools/import_file.go`** — Kept for CLI use only
- **`cmd/import.go`** — CLI command `skraak import file` unchanged

**Migration:** Use CLI command instead:
```bash
./skraak import file --db ./db/skraak.duckdb --dataset abc123 --location loc456 --cluster clust789 --path /path/to/file.wav
```

---

## [2026-02-21] Verb-First CLI Commands

**Breaking change:** Replaced resource-first CLI commands with natural language verb-first structure.

**Before:**
```bash
./skraak dataset create --name "Test"
./skraak location update --id abc123 --name "Updated"
```

**After:**
```bash
./skraak create dataset --name "Test"
./skraak update location --id abc123 --name "Updated"
```

**Changes:**
- **`main.go`** — Removed legacy `dataset`, `location`, `cluster`, `pattern` commands
- **`cmd/create.go`** — New verb-first create handler
- **`cmd/update.go`** — New verb-first update handler  
- **`cmd/dataset.go`, `cmd/location.go`, `cmd/cluster.go`, `cmd/pattern.go`** — Exported create/update functions
- **Shell scripts** — Updated `test_bulk_import.sh` and `test_event_log.sh` to use new syntax

**Benefits:**
- Natural language flow: "create dataset" vs "dataset create"
- Consistent with `skraak import file/folder/bulk` pattern
- More intuitive for users
- Maintains clean tool separation in `@tools/` directory

**Migration:** Legacy commands now return "Unknown command" error, forcing adoption of new syntax.

---

## [2026-02-21] Fix Event Log Pointer Serialization

**Bug fix:** Event log contained pointer addresses instead of values for nullable database fields (`*float64`, `*GainLevel`, etc.), causing replay failures.

**Root cause:** `marshalParam()` in `db/tx_logger.go` didn't handle pointer types for numeric values or named type aliases (like `db.GainLevel`). These fell through to `fmt.Sprintf("%v", pointer)` which printed memory addresses like `"0x38a7bfb12078"`.

**Example of corrupted data:**
```json
"parameters": ["file_id", "2025-05-18T18:30:00+13:00", "248AB50053AB1B4A", "0x38a7bfb12078", "0x38a7bfb12088", "0x38a7bfb12090"]
```
The last three values should have been `gain`, `battery_v`, `temp_c` but were pointer addresses.

**Fixed:**
- `db/tx_logger.go` — Added explicit cases for all pointer types (`*int`, `*int64`, `*float64`, `*bool`, etc.)
- `db/tx_logger.go` — Added reflection-based fallback in default case to handle pointer-to-named-type (e.g., `*GainLevel`)
- `cmd/replay.go` — Increased `bufio.Scanner` buffer from 64KB to 20MB to handle large event lines (17,000 files = ~16 MB JSON line)

**Tests added:**
- `db/tx_logger_test.go` — Tests for `*int`, `*int64`, `*float64`, `*float32`, `*bool` with nil and value cases
- `db/tx_logger_test.go` — Tests for named type aliases and pointer-to-named-type

---

## [2026-02-19] Fix Update Commands - Preserve Unset Fields

**Bug fix:** Update commands were overwriting existing values with empty strings when optional flags weren't provided.

**Root cause:** CLI code set pointers to empty strings even when flags weren't provided, causing tools layer to interpret them as intentional empty values.

**Fixed:**
- `cmd/dataset.go``runDatasetUpdate()` now only sets pointer fields when flags have non-empty values
- `cmd/location.go``runLocationUpdate()` now only sets pointer fields when flags have non-empty values
- `cmd/cluster.go` — Already correct (only sets fields when provided)
- `cmd/pattern.go` — Already correct (only sets fields when provided)

**Tests added:**
- `tools/update_test.go` — Unit tests verifying update preserves unset fields for all entity types

---

## [2026-02-19] Schema Simplification - Remove species_dataset and ebird_taxonomy_v2024

**Database schema changes:**
- Dropped `species_dataset` table — all species now available across all datasets
- Dropped `ebird_taxonomy_v2024` table — use `WHERE taxonomy_version = '2024'` on `ebird_taxonomy` instead

**Rationale:**
- Simplifies species management (no duplicate species names across datasets)
- Reduces schema complexity (one fewer join for species lookups)
- `ebird_taxonomy_v2024` was redundant; filtering `ebird_taxonomy` directly is sufficient

**Code changes:**
- `tools/export.go` — Simplified manifest: `species` and `call_type` now "copy" (full table)
- `tools/export.go` — Removed `buildDerivedTableCreate()`, `populateDerivedTable()`, simplified `buildReferencedQuery()`
- `tools/import_ml_selections.go` — Species lookup no longer joins `species_dataset`
- `resources/schema.go` — Removed tables from list
- `db/schema_test.go` — Removed obsolete test cases
- `prompts/examples.go` — Updated taxonomy schema description

**Export manifest changes:**
- `species_dataset` → removed (no longer exists)
- `ebird_taxonomy_v2024` → removed (no longer exists)
- `species` → changed from "referenced" to "copy"
- `call_type` → changed from "referenced" to "copy"
- `filter` → changed from "referenced" to "copy"
- All "referenced" and "derived" handling code removed

---

## [2026-02-19] Dataset Export for Collaboration and Testing

**New feature: Export a dataset with all related data to a new database**

**Purpose:** Enable dataset-level exports for collaboration (export, modify, replay changes), testing (small focused test DBs), and archival.

**Architecture:**
- Schema read from embedded `db/schema.sql` (DDL statements extracted dynamically)
- Table copy order computed from FK relationships using `duckdb_constraints()`
- ATTACH mechanism for efficient cross-database copying
- Declarative manifest defines table relationships

**Added:**
- `tools/export.go``ExportDataset()` with table manifest and copy logic
- `cmd/export.go``skraak export dataset` CLI command
- `db/schema.go` — Schema utilities: `ReadSchemaSQL()`, `ExtractDDLStatements()`, `GetFKOrder()`
- `shell_scripts/test_export.sh` — Integration test script

**Command:**
```bash
skraak export dataset --db skraak.duckdb --id abc123 --output export.duckdb
skraak export dataset --db skraak.duckdb --id abc123 --output export.duckdb --dry-run
skraak export dataset --db skraak.duckdb --id abc123 --output export.duckdb --force
```

**What's exported:**
- Dataset row and all owned data (locations, clusters, files, selections, labels)
- Reference tables copied in full (`ebird_taxonomy`, `species`, `call_type`, `cyclic_recording_pattern`, `filter`)
- Empty event log created for capturing changes

**Design decisions:**
- Schema from `schema.sql` ensures schema-resilience (new columns auto-included)
- FK order computed dynamically via `duckdb_constraints()` function
- Close source DB before output DB (DuckDB single-connection limit)
- `SELECT *` copies all columns without hard-coding

**Testing:**
- `db/schema_test.go` — Unit tests for DDL extraction and FK ordering
- Integration tests verify row counts match source
- Error handling tests for missing dataset, existing file

---

## [2026-02-18] Event Log for Database Mutation Replay

**New feature: SQL-level event logging for backup synchronization**

**Purpose:** Capture all mutating SQL operations (INSERT, UPDATE, DELETE) to enable replay on backup databases for synchronization.

**Architecture:**
- Transaction wrapper (`db.LoggedTx`) intercepts all mutations
- Logged only on successful commit (rollback discards recorded queries)
- Events written to JSONL file (`<database>.events.jsonl`)
- Prepared statements fully supported via `LoggedStmt` wrapper

**Added:**
- `db/tx_logger.go` — LoggedTx, LoggedStmt, TransactionEvent types
- `cmd/replay.go``skraak replay events` CLI command
- `shell_scripts/test_event_log.sh` — Integration test script

**Modified:**
- All CLI commands initialize event log with defer close
- All tools use `db.BeginLoggedTx()` instead of `database.BeginTx()`
- `utils/cluster_import.go` updated for batch imports

**Event format (JSONL):**
```json
{
  "id": "V1StGXR8_Z5jdHi6B-myT",
  "timestamp": "2026-02-18T14:30:22+13:00",
  "tool": "create_or_update_dataset",
  "queries": [
    {"sql": "INSERT INTO ...", "parameters": [...]}
  ],
  "success": true,
  "duration_ms": 45
}
```

**Replay command:**
```bash
skraak replay events --db backup.duckdb --log skraak.duckdb.events.jsonl
skraak replay events --db backup.duckdb --log events.jsonl --dry-run
skraak replay events --db backup.duckdb --log events.jsonl --last 10
```

**Key design decisions:**
- SQL-level (not tool-level) for complete fidelity including imports
- Tool name included for context/debugging
- Only successful transactions logged
- Failed events skipped during replay
- `--continue` flag to proceed past errors

**Testing:**
- `db/tx_logger_test.go` — 123 unit tests, 75.9% coverage
- Pure function tests (isMutation, marshalParam, JSON marshaling)
- Integration tests with real DuckDB and file system
- Race detector verified

---

## [2026-02-11] CLI Refactoring — Two-Layer Architecture

**Major refactoring: Separated core logic from MCP types, added CLI commands**

**Problem:** All tool functions were tightly coupled to MCP SDK types (`*mcp.CallToolRequest`, `*mcp.CallToolResult`). This meant functionality could only be invoked via MCP protocol — no CLI access for power users.

**Solution:** Two-layer architecture separating core logic from MCP adapters.

**Created:**
- `cmd/mcp.go` — MCP server setup + 10 thin adapter wrappers (~3 lines each)
- `cmd/import.go``skraak import bulk` CLI command with flag parsing
- `cmd/sql.go``skraak sql` CLI command for ad-hoc queries

**Modified (mechanical, all tools/):**
- Removed `*mcp.CallToolRequest` parameter (was never used — `req` always ignored)
- Removed `*mcp.CallToolResult` from returns (was always empty `&mcp.CallToolResult{}`)
- Removed `import "github.com/modelcontextprotocol/go-sdk/mcp"` from all tool files
- Updated test files (`integration_test.go`, `pattern_test.go`) to match new signatures
- Updated `main.go` to pure dispatcher: `mcp | import | sql`

**Architecture:**
```
main.go              → pure dispatcher
cmd/mcp.go           → MCP server + adapter wrappers (ONLY file importing mcp SDK)
cmd/import.go        → CLI: skraak import bulk --db ... --dataset ... --csv ... --log ...
cmd/sql.go           → CLI: skraak sql --db ... "SELECT ..."
tools/*.go           → core logic, NO mcp dependency (plain Go structs in/out)
utils/, db/, etc.    → unchanged
```

**Benefits:**
- CLI access for power users without MCP
- Token savings (CLI avoids MCP protocol overhead)
- Code sharing between CLI and MCP
- MCP SDK contained to one file
- All tests pass

---

## [2026-02-10] Bulk File Import Cluster Assignment Bug Fix

**Critical Bug Fix: Files now correctly distributed across multiple clusters for same location**

**Problem:** When the same location appeared multiple times in the CSV with different date ranges, all files ended up in the last cluster created instead of being distributed across their respective clusters.

**Root Cause:** The `clusterIDMap` used only `LocationID` as the key, causing each new cluster for the same location to overwrite the previous one in the map.

**Solution:** Changed map key from `LocationID` to composite key `LocationID|DateRange`.

**Modified:**
- `tools/bulk_file_import.go` (lines 125, 171-172, 183-184)

**Impact:**
- Data integrity restored
- Multiple date ranges per location now works correctly
- Simple 3-line fix, backwards compatible

---

## [2026-02-07] File Modification Time Fallback

**Enhancement: Added file modification time as third timestamp fallback**

**Problem:** Small clusters (1-2 files) failed variance-based filename disambiguation because the algorithm needs multiple samples to determine date format (YYYYMMDD vs YYMMDD vs DDMMYY).

**Timestamp Resolution Order:**
```
1. AudioMoth comment → timestamp
2. Filename parsing → timestamp
3. File modification time → timestamp (NEW!)
4. FAIL (skip file with error)
```

**Modified:**
- `utils/cluster_import.go` - Added FileModTime fallback in `batchProcessFiles()`

**Benefits:**
- Fewer failures in small clusters
- No performance impact
- Backwards compatible
- Simple 10-line change

---

## [2026-02-07] Cluster Import Logic Extraction

**Major refactoring: Extracted shared cluster import logic into utils module**

**Key Insight:** A cluster is the atomic unit of import (one SD card / one recording session / one folder).

**Created:**
- `utils/cluster_import.go` (553 lines) - Single source of truth for cluster imports
  - `ImportCluster()` - Main entry point
  - `scanClusterFiles()` - Recursive WAV file scanning
  - `batchProcessFiles()` - Batch processing with variance-based parsing
  - `insertClusterFiles()` - Transactional insertion

**Modified:**
- `tools/import_files.go` - 75% code reduction (650 lines → 161 lines)
- `tools/bulk_file_import.go` - Bug fixes:
  - **CRITICAL BUG FIXED:** Now inserts into `file_dataset` table (was missing!)
  - **CRITICAL BUG FIXED:** Now inserts into `moth_metadata` table (was missing!)

**Benefits:**
- Bug fixed: 68,043 orphaned files found in test database
- ~500 lines of duplicated code eliminated
- Single source of truth for all import logic

---

## [2026-02-06] Tool Consolidation

**Consolidated 8 write/update tools → 4 create_or_update tools**

**Deleted:**
- 8 separate create/update tool files

**Added:**
- `tools/dataset.go` - `create_or_update_dataset`
- `tools/location.go` - `create_or_update_location`
- `tools/cluster.go` - `create_or_update_cluster`
- `tools/pattern.go` - `create_or_update_pattern`

**Design:**
- Omit `id` field → CREATE mode (generates nanoid)
- Provide `id` field → UPDATE mode (verifies exists)

**Benefits:**
- Tool count: 14 → 10
- ~31% less code (~320 lines removed)
- Shared validation logic

---

## [2026-02-06] Test Script Consolidation

**Rationalized and consolidated shell test scripts**

**Removed redundant scripts:**
- 6 incomplete/redundant test scripts

**Current test suite (8 scripts):**
1. `get_time.sh` - Time tool
2. `test_sql.sh` - SQL query tool
3. `test_tools.sh` - All create_or_update tools
4. `test_import_file.sh` - Single file import
5. `test_import_selections.sh` - ML selection import
6. `test_bulk_import.sh` - Bulk CSV import
7. `test_resources_prompts.sh` - Resources/prompts
8. `test_all_prompts.sh` - All 6 prompts

---

## [2026-02-06] Bulk File Import Tool

**New Feature: CSV-based bulk import across multiple locations and clusters**

**Added:**
- `tools/bulk_file_import.go` - CSV-based bulk import (~500 lines)

**Features:**
- CSV-driven import for multiple locations
- Auto-cluster creation
- Progress logging to file
- Summary statistics

**CSV Format:**
```csv
location_name,location_id,directory_path,date_range,sample_rate,file_count
Site A,loc123456789,/path/to/recordings,2024-01,48000,150
```

---

## [2026-02-02] Single File Import Tool

**New Feature: Import individual WAV files**

**Added:**
- `tools/import_file.go` - Single file import implementation (~300 lines)

**Features:**
- Import one WAV file at a time with detailed feedback
- Same processing pipeline as batch import
- Duplicate detection with `is_duplicate` flag
- Atomic operation (succeeds completely or fails)

---

## [2026-01-29] ML Selection Import Tool

**New Feature: Import ML-detected kiwi call selections from folder structure**

**Added:**
- `utils/selection_parser.go` - Selection parsing utilities
- `utils/selection_parser_test.go` - 34 test cases
- `tools/import_ml_selections.go` - MCP tool (~1050 lines)

**Features:**
- Folder structure: `Clips_{filter_name}_{date}/Species/CallType/*.wav+.png`
- Two-pass file matching (exact, then fuzzy)
- Comprehensive validation
- Transactional import

---

## [2026-01-28] Comprehensive Go Unit Testing

**Added comprehensive unit test suite**

**Added:**
- `utils/astronomical_test.go` - 11 test cases
- `utils/audiomoth_parser_test.go` - 36 test cases
- `utils/filename_parser_test.go` - 60 test cases
- `utils/wav_metadata_test.go` - 22 test cases
- `utils/xxh64_test.go` - 6 test cases

**Coverage:**
- 170+ tests total
- 91.5% code coverage

---

## [2026-01-26] Generic SQL Tool + Codebase Rationalization

**Major architectural change: Replaced 6 specialized tools with generic SQL**

**Deleted:**
- 6 specialized query tools (datasets, locations, clusters, files)
- 2 obsolete test scripts

**Added:**
- `tools/sql.go` - Generic `execute_sql` tool (~200 lines)
- `shell_scripts/test_sql.sh` - Comprehensive SQL test suite

**Modified:**
- `prompts/examples.go` - Rewritten to teach SQL patterns

**Benefits:**
- Full SQL expressiveness (JOINs, aggregates, CTEs)
- Infinite query possibilities vs 6 fixed queries
- More aligned with MCP philosophy
- Smaller codebase (2 tools instead of 8)

**Security:**
- Database read-only
- Validation blocks write operations
- Parameterized queries prevent SQL injection
- Row limits prevent overwhelming responses

---

## [2026-01-26] Shell Scripts Organization

**Reorganized all shell scripts into `shell_scripts/` directory**

- Keeps project root clean
- All scripts updated with correct relative paths