# Skraak
Acoustic monitoring CLI toolkit in Go.
## CLI Commands
```bash
# Execute SQL query
./skraak sql --db ./db/skraak.duckdb "SELECT COUNT(*) FROM file WHERE active = true"
# Create resources
./skraak create dataset --db ./db/skraak.duckdb --name "My Dataset" --type unstructured
./skraak create location --db ./db/skraak.duckdb --dataset abc123 --name "Site A" --lat -36.85 --lon 174.76 --timezone Pacific/Auckland
./skraak create cluster --db ./db/skraak.duckdb --dataset abc123 --location loc456 --name "2024-01" --sample-rate 250000
./skraak create pattern --db ./db/skraak.duckdb --record 60 --sleep 1740
# Update resources
./skraak update dataset --db ./db/skraak.duckdb --id abc123 --name "Updated Name"
./skraak update location --db ./db/skraak.duckdb --id loc123 --name "Updated Name" --lat -36.85 --lon 174.76
./skraak update cluster --db ./db/skraak.duckdb --id cluster123 --name "Updated Name"
./skraak update pattern --db ./db/skraak.duckdb --id pattern123 --record 30 --sleep 1770
# Import commands
./skraak import file --db ./db/skraak.duckdb --dataset abc123 --location loc456 --cluster clust789 --file /path/to/file.wav
./skraak import folder --db ./db/skraak.duckdb --dataset abc123 --location loc456 --cluster clust789 --folder /path/to/folder
./skraak import bulk --db ./db/skraak.duckdb --dataset abc123 --csv import.csv --log progress.log
./skraak import unstructured --db ./db/skraak.duckdb --dataset 4Sh8_7p1ocks --folder "/media/david/Misc-2/Manu o Kahurangi kiwi survey (3)/Andrew Digby LSK - sorted files"
./skraak import segments --db ./db/skraak.duckdb --dataset abc123 --location loc456 --cluster clust789 --folder /path/to/data --mapping mapping.json # requires mapping.json
# Export dataset (for collaboration, testing, or archival)
./skraak export dataset --db ./db/skraak.duckdb --id abc123 --output export.duckdb
./skraak export dataset --db ./db/skraak.duckdb --id abc123 --output export.duckdb --dry-run
# Event log replay (sync backup databases)
./skraak replay events --db ./backup.duckdb --log ./skraak.duckdb.events.jsonl
./skraak replay events --db ./backup.duckdb --log ./events.jsonl --dry-run
./skraak replay events --db ./backup.duckdb --log ./events.jsonl --last 10
# Create .data files
./skraak calls from-preds --csv predictions.csv # Extract calls from OPSO, write .data files
./skraak calls from-preds --csv preds.csv --dot-data=false > calls.json # JSON output only
# Summarise .data files
./skraak calls summarise --folder ./data > summary.json # Full summary with all segments
./skraak calls summarise --folder ./data --brief > summary.json # Summary stats only (no segments)
# Display spectrograms
./skraak calls show-images --file recording.wav.data
# TUI for manual classification (reviewer + bindings from ~/.skraak/config.json)
./skraak calls classify --folder ./data # Interactive classification
./skraak calls classify --folder ./data --filter opensoundscape-kiwi-1.0
./skraak calls classify --folder . --filter opensoundscape-kiwi-1.2 --species Kiwi+Male
./skraak calls classify --folder . --filter opensoundscape-multi-1.0
# Agentic call analysis
./skraak calls clip --file recording.wav.data --prefix B01 --output /tmp/B01/ --species Kiwi+Duet --filter opensoundscape-multi-1.0 --size 224 --color
./skraak calls clip --folder B01/2026-12-11/ --prefix B01 --output /tmp/B01/ --species Kiwi+Duet --filter opensoundscape-multi-1.0 --size 224 --color
./skraak calls modify --file recording.data --reviewer Claude --filter opensoundscape-multi-1.0 --segment 12-15 --species Kiwi+Male --certainty 80
./skraak calls modify --file recording.data --reviewer Claude --filter opensoundscape-multi-1.0 --segment 12-15 --certainty 80 --bookmark
./skraak calls modify --file recording.data --reviewer Claude --filter opensoundscape-multi-1.0 --segment 12-15 --certainty 80 --comment "Clear example of male call"
./skraak calls propagate --file rec.wav.data --from opensoundscape-kiwi-1.2 --to opensoundscape-kiwi-1.5 --species Kiwi
./skraak calls propagate --folder ./recordings --from opensoundscape-kiwi-1.2 --to opensoundscape-kiwi-1.5 --species Kiwi
# .data files to OPSO multihot csv (requires mapping.json)
./skraak calls clip-labels --folder ./data --mapping ./mapping.json
./skraak calls clip-labels --folder ./data --mapping ./mapping.json --filter opensoundscape-multi-1.0
# File utilities
./skraak xxhash --file recording.wav # XXH64 hash (same format as DB)
./skraak metadata --file recording.wav # WAV metadata as JSON
# Works for AudioMoth, which records time metadata as UTC
./skraak isnight --file recording.wav --lat -36.85 --lng 174.76 # Was it night when recorded?
./skraak isnight --file recording.wav --lat -36.85 --lng 174.76 --brief # Just file_path + solar_night
# DOC recorders store local time without a timezone, so an IANA timezone is required
./skraak isnight --file recording.wav --lat -36.85 --lng 174.76 --timezone Pacific/Auckland # Non-UTC timezone
# Rename files with location prefix
./skraak prepend --folder ./recordings --prefix LOC001 # WAV files with datestring + log.txt
./skraak prepend --folder ./data --prefix SITE_A --recursive # Include 1 level of subfolders
./skraak prepend --folder ./test --prefix TEST --dry-run # Preview changes
# Get current time
./skraak time # Current time as JSON
```
## Event Log
All mutating SQL operations (INSERT, UPDATE, DELETE) are automatically logged for backup synchronization.
**Event log location:** `<database>.events.jsonl`
**Features:**
- SQL-level capture for complete fidelity
- Only successful transactions logged (rollbacks discarded)
- Includes tool name, SQL, parameters, timestamp
**Replay on backup database:**
```bash
# Replay all events
./skraak replay events --db ./backup.duckdb --log ./skraak.duckdb.events.jsonl
# Preview without executing
./skraak replay events --db ./backup.duckdb --log ./events.jsonl --dry-run
# Replay last N events
./skraak replay events --db ./backup.duckdb --log ./events.jsonl --last 10
```
**Event format (JSONL):**
```json
{
"id": "V1StGXR8_Z5jdHi6B-myT",
"timestamp": "2026-02-18T14:30:22+13:00",
"tool": "create_or_update_dataset",
"queries": [{"sql": "INSERT INTO ...", "parameters": [...]}],
"success": true,
"duration_ms": 45
}
```
## Dataset Export
Export a dataset with all related data to a new DuckDB database for collaboration, testing, or archival.
**Use cases:**
- **Collaboration:** Export, send to collaborator, they return event log for replay
- **Testing:** Create focused test database from production (100 MB vs 1.5 GB)
- **Archival:** Snapshot a dataset at a point in time
**Export:**
```bash
# Export dataset to new database
./skraak export dataset --db ./db/skraak.duckdb --id abc123 --output export.duckdb
# Preview without creating file
./skraak export dataset --db ./db/skraak.duckdb --id abc123 --output export.duckdb --dry-run
# Overwrite existing export
./skraak export dataset --db ./db/skraak.duckdb --id abc123 --output export.duckdb --force
```
**What's exported:**
- All rows owned by the dataset (via `dataset_id` foreign key traversal)
- Subset of reference data (species, patterns, filters used)
- Creates empty event log file for changes
**Re-import changes:**
```bash
# After collaborator returns event log, replay on backup
./skraak replay events --db ./backup.duckdb --log export.duckdb.events.jsonl
```
## Call Analysis
Extract and review bird calls from ML predictions.
**Workflow:**
1. **Extract calls from opensoundscape predictions.csv:**
```bash
# Write .data files alongside audio (default)
# filter is parsed from the predictions CSV filename but can be overridden with --filter birdnet-24
./skraak calls from-preds --csv predictions.csv > calls.json
```
2. **Interactive classification:**
Reviewer, keybindings, and display flags (color/sixel/iterm/img_dims) are loaded
from `~/.skraak/config.json`; create it once before first use:
```json
{
"classify": {
"reviewer": "David",
"color": true,
"bindings": {
"a": "eurbla",
"k": "Kiwi",
"d": "Kiwi+Duet",
"n": "Don't Know",
"1": "Kiwi+Duet",
"2": "Kiwi+Female",
"3": "Kiwi+Male",
"4": "Kiwi",
"x": "Noise"
},
"secondary_bindings":
{
"a":
{
"a": "alarm",
"c": "contact",
"s": "song"
}
}
}
}
```
Path resolves to `~/.skraak/config.json` on Linux/macOS and
`C:\Users\<name>\.skraak\config.json` on Windows via `os.UserHomeDir()`.
**Secondary bindings for `a` (`eurbla`) are accessed with Shift-A, then `a`/`c`/`s`.**
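Loading and decoding that config might look like the following Go sketch (the `Config` struct and the `parseConfig`/`configPath` helpers are illustrative names, not skraak's internals; field names follow the example config above):

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// Config mirrors the example ~/.skraak/config.json shown above.
type Config struct {
	Classify struct {
		Reviewer          string                       `json:"reviewer"`
		Color             bool                         `json:"color"`
		Bindings          map[string]string            `json:"bindings"`
		SecondaryBindings map[string]map[string]string `json:"secondary_bindings"`
	} `json:"classify"`
}

// parseConfig decodes the raw JSON bytes of a config file.
func parseConfig(raw []byte) (*Config, error) {
	var cfg Config
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

// configPath resolves ~/.skraak/config.json via os.UserHomeDir,
// so it works on Linux/macOS and Windows alike.
func configPath() (string, error) {
	home, err := os.UserHomeDir()
	if err != nil {
		return "", err
	}
	return filepath.Join(home, ".skraak", "config.json"), nil
}

func main() {
	raw := []byte(`{"classify":{"reviewer":"David","color":true,"bindings":{"k":"Kiwi"}}}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		panic(err)
	}
	path, _ := configPath()
	fmt.Println(path, cfg.Classify.Reviewer, cfg.Classify.Bindings["k"])
}
```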
```bash
# Launch TUI for reviewing and classifying segments
./skraak calls classify --folder ./data
# Single file mode
./skraak calls classify --file recording.wav.data
# Scope to a specific filter (ML model)
./skraak calls classify --folder ./data --filter opensoundscape-kiwi-1.2
# Scope to species (and optionally calltype) within a filter
./skraak calls classify --folder ./data --filter opensoundscape-kiwi-1.2 --species Kiwi+Duet
# Sample 10% of matching segments (random, requires --certainty; useful for quality-checking large sets)
./skraak calls classify --folder ./data --species Kiwi --certainty 90 --sample 10
```
`--sample <1-99>` randomly selects that percentage of the filtered segment list for review. Files and segments are presented in their original chronological order. `--sample 100` is a no-op. Requires `--certainty` to be set.
3. **Summarise .data files:**
```bash
# Full summary with all segments
./skraak calls summarise --folder ./recordings > summary.json
# Brief summary (stats only, no segment details)
./skraak calls summarise --folder ./recordings --brief > summary.json
```
**Summarise output includes:**
- `segments` - array of all segments with labels (omitted with `--brief`)
- `data_files_read` / `data_files_skipped` - file processing status
- `total_segments` - total count
- `filters` - per-filter statistics (segments, species, calltypes)
- `review_status` - unreviewed/confirmed/dont_know counts
- `operators` / `reviewers` - unique values found
4. **Promote certainty=90 segments to 100:**
```bash
# After reviewing a folder and confirming labels are correct, bulk-promote to certainty=100.
# Filtering flags match calls classify exactly (minus --certainty and --sample).
./skraak calls push-certainty --folder ./data --species Kiwi
./skraak calls push-certainty --folder ./data --species Kiwi --night --location "-45.5,167.4"
```
Sets matching labels from certainty=90 to 100 and updates the reviewer from `~/.skraak/config.json`. Outputs `{"segments_updated": N, "files_updated": M}`.
5. **Propagate verified classifications between filters:**
```bash
# Single file
./skraak calls propagate --file rec.wav.data \
--from opensoundscape-kiwi-1.2 --to opensoundscape-kiwi-1.5 --species Kiwi
# Whole folder
./skraak calls propagate --folder ./recordings \
--from opensoundscape-kiwi-1.2 --to opensoundscape-kiwi-1.5 --species Kiwi
```
Only source labels at certainty=100 matching `--species` are considered. Target labels (filter=`--to`) at certainty 70 or 0 are upgraded to certainty=90 and the file reviewer is set to `Skraak`. Targets already at 100 or 90 are left alone; files missing either filter are skipped.
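The upgrade rule can be expressed as a small per-segment function. A sketch of the documented behaviour, not skraak's actual code (the `Label` struct and `propagate` function are illustrative):

```go
package main

import "fmt"

// Label is one species label attached by an ML filter.
type Label struct {
	Filter    string
	Species   string
	Certainty int
}

// propagate applies the rule described above: if a verified
// (certainty 100) source label for the species exists on the --from
// filter, target labels on the --to filter at certainty 70 or 0 are
// upgraded to 90. Targets already at 90 or 100 are left alone.
func propagate(labels []Label, from, to, species string) int {
	verified := false
	for _, l := range labels {
		if l.Filter == from && l.Species == species && l.Certainty == 100 {
			verified = true
		}
	}
	if !verified {
		return 0
	}
	upgraded := 0
	for i := range labels {
		l := &labels[i]
		if l.Filter == to && l.Species == species && (l.Certainty == 70 || l.Certainty == 0) {
			l.Certainty = 90
			upgraded++
		}
	}
	return upgraded
}

func main() {
	labels := []Label{
		{"opensoundscape-kiwi-1.2", "Kiwi", 100},
		{"opensoundscape-kiwi-1.5", "Kiwi", 70},
		{"opensoundscape-kiwi-1.5", "Kiwi", 90}, // already 90: untouched
	}
	fmt.Println(propagate(labels, "opensoundscape-kiwi-1.2", "opensoundscape-kiwi-1.5", "Kiwi")) // 1
}
```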
6. **Export .data files to OpenSoundScape multihot CSV:**
```bash
# Columns = canonical classes from mapping.json
./skraak calls clip-labels --folder ./data --mapping ./mapping.json
# Restrict to a single ML filter
./skraak calls clip-labels --folder ./data --mapping ./mapping.json --filter opensoundscape-multi-1.0
```
- **`"__NEGATIVE__"`**: the segment IS emitted, with **all class columns False**.
- **`"__IGNORE__"`**: the segment is excluded from the dataset entirely.
Example `mapping.json` entries showing the sentinels:
```json
{
"Kiwi": {"species": "Kiwi"},
"Geese": {"species": "__NEGATIVE__"},
"Not": {"species": "__NEGATIVE__"},
"Don't Know": {"species": "__IGNORE__"}
}
```
**`--filter F`** restricts which ML filter's labels count
(`opensoundscape-multi-1.0`, `BirdNET`, `Raven`, …). The mapping
coverage check also restricts to that filter.
Defaults: `--clip-duration 4 --clip-overlap 0.5 --min-label-overlap 0.25 --final-clip full`.
If the `--output` file already exists, the run **appends** to it.
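The sentinel semantics above can be sketched as a per-segment row builder (the `multihotRow` function and `mapping` variable are illustrative, not skraak's actual code):

```go
package main

import "fmt"

// mapping: raw .data label -> canonical class, or a sentinel.
var mapping = map[string]string{
	"Kiwi":       "Kiwi",
	"Geese":      "__NEGATIVE__",
	"Don't Know": "__IGNORE__",
}

// multihotRow builds one multihot row: an __IGNORE__ label drops the
// clip (second return false); __NEGATIVE__ keeps it with every class
// column False; anything else sets its canonical class column True.
func multihotRow(labels []string, classes []string) (map[string]bool, bool) {
	row := map[string]bool{}
	for _, c := range classes {
		row[c] = false
	}
	for _, l := range labels {
		switch cls := mapping[l]; cls {
		case "__IGNORE__":
			return nil, false // clip excluded from the dataset
		case "__NEGATIVE__":
			// keep the clip; all columns stay false
		default:
			row[cls] = true
		}
	}
	return row, true
}

func main() {
	classes := []string{"Kiwi"}
	fmt.Println(multihotRow([]string{"Kiwi"}, classes))       // map[Kiwi:true] true
	fmt.Println(multihotRow([]string{"Geese"}, classes))      // map[Kiwi:false] true
	fmt.Println(multihotRow([]string{"Don't Know"}, classes)) // <nil> false
}
```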
## Import .data files to database
Import AviaNZ .data segments into the database with species/calltype mapping.
**Prerequisites:**
1. WAV files must already be imported (their hashes must exist in the database)
2. No existing labels on the files (fresh imports only)
3. All filters/models, species, and calltypes must already exist in the database
4. The mapping file must cover every species found in the .data files
**Mapping file** (`mapping_2026-03-13.json`):
Use the `/data-mapping` agent skill to create the mapping file.
```json
{
"Don't Know": {
"species": "Don't Know"
},
"GSK": {
"species": "Roroa",
"calltypes": {
"Male": "Male - Solo",
"Female": "Female - Solo"
}
}
}
```
**Import Segments:**
```bash
./skraak import segments \
--db ./db/skraak.duckdb \
--dataset dataset_id \
--location location_id \
--cluster cluster_id \
--folder /path/to/data \
--mapping mapping.json
```
**What's imported:**
- `segment` - time ranges with freq_low/freq_high from .data
- `label` - species, filter, certainty for each segment
- `label_subtype` - calltype if present in .data
- `label_metadata` - stores comments (if present)
**Data file updates:**
- `skraak_hash` written to metadata section
- `skraak_label_id` written to each label object
**Bookmarks:** Segments with `bookmark: true` are imported normally; the bookmark flag is ignored (not stored in database).
## Development
```bash
# Build
go build -o skraak
# Run tests
go test ./...
# Run with coverage
go test -cover ./...
```
```bash
# Full Makefile test target (ensure db/test.duckdb exists with foreign keys applied)
make test
```
```bash
# Keep cyclomatic complexity low
gocyclo -over 10 .
```
### Cross-Compile to Windows (from Ubuntu)
DuckDB's Go bindings use CGO with pre-built static libraries. Cross-compiling to Windows requires MinGW and a small ABI compatibility stub.
**Prerequisites:**
```bash
sudo apt install gcc-mingw-w64-x86-64 g++-mingw-w64-x86-64
# Switch to posix threading variant (DuckDB uses pthreads)
sudo update-alternatives --set x86_64-w64-mingw32-gcc /usr/bin/x86_64-w64-mingw32-gcc-posix
sudo update-alternatives --set x86_64-w64-mingw32-g++ /usr/bin/x86_64-w64-mingw32-g++-posix
```
**Build:**
```bash
# Create ABI stub (Ubuntu MinGW defines mbstate_t as int, DuckDB expects _Mbstatet)
echo 'extern "C" { void* _ZNSt15basic_streambufIcSt11char_traitsIcEE7seekposESt4fposI9_MbstatetESt13_Ios_Openmode() { return (void*)-1; } }' \
| tee /tmp/stub_seekpos.cpp
x86_64-w64-mingw32-g++ -c /tmp/stub_seekpos.cpp -o /tmp/stub_seekpos.o
# Cross-compile (windows-amd64 only)
CGO_ENABLED=1 \
CC=x86_64-w64-mingw32-gcc \
CXX=x86_64-w64-mingw32-g++ \
GOOS=windows GOARCH=amd64 \
go build -ldflags '-extldflags "/tmp/stub_seekpos.o -lucrt"' -o skraak.exe
```
**Verify:**
```bash
file skraak.exe
# Expected: PE32+ executable (console) x86-64, for MS Windows
```
See `CLAUDE.md` for detailed development notes.