Skraak

Acoustic monitoring database toolkit in Go. Use as an MCP server for AI assistants or as a CLI for direct access.

MCP Server

Start the server for use with Claude Desktop or other MCP clients:

./skraak mcp --db ./db/skraak.duckdb

Claude Code config:

claude mcp add --transport stdio skraak_mcp -- /home/david/go/src/skraak/skraak mcp --db /home/david/go/src/skraak/db/skraak.duckdb

claude mcp add --transport stdio test_mcp -- /home/david/go/src/skraak/skraak mcp --db /home/david/go/src/skraak/db/test.duckdb

Remove:

claude mcp remove skraak_mcp

Claude Desktop config (~/.config/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "skraak": {
      "command": "/home/david/go/src/skraak/skraak",
      "args": ["mcp", "--db", "/home/david/go/src/skraak/db/skraak.duckdb"]
    }
  }
}

Available MCP Tools

Query:

  • execute_sql - Run SQL SELECT queries (JOINs, aggregates, CTEs supported)
  • get_current_time - Current time with timezone

Write:

  • create_or_update_dataset - Create or update a dataset
  • create_or_update_location - Create or update a location with GPS/timezone
  • create_or_update_cluster - Create or update a cluster within a location
  • create_or_update_pattern - Create or update a cyclic recording pattern

Import:

  • import_audio_files - Batch import WAV files from a folder
  • import_ml_selections - Import ML-detected selections from folder structure

Resources & Prompts

The MCP server also provides:

  • schema://full - Complete database schema
  • schema://table/{name} - Individual table definitions
  • 6 SQL workflow prompts teaching query patterns

CLI Commands

# Execute SQL query
./skraak sql --db ./db/skraak.duckdb "SELECT COUNT(*) FROM file WHERE active = true"

# Create resources
./skraak create dataset --db ./db/skraak.duckdb --name "My Dataset" --type unstructured
./skraak create location --db ./db/skraak.duckdb --dataset abc123 --name "Site A" --lat -36.85 --lon 174.76 --timezone Pacific/Auckland
./skraak create cluster --db ./db/skraak.duckdb --dataset abc123 --location loc456 --name "2024-01" --sample-rate 250000
./skraak create pattern --db ./db/skraak.duckdb --record 60 --sleep 1740

# Update resources
./skraak update dataset --db ./db/skraak.duckdb --id abc123 --name "Updated Name"
./skraak update location --db ./db/skraak.duckdb --id loc123 --name "Updated Name" --lat -36.85 --lon 174.76
./skraak update cluster --db ./db/skraak.duckdb --id cluster123 --name "Updated Name"
./skraak update pattern --db ./db/skraak.duckdb --id pattern123 --record 30 --sleep 1770

# Import commands
./skraak import file --db ./db/skraak.duckdb --dataset abc123 --location loc456 --cluster clust789 --file /path/to/file.wav
./skraak import folder --db ./db/skraak.duckdb --dataset abc123 --location loc456 --cluster clust789 --folder /path/to/folder
./skraak import bulk --db ./db/skraak.duckdb --dataset abc123 --csv import.csv --log progress.log
./skraak import unstructured --db ./db/skraak.duckdb --dataset 4Sh8_7p1ocks --folder "/media/david/Misc-2/Manu o Kahurangi kiwi survey (3)/Andrew Digby LSK - sorted files"
./skraak import selections --db ./db/skraak.duckdb --dataset abc123 --cluster clust789 --folder /path/to/Clips_filter_date
./skraak import segments --db ./db/skraak.duckdb --dataset abc123 --location loc456 --cluster clust789 --folder /path/to/data --mapping mapping.json

# Export dataset (for collaboration, testing, or archival)
./skraak export dataset --db ./db/skraak.duckdb --id abc123 --output export.duckdb
./skraak export dataset --db ./db/skraak.duckdb --id abc123 --output export.duckdb --dry-run

# Event log replay (sync backup databases)
./skraak replay events --db ./backup.duckdb --log ./skraak.duckdb.events.jsonl
./skraak replay events --db ./backup.duckdb --log ./events.jsonl --dry-run
./skraak replay events --db ./backup.duckdb --log ./events.jsonl --last 10

# Call analysis (extract from ML predictions, review/classify)
./skraak calls from-preds --csv predictions.csv                    # Extract calls, write .data files
./skraak calls from-preds --csv preds.csv --dot-data=false > calls.json  # JSON output only
./skraak calls show-images --file recording.wav.data               # Display spectrograms
./skraak calls classify --folder ./data --reviewer David --bind k=Kiwi  # Interactive classification
./skraak calls summarise --folder ./data > summary.json            # Summarise .data files
./skraak calls summarise --folder ./data --brief > summary.json    # Summary stats only (no segments)

./skraak calls classify --folder . \
  --reviewer David \
  --bind k=Kiwi \
  --bind d="Kiwi+Duet" \
  --bind f="Kiwi+Female" \
  --bind m="Kiwi+Male" \
  --bind n="Don't Know" \
  --bind p=Morepork \
  --bind w=weka \
  --color \
  --img-dims 224

./skraak calls classify --folder . --reviewer David --color \
  --bind a=eurbla \
  --bind b=nezbel1 \
  --bind c=comcha \
  --bind d=saddle3 \
  --bind e=pipipi1 \
  --bind f=nezfan1 \
  --bind g=gryger1 \
  --bind i=tui1 \
  --bind k=kea1 \
  --bind l=lotkoe1 \
  --bind m=morepo2 \
  --bind n=nezrob3 \
  --bind o=soioys1 \
  --bind p=malpar2 \
  --bind r=riflem1 \
  --bind s=silver3 \
  --bind t=tomtit1 \
  --bind u=nezpig2 \
  --bind v=brncre \
  --bind w=nezkak1 \
  --bind x=Noise \
  --bind z="Don't Know" \
  --bind 1="Kiwi+Duet" \
  --bind 2="Kiwi+Female" \
  --bind 3="Kiwi+Male" \
  --bind 4=Kiwi \
  --bind 5=Gecko
  
# File utilities
./skraak xxhash --file recording.wav     # XXH64 hash (same format as DB)
./skraak metadata --file recording.wav   # WAV metadata as JSON
./skraak time                            # Current time as JSON
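
The WAV metadata reported by `skraak metadata` comes straight from the file's RIFF header. A minimal Go sketch of reading one such field, the sample rate (an illustration only, assuming a canonical header with the fmt chunk immediately after "WAVE", which real recordings may not satisfy):

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// readSampleRate extracts the sample rate from a canonical WAV header:
// "RIFF" <size> "WAVE" "fmt " <chunk size> <format> <channels> <sample rate> ...
// The sample rate is the little-endian uint32 at byte offset 24.
func readSampleRate(wav []byte) (uint32, error) {
	if len(wav) < 28 || !bytes.Equal(wav[0:4], []byte("RIFF")) || !bytes.Equal(wav[8:12], []byte("WAVE")) {
		return 0, fmt.Errorf("not a canonical WAV header")
	}
	if !bytes.Equal(wav[12:16], []byte("fmt ")) {
		return 0, fmt.Errorf("fmt chunk is not the first chunk")
	}
	return binary.LittleEndian.Uint32(wav[24:28]), nil
}

func main() {
	// Hand-built header with a 250000 Hz sample rate (matches the
	// --sample-rate 250000 cluster example above).
	h := []byte("RIFF\x24\x00\x00\x00WAVEfmt \x10\x00\x00\x00\x01\x00\x01\x00")
	h = append(h, 0x90, 0xd0, 0x03, 0x00) // 250000, little-endian
	rate, err := readSampleRate(h)
	if err != nil {
		panic(err)
	}
	fmt.Println(rate) // 250000
}
```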

Event Log

All mutating SQL operations (INSERT, UPDATE, DELETE) are automatically logged for backup synchronization.

Event log location: <database>.events.jsonl

Features:

  • SQL-level capture for complete fidelity
  • Only successful transactions logged (rollbacks discarded)
  • Includes tool name, SQL, parameters, timestamp

Replay on backup database:

# Replay all events
./skraak replay events --db ./backup.duckdb --log ./skraak.duckdb.events.jsonl

# Preview without executing
./skraak replay events --db ./backup.duckdb --log ./events.jsonl --dry-run

# Replay last N events
./skraak replay events --db ./backup.duckdb --log ./events.jsonl --last 10

Event format (JSONL):

{
  "id": "V1StGXR8_Z5jdHi6B-myT",
  "timestamp": "2026-02-18T14:30:22+13:00",
  "tool": "create_or_update_dataset",
  "queries": [{"sql": "INSERT INTO ...", "parameters": [...]}],
  "success": true,
  "duration_ms": 45
}
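
Each line of the log is an independent JSON object, so ordinary JSON tooling can consume it. A Go sketch of decoding one entry (struct and function names are illustrative, inferred from the example entry above rather than taken from skraak's source):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Query is one SQL statement with its bound parameters.
type Query struct {
	SQL        string `json:"sql"`
	Parameters []any  `json:"parameters"`
}

// Event mirrors the field names of the example JSONL entry above.
type Event struct {
	ID         string  `json:"id"`
	Timestamp  string  `json:"timestamp"`
	Tool       string  `json:"tool"`
	Queries    []Query `json:"queries"`
	Success    bool    `json:"success"`
	DurationMS int     `json:"duration_ms"`
}

// parseEvent decodes a single line of the event log.
func parseEvent(line []byte) (Event, error) {
	var ev Event
	err := json.Unmarshal(line, &ev)
	return ev, err
}

func main() {
	line := []byte(`{"id":"V1StGXR8_Z5jdHi6B-myT","timestamp":"2026-02-18T14:30:22+13:00","tool":"create_or_update_dataset","queries":[{"sql":"INSERT INTO ...","parameters":[]}],"success":true,"duration_ms":45}`)
	ev, err := parseEvent(line)
	if err != nil {
		panic(err)
	}
	fmt.Println(ev.Tool, ev.Success, ev.DurationMS) // create_or_update_dataset true 45
}
```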

Dataset Export

Export a dataset with all related data to a new DuckDB database for collaboration, testing, or archival.

Use cases:

  • Collaboration: Export, send to collaborator, they return event log for replay
  • Testing: Create focused test database from production (100 MB vs 1.5 GB)
  • Archival: Snapshot a dataset at a point in time

Export:

# Export dataset to new database
./skraak export dataset --db ./db/skraak.duckdb --id abc123 --output export.duckdb

# Preview without creating file
./skraak export dataset --db ./db/skraak.duckdb --id abc123 --output export.duckdb --dry-run

# Overwrite existing export
./skraak export dataset --db ./db/skraak.duckdb --id abc123 --output export.duckdb --force

What's exported:

  • All rows owned by dataset (via dataset_id foreign key traversal)
  • Subset of reference data (species, patterns, filters used)
  • An empty event log file, created alongside the export to capture subsequent changes

Re-import changes:

# After collaborator returns event log, replay on backup
./skraak replay events --db ./backup.duckdb --log export.duckdb.events.jsonl

Call Analysis

Extract and review bird calls from ML predictions.

Workflow:

  1. Extract calls from opensoundscape predictions.csv:
# Write .data files alongside audio (default)
# The filter is parsed from the predictions CSV filename, but can be overridden with --filter birdnet-24
./skraak calls from-preds --csv predictions.csv > calls.json
  2. Interactive classification:
# Launch the TUI for reviewing and classifying segments (folder, reviewer, and at least one key binding are required)
./skraak calls classify --folder ./data --reviewer David \
    --bind k=Kiwi --bind d="Kiwi+Duet" --bind n="Don't Know"

# Single file mode
./skraak calls classify --file recording.wav.data --reviewer David --bind k=Kiwi --bind n="Don't Know"

# With color and custom image size (clamped to 224-896 px)
./skraak calls classify --folder ./data --reviewer David --bind k=Kiwi --color --img-dims 224
  3. Summarise .data files:
# Full summary with all segments
./skraak calls summarise --folder ./recordings > summary.json

# Brief summary (stats only, no segment details)
./skraak calls summarise --folder ./recordings --brief > summary.json

Summarise output includes:

  • segments - array of all segments with labels (omitted with --brief)
  • data_files_read / data_files_skipped - file processing status
  • total_segments - total count
  • filters - per-filter statistics (segments, species, calltypes)
  • review_status - unreviewed/confirmed/dont_know counts
  • operators / reviewers - unique values found

Key bindings format:

  • k=Kiwi - Press 'k' to classify as Kiwi (species only)
  • d=Kiwi+Duet - Press 'd' to classify as Kiwi with Duet call type
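
The binding format above is simple enough to parse with two string splits: one on `=` for the key, one on `+` for an optional call type. A hypothetical Go helper (skraak's own parser may differ):

```go
package main

import (
	"fmt"
	"strings"
)

// parseBind splits a --bind value of the form key=Species or
// key=Species+Calltype into its parts.
func parseBind(s string) (key, species, calltype string, err error) {
	kv := strings.SplitN(s, "=", 2)
	if len(kv) != 2 || kv[0] == "" || kv[1] == "" {
		return "", "", "", fmt.Errorf("bind must look like key=Species[+Calltype]: %q", s)
	}
	sp := strings.SplitN(kv[1], "+", 2)
	key, species = kv[0], sp[0]
	if len(sp) == 2 {
		calltype = sp[1] // species-only binds leave this empty
	}
	return key, species, calltype, nil
}

func main() {
	k, s, c, _ := parseBind("d=Kiwi+Duet")
	fmt.Println(k, s, c) // d Kiwi Duet
}
```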

Segments Import

Import AviaNZ .data segments into the database with species/calltype mapping.

Prerequisites:

  1. WAV files must already be imported (their hashes must exist in the database)
  2. The files must have no existing labels (fresh imports only)
  3. All filters/models, species, and calltypes must already exist in the database
  4. The mapping file must cover every species found in the .data files

Mapping file (e.g. mapping_2026-03-13.json). A Claude skill can guide you through building the species/calltype mapping against the database:

{
  "Don't Know": {
    "species": "Don't Know"
  },
  "GSK": {
    "species": "Roroa",
    "calltypes": {
      "Male": "Male - Solo",
      "Female": "Female - Solo"
    }
  }
}
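
The mapping file is plain JSON keyed by the species label found in the .data files, with an optional calltype translation table per species. A Go sketch of how a consumer might decode it (type names are illustrative, not from skraak's source):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mapEntry mirrors one value in the mapping file shown above: the target
// species name, plus an optional .data-calltype -> database-calltype table.
type mapEntry struct {
	Species   string            `json:"species"`
	Calltypes map[string]string `json:"calltypes,omitempty"`
}

// parseMapping decodes the whole mapping file.
func parseMapping(raw []byte) (map[string]mapEntry, error) {
	var m map[string]mapEntry
	err := json.Unmarshal(raw, &m)
	return m, err
}

func main() {
	raw := []byte(`{"Don't Know":{"species":"Don't Know"},"GSK":{"species":"Roroa","calltypes":{"Male":"Male - Solo","Female":"Female - Solo"}}}`)
	m, err := parseMapping(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(m["GSK"].Species, m["GSK"].Calltypes["Male"]) // Roroa Male - Solo
}
```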

Import Segments:

./skraak import segments \
  --db ./db/skraak.duckdb \
  --dataset dataset_id \
  --location location_id \
  --cluster cluster_id \
  --folder /path/to/data \
  --mapping mapping.json

What's imported:

  • segment - time ranges with freq_low/freq_high from .data
  • label - species, filter, certainty for each segment
  • label_subtype - calltype if present in .data
  • label_metadata - stores comments (if present)

Data file updates:

  • skraak_hash written to metadata section
  • skraak_label_id written to each label object

Bookmarks: Segments with bookmark: true are imported normally; the bookmark flag is ignored (not stored in database).

Development

# Build
go build -o skraak

# Run tests
go test ./...

# Run with coverage
go test -cover ./...

Cross-Compile to Windows (from Ubuntu)

DuckDB's Go bindings use CGO with pre-built static libraries. Cross-compiling to Windows requires MinGW and a small ABI compatibility stub.

Prerequisites:

sudo apt install gcc-mingw-w64-x86-64 g++-mingw-w64-x86-64

# Switch to posix threading variant (DuckDB uses pthreads)
sudo update-alternatives --set x86_64-w64-mingw32-gcc /usr/bin/x86_64-w64-mingw32-gcc-posix
sudo update-alternatives --set x86_64-w64-mingw32-g++ /usr/bin/x86_64-w64-mingw32-g++-posix

Build:

# Create ABI stub (Ubuntu MinGW defines mbstate_t as int, DuckDB expects _Mbstatet)
echo 'extern "C" { void* _ZNSt15basic_streambufIcSt11char_traitsIcEE7seekposESt4fposI9_MbstatetESt13_Ios_Openmode() { return (void*)-1; } }' \
  | tee /tmp/stub_seekpos.cpp
x86_64-w64-mingw32-g++ -c /tmp/stub_seekpos.cpp -o /tmp/stub_seekpos.o

# Cross-compile (windows-amd64 only)
CGO_ENABLED=1 \
  CC=x86_64-w64-mingw32-gcc \
  CXX=x86_64-w64-mingw32-g++ \
  GOOS=windows GOARCH=amd64 \
  go build -ldflags '-extldflags "/tmp/stub_seekpos.o -lucrt"' -o skraak.exe

Verify:

file skraak.exe
# Expected: PE32+ executable (console) x86-64, for MS Windows

See CLAUDE.md for detailed development notes.