1. utils/mapping.go (189 lines)
   - SpeciesMapping struct for mapping .data species/calltype to DB labels
   - MappingFile type for the complete mapping structure
   - LoadMappingFile() - loads and validates mapping JSON
   - ValidateMappingAgainstDB() - validates all mapped species/calltypes exist in DB
   - GetDBSpecies() / GetDBCalltype() - helper methods
2. tools/import_segments.go (840 lines)
   - ImportSegmentsInput / ImportSegmentsOutput types
   - ImportSegments() - main import function with:
     - Phase A: Input validation (folder, mapping file, ID formats)
     - Phase B: Parse all .data files, collect unique filters/species/calltypes
     - Phase C: Pre-import validation (filters, mapping coverage, file hashes, no existing labels)
     - Phase D: Transactional import (segment → label → label_metadata → label_subtype)
     - ProgressHandler support
     - Fail-fast error handling with detailed error messages
3. PLAN.md (reference document for context preservation)
### Files Modified
1. cmd/import.go
   - Added segments subcommand to switch statement
   - Updated usage text
   - Added runImportSegments() function with CLI flag parsing
### Key Features
- ✅ Maps .data species/calltype names to DB labels via mapping file
- ✅ Validates all filters, species, and calltypes exist in DB
- ✅ Validates file hashes exist in DB for the specified cluster
- ✅ Ensures no existing labels (fresh imports only)
- ✅ Skips bookmarked segments
- ✅ Stores freq_low/freq_high in segment table
- ✅ Writes skraak_hash to file_metadata
- ✅ Writes skraak_label_id and comment to label_metadata
- ✅ Supports multiple labels per segment
- ✅ Single transaction for atomic imports
- ✅ Progress reporting via ProgressHandler
### CLI Usage
```bash
skraak import segments \
  --db ./db/skraak.duckdb \
  --dataset <12-char-id> \
  --location <12-char-id> \
  --cluster <12-char-id> \
  --folder /path/to/data \
  --mapping mapping.json
```
CHANGELOG.md - Added new entry at top:
- [2026-03-14] Import Segments Command
- Documents the feature, usage, mapping format, output structure, invariants, and rationale
README.md - Updated two sections:
1. Added import segments to the CLI commands examples
2. Added new "Segments Import" section with:
   - Prerequisites (hashes exist, fresh imports, filters/species/calltypes in DB)
   - Mapping file format
   - Import command example
   - What's imported (segment, label, label_subtype, label_metadata, file_metadata)
   - What's skipped (bookmarked segments)
The command is now fully documented and ready for use.
Review: skraak import segments Implementation
Context
Reviewing the new import segments CLI command which imports AviaNZ .data file
segments into the database, applying species/calltype mapping. The
implementation spans cmd/import.go, tools/import_segments.go, and
utils/mapping.go.
Critical Bug: Import loop processes zero files
In importSegmentsIntoDB() (tools/import_segments.go:556), the loop iterates
scannedFiles and skips entries where sf.FileID == "":
    for _, sf := range scannedFiles {
        if sf.FileID == "" {
            continue // skips ALL files
        }
But FileID is only set in validateAndMapFiles() on copies (Go value
semantics). The modified copies go into fileIDMap, but the original
scannedFiles slice is never mutated. Result: sf.FileID is always "", zero
files are imported.
Fix: Iterate fileIDMap instead of scannedFiles. Remove scannedFiles
parameter. Use len(fileIDMap) for totalFiles.
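The copy behaviour behind this bug can be reproduced in isolation. The sketch below is only an illustration: the `scannedFile` type, the `assignIDs` and `countMissingIDs` helpers, the map keyed by path, and the placeholder IDs are all hypothetical stand-ins, not the actual code in tools/import_segments.go.

```go
package main

import "fmt"

// scannedFile is a hypothetical stand-in for the real scanned-file type.
type scannedFile struct {
	Path   string
	FileID string
}

// assignIDs mimics the validation step: sf is a copy of each slice element,
// so setting sf.FileID updates only the copy that is stored in the map.
func assignIDs(files []scannedFile) map[string]scannedFile {
	byPath := make(map[string]scannedFile, len(files))
	for i, sf := range files {
		sf.FileID = fmt.Sprintf("file-%d", i+1) // placeholder ID for illustration
		byPath[sf.Path] = sf
	}
	return byPath
}

// countMissingIDs reports how many slice elements still have an empty FileID.
func countMissingIDs(files []scannedFile) int {
	missing := 0
	for _, sf := range files {
		if sf.FileID == "" {
			missing++
		}
	}
	return missing
}

func main() {
	scanned := []scannedFile{{Path: "a.data"}, {Path: "b.data"}}
	fileIDMap := assignIDs(scanned)

	// The slice is untouched, so a loop over it would skip every file.
	fmt.Println("missing IDs in slice:", countMissingIDs(scanned))
	// The map holds the populated copies, so it is the right thing to iterate.
	fmt.Println("entries in map:", len(fileIDMap))
	for _, path := range []string{"a.data", "b.data"} {
		fmt.Println(path, "->", fileIDMap[path].FileID)
	}
}
```

Iterating the map (or assigning through `files[i].FileID` by index) avoids the problem; the fix above takes the first route.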
Medium Issues
1. Partial label writes on calltype failure (lines 752-758)
When the calltype ID lookup fails, continue skips to the next label, but the
label and label_metadata rows have already been inserted. This leaves labels
without their expected subtypes.
Fix: Check calltype resolution before inserting the label, or treat as
segment-level error.
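One way to apply the first option is to resolve every ID for a segment up front and only start inserting once resolution has succeeded. This is a minimal sketch under assumptions: `parsedLabel`, `resolvedLabel`, `resolveLabels`, and the name-to-ID maps are hypothetical and do not reflect the real lookup code or schema.

```go
package main

import "fmt"

// parsedLabel is a hypothetical view of one label read from a .data file.
type parsedLabel struct {
	Species  string
	Calltype string // empty when the label has no calltype
}

// resolvedLabel carries the DB IDs needed for the inserts.
type resolvedLabel struct {
	SpeciesID  string
	CalltypeID string // empty when no label_subtype row is needed
}

// resolveLabels looks up all IDs before any write happens. If any lookup
// fails, it returns an error and the caller inserts nothing for the segment.
func resolveLabels(labels []parsedLabel, speciesIDs, calltypeIDs map[string]string) ([]resolvedLabel, error) {
	resolved := make([]resolvedLabel, 0, len(labels))
	for _, l := range labels {
		sid, ok := speciesIDs[l.Species]
		if !ok {
			return nil, fmt.Errorf("unknown species %q", l.Species)
		}
		r := resolvedLabel{SpeciesID: sid}
		if l.Calltype != "" {
			cid, ok := calltypeIDs[l.Calltype]
			if !ok {
				return nil, fmt.Errorf("unknown calltype %q for species %q", l.Calltype, l.Species)
			}
			r.CalltypeID = cid
		}
		resolved = append(resolved, r)
	}
	return resolved, nil
}

func main() {
	speciesIDs := map[string]string{"Kiwi": "sp1"}
	calltypeIDs := map[string]string{"Male": "ct1"}

	labels := []parsedLabel{{Species: "Kiwi", Calltype: "Duet"}}
	if _, err := resolveLabels(labels, speciesIDs, calltypeIDs); err != nil {
		// No label or label_metadata row has been written at this point.
		fmt.Println("segment rejected:", err)
	}
}
```

Because the check runs before the first INSERT, a failed lookup can be reported as a segment-level error without leaving a label that lacks its subtype.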
2. Fragile JSON construction for label_metadata (lines 710-716)
    escapedComment := strings.ReplaceAll(label.Comment, `"`, `\"`)
Only " is escaped. Backslashes, newlines, and tabs are not handled, so a
comment containing any of them can produce invalid JSON.
Fix: Use json.Marshal to build the metadata map.
3. placeholders() duplicated
Defined in utils/mapping.go:262 and tools/import_ml_selections.go:1125.
import_segments.go uses it at line 330 — needs the tools/ package copy.
Consider moving to utils/ as the single source.
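For reference, a shared helper of this kind is usually only a few lines. The version below is a guess at what `placeholders()` does (building a `?, ?, ?` list for an SQL IN clause), not the project's actual implementation; the `filter` table and `name` column in the usage are also hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// placeholders returns n comma-separated "?" markers for use in an IN clause.
// It returns an empty string when n is zero or negative.
func placeholders(n int) string {
	if n <= 0 {
		return ""
	}
	return strings.TrimSuffix(strings.Repeat("?, ", n), ", ")
}

func main() {
	names := []string{"a", "b", "c"}
	query := fmt.Sprintf("SELECT id FROM filter WHERE name IN (%s)", placeholders(len(names)))
	fmt.Println(query) // SELECT id FROM filter WHERE name IN (?, ?, ?)
}
```

Keeping one exported copy in utils/ would let both tools/ files call it without duplicating the logic.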
Minor Issues
4. Progress percentage inaccurate
totalFiles = len(scannedFiles) includes failed-validation files. Resolves
naturally when iterating fileIDMap instead.
5. file_metadata INSERT uses string concat in SQL (line 568)
    json('{"skraak_hash": "' || ? || '"}')
Safe (hex-only hash), but the ON CONFLICT clause already uses json_set().
Could use json_set for both for consistency.
What's Good
- Thorough 4-phase validation before any DB writes
- Single transaction with rollback
- Event logging for audit trail
- Hash-based file matching (not filename)
- Fresh-import-only guard
- Bookmark skip logic
- Comprehensive error collection with stage tags
- Clean two-layer architecture
Files to Modify
1. tools/import_segments.go — critical bug + medium issues 1-2
Verification
    go build -o skraak
    # Run against test data with known .data files and mapping
    # Verify segments/labels appear in DB with correct species/calltype IDs
# from README.md
**What's imported:**
- `segment` - time ranges with freq_low/freq_high from .data YES
- `label` - species, filter, certainty for each segment YES
- `label_subtype` - calltype if present in .data YES
- `label_metadata` - stores comments YES XXXX Not `skraak_label_id`; that is written to the .data file XXXX
- `file_metadata` - XXXX Not `skraak_hash`; that is written to the .data file XXXX
**Skipped:** NO NO NO, the segment is imported and the bookmark:true metadata is ignored
- Segments with `bookmark: true` labels (navigation markers) NO NO NO, the segment is imported and the bookmark:true metadata is ignored
# from CHANGELOG.md
**Database writes:**
- `segment` table: id, file_id, dataset_id, start_time, end_time, freq_low, freq_high
- `label` table: id, segment_id, species_id, filter_id, certainty
- `label_metadata` table: `{"comment": "..."}` XXX Not `"skraak_label_id": "..."` XXX
- `label_subtype` table: id, label_id, calltype_id, filter_id, certainty (if calltype present)
- XXX NO `file_metadata` table: `{"skraak_hash": "..."}` XXX
from @cmd/import.go
fmt.Fprintf(os.Stderr, " - Segments with bookmark=true labels are skipped\n") //WRONG