# Testing the Skraak MCP Server
## Overview
The Skraak MCP Server provides 10 tools across three categories:
- **Read tools (2)**: `get_current_time`, `execute_sql`
- **Write tools (4)**: `create_or_update_dataset`, `create_or_update_location`, `create_or_update_cluster`, `create_or_update_pattern`
- **Import tools (4)**: `import_audio_files`, `import_file`, `import_ml_selections`, `bulk_file_import`
Plus resources (schema) and prompts (SQL workflow templates).
## Quick Testing with Shell Scripts
### ⚠️ CRITICAL: Always Use Test Database
**NEVER use the production database (`skraak.duckdb`) for testing!**
All test scripts default to the test database when run without arguments:
```bash
# Correct - uses test database (default)
./test_sql.sh
# Also correct - explicit test database path
./test_sql.sh /home/david/go/src/skraak_mcp/db/test.duckdb
# WRONG - uses production database
./test_sql.sh /home/david/go/src/skraak_mcp/db/skraak.duckdb # ❌ DON'T DO THIS
```
**Path handling:**
- Simple scripts (`get_time.sh`, `test_sql.sh`, etc.) use relative paths: `../db/test.duckdb`
- Complex scripts (`test_tools.sh`, `test_import_file.sh`, etc.) resolve absolute paths at runtime
- All work correctly when run from the `shell_scripts/` directory
### Available Test Scripts
All scripts are located in the `shell_scripts/` directory:
#### 1. Core Functionality Tests
**Time tool:**
```bash
./get_time.sh
```
Tests the `get_current_time` tool. No database required.
**SQL query tool:**
```bash
./test_sql.sh > test_output.txt 2>&1
```
Tests the `execute_sql` tool with various queries:
- Simple SELECT queries
- Parameterized queries (? placeholders)
- JOINs and aggregates
- Security validation (blocks INSERT/UPDATE/DELETE)
- Row limiting
**Always redirect output to a file!** The SQL tests produce large output. A standalone check of the security validation is sketched below.
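The blocked-statement behaviour can also be probed directly from the repository root. This is a sketch mirroring the manual message sequence later in this document; the exact error text is server-defined, so only the `isError` flag is checked:
```bash
# Initialize, then send a write statement that execute_sql should reject.
printf '%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"execute_sql","arguments":{"query":"DELETE FROM dataset"}}}' \
  | ./skraak_mcp ./db/test.duckdb | rg '"isError":true'
```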
#### 2. Create/Update Tools Tests
**Comprehensive create_or_update tool test:**
```bash
./test_tools.sh > write_test.txt 2>&1
```
Tests all 4 create_or_update tools in both create and update modes:
- `create_or_update_dataset` - create (no id) and update (with id)
- `create_or_update_location` - create and update
- `create_or_update_cluster` - create and update
- `create_or_update_pattern` - create and update
- Valid inputs (should succeed)
- Invalid inputs (should fail with validation errors)
#### 3. Import Tools Tests
**Import a single WAV file:**
```bash
./test_import_file.sh > import_test.txt 2>&1
```
Tests the `import_file` tool (single file import):
- Valid WAV file import
- AudioMoth metadata parsing
- Timestamp extraction
- Hash computation and duplicate detection
- Astronomical data calculation
**Import ML detection selections:**
```bash
./test_import_selections.sh > selections_test.txt 2>&1
```
Tests the `import_ml_selections` tool:
- Folder structure parsing (`Clips_{filter}_{date}/Species/CallType/*.wav+.png`)
- Selection filename parsing (`base-start-end.wav`; see the sketch after this list)
- File matching (exact and fuzzy)
- Validation of filter, species, call types
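The `base-start-end.wav` convention can be unpacked with plain shell parameter expansion, splitting from the right so hyphens inside the base name survive (a sketch; the filename is hypothetical):
```bash
f="20240101_213000-12.5-15.0.wav"   # hypothetical selection filename
stem="${f%.wav}"
end="${stem##*-}"                   # everything after the last hyphen
rest="${stem%-*}"                   # drop the end segment
start="${rest##*-}"
base="${rest%-*}"
echo "base=$base start=$start end=$end"
# -> base=20240101_213000 start=12.5 end=15.0
```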
**Bulk import across locations:**
```bash
./test_bulk_import.sh > bulk_test.txt 2>&1
```
Tests the `bulk_file_import` tool:
- CSV parsing
- Location validation
- Auto-cluster creation
- Progress logging
- Error handling
**CSV Format for bulk_file_import:**
```csv
location_name,location_id,directory_path,date_range,sample_rate,file_count
"Test Location","testloc12345","/path/to/test/files","test-2024","48000","10"
```
**Required columns (in order):**
1. `location_name` - Display name (string, can have spaces)
2. `location_id` - 12-character ID (must exist in database)
3. `directory_path` - Absolute path to WAV files
4. `date_range` - Becomes cluster name (any string)
5. `sample_rate` - Sample rate in Hz (integer)
6. `file_count` - Expected count (integer, informational)
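A matching test CSV can be written inline before a run (a sketch: the directory path and `location_id` are placeholders, and the location must already exist in the test database):
```bash
# Minimal bulk-import CSV; adjust the path and location_id to real test data.
cat > /tmp/bulk_test.csv <<'EOF'
location_name,location_id,directory_path,date_range,sample_rate,file_count
"Test Location","testloc12345","/path/to/test/files","test-2024","48000","10"
EOF
```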
**Verify results with SQL:**
```sql
-- Check clusters created
SELECT name FROM cluster WHERE location_id = 'testloc12345';
-- Check files imported
SELECT COUNT(*) FROM file WHERE cluster_id IN (
  SELECT id FROM cluster WHERE location_id = 'testloc12345'
);
```
#### 4. Resources and Prompts Tests
**Resources and prompts:**
```bash
./test_resources_prompts.sh | jq '.'
```
Tests:
- Schema resources (full schema, per-table schemas)
- All 6 prompts (workflow templates)
**All prompts test:**
```bash
./test_all_prompts.sh > prompts_test.txt 2>&1
```
Tests all 6 SQL workflow prompts:
1. `query_active_datasets` - Dataset querying patterns
2. `explore_database_schema` - Interactive schema exploration
3. `explore_location_hierarchy` - Hierarchy navigation with JOINs
4. `query_location_data` - Location analysis with filtering/aggregates
5. `analyze_cluster_files` - File analysis with aggregate functions
6. `system_status_check` - Comprehensive health check
### Analyzing Test Results
**Count responses:**
```bash
rg '"result":' test_output.txt | wc -l
```
**Check for errors:**
```bash
rg '"isError":true' test_output.txt
rg -i "error" test_output.txt | grep -v '"isError"'
```
**View specific test:**
```bash
rg -A 20 "Test 5:" test_output.txt
```
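For a one-shot pass/fail summary of a test run, the same markers can be counted together (a sketch):
```bash
# Count responses and errored tool calls in one pass; rg -c prints the
# match count and exits non-zero when there are no matches.
responses=$(rg -c '"result":' test_output.txt || echo 0)
errors=$(rg -c '"isError":true' test_output.txt || echo 0)
echo "responses=$responses errors=$errors"
```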
## Manual JSON-RPC Testing
You can send messages manually via stdin:
```bash
./skraak_mcp ./db/test.duckdb
```
Then type these JSON-RPC messages (one per line):
### 1. Initialize
```json
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}
```
### 2. List Tools
```json
{"jsonrpc":"2.0","id":2,"method":"tools/list","params":{}}
```
### 3. Call get_current_time
```json
{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"get_current_time","arguments":{}}}
```
### 4. Call execute_sql (simple query)
```json
{"jsonrpc":"2.0","id":4,"method":"tools/call","params":{"name":"execute_sql","arguments":{"query":"SELECT * FROM dataset WHERE active = true LIMIT 10"}}}
```
### 5. Call execute_sql (parameterized query)
```json
{"jsonrpc":"2.0","id":5,"method":"tools/call","params":{"name":"execute_sql","arguments":{"query":"SELECT * FROM location WHERE dataset_id = ?","parameters":["vgIr9JSH_lFj"]}}}
```
### 6. Get schema resource
```json
{"jsonrpc":"2.0","id":6,"method":"resources/read","params":{"uri":"schema://full"}}
```
### 7. Get table schema
```json
{"jsonrpc":"2.0","id":7,"method":"resources/read","params":{"uri":"schema://table/dataset"}}
```
### 8. Get prompt
```json
{"jsonrpc":"2.0","id":8,"method":"prompts/get","params":{"name":"query_active_datasets"}}
```
### 9. Create dataset (no id = create mode)
```json
{"jsonrpc":"2.0","id":9,"method":"tools/call","params":{"name":"create_or_update_dataset","arguments":{"name":"Test Dataset","description":"Testing create tool","type":"test"}}}
```
### 10. Update dataset (id provided = update mode)
```json
{"jsonrpc":"2.0","id":10,"method":"tools/call","params":{"name":"create_or_update_dataset","arguments":{"id":"YOUR_DATASET_ID","name":"Updated Name"}}}
```
### 11. Create location
```json
{"jsonrpc":"2.0","id":11,"method":"tools/call","params":{"name":"create_or_update_location","arguments":{"dataset_id":"YOUR_DATASET_ID","name":"Test Location","latitude":-41.2865,"longitude":174.7762,"timezone_id":"Pacific/Auckland"}}}
```
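The same messages can also be piped in non-interactively, which is handy for scripted smoke tests (a sketch covering steps 1-3):
```bash
# Initialize, list tools, and call get_current_time in one shot.
printf '%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/list","params":{}}' \
  '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"get_current_time","arguments":{}}}' \
  | ./skraak_mcp ./db/test.duckdb
```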
## Expected Responses
### Initialize Response
```json
{
"jsonrpc":"2.0",
"id":1,
"result":{
"capabilities":{"logging":{},"tools":{"listChanged":true}},
"protocolVersion":"2024-11-05",
"serverInfo":{"name":"skraak_mcp","version":"v1.0.0"}
}
}
```
### List Tools Response
Returns 10 tools:
- `get_current_time` - Time utility
- `execute_sql` - Generic SQL query execution
- `create_or_update_dataset`, `create_or_update_location`, `create_or_update_cluster`, `create_or_update_pattern`
- `import_audio_files`, `import_file`, `import_ml_selections`, `bulk_file_import`
### Execute SQL Response
```json
{
"jsonrpc":"2.0",
"id":4,
"result":{
"content":[{
"type":"text",
"text":"{\"columns\":[\"id\",\"name\",\"type\",\"active\"],\"rows\":[[\"abc123\",\"Dataset 1\",\"organise\",true],[\"def456\",\"Dataset 2\",\"test\",true]],\"row_count\":2,\"total_columns\":4,\"has_more\":false}"
}]
}
}
```
### Schema Resource Response
```json
{
"jsonrpc":"2.0",
"id":6,
"result":{
"contents":[{
"uri":"schema://full",
"mimeType":"text/plain",
"text":"-- Full 348-line database schema..."
}]
}
}
```
## SQL Query Examples
### Basic Queries
**Get all active datasets:**
```sql
SELECT id, name, type, description, active
FROM dataset
WHERE active = true
ORDER BY type, name;
```
**Get locations for a dataset (parameterized):**
```json
{
"query": "SELECT id, name, latitude, longitude FROM location WHERE dataset_id = ? AND active = true",
"parameters": ["vgIr9JSH_lFj"]
}
```
### JOINs
**Dataset hierarchy with counts:**
```sql
SELECT
  d.name as dataset,
  COUNT(DISTINCT l.id) as location_count,
  COUNT(DISTINCT c.id) as cluster_count,
  COUNT(f.id) as file_count
FROM dataset d
LEFT JOIN location l ON d.id = l.dataset_id
LEFT JOIN cluster c ON l.id = c.location_id
LEFT JOIN file f ON c.id = f.cluster_id
WHERE d.active = true
GROUP BY d.name
ORDER BY d.name;
```
### Aggregates
**Cluster file statistics:**
```sql
SELECT
  COUNT(*) as total_files,
  SUM(duration) as total_duration,
  AVG(duration) as avg_duration,
  MIN(timestamp_local) as first_recording,
  MAX(timestamp_local) as last_recording
FROM file
WHERE cluster_id = ? AND active = true;
```
### Temporal Analysis
**Daily recording counts:**
```sql
SELECT
  DATE_TRUNC('day', timestamp_local) as day,
  COUNT(*) as recordings,
  SUM(duration) as total_seconds
FROM file
WHERE active = true
  AND timestamp_local >= '2024-01-01'
GROUP BY day
ORDER BY day
LIMIT 100;
```
## Troubleshooting
**Server exits immediately**: Normal when stdin is closed - in MCP protocol mode the server waits for JSON-RPC input on stdin and exits on EOF
**"Usage: ./skraak_mcp <path>"**: You must provide the database path argument
**JSON parsing errors**: Each JSON message must be on a single line
**No response**: Server outputs to stdout; notifications may appear between responses
**Tool not found**: Initialize the connection first before calling tools
**Database connection failed**: Check the database path exists and is readable
**SQL syntax error**: Check query syntax, use schema resources to verify table/column names
**Test output too large**: Always redirect large test outputs to files, then use `rg` to search
**Validation error**: Check tool input schema - all required fields must be present and valid
**Database locked**: Make sure you're not running multiple tests simultaneously on the same database
## Best Practices
1. **Always use test database** (`test.duckdb`) for testing, never production (`skraak.duckdb`)
2. **Redirect large outputs to files** to avoid token overflow
3. **Use parameterized queries** (? placeholders) for filtering by user input
4. **Include `WHERE active = true`** for main tables (dataset, location, cluster, file)
5. **Use LIMIT** to restrict large result sets
6. **Query existing patterns** before creating new ones (use `execute_sql` to check if pattern exists)
7. **Validate IDs** before using them in write operations
8. **Check error messages** carefully - they contain specific validation details
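Practices 3-5 combine naturally in a single `execute_sql` call; a sketch (the location ID is illustrative):
```bash
# Parameterized filter + active-only rows + LIMIT in one tool call.
printf '%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"execute_sql","arguments":{"query":"SELECT id, name FROM cluster WHERE location_id = ? AND active = true LIMIT 20","parameters":["testloc12345"]}}}' \
  | ./skraak_mcp ./db/test.duckdb
```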
## Running Unit Tests
Go unit tests cover all utility packages:
```bash
# Run all tests
go test ./...
# Run specific package
go test ./utils/
# Run with coverage
go test -cover ./utils/
# View coverage report
go test -coverprofile=coverage.out ./utils/
go tool cover -html=coverage.out
```
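To iterate on a single failing test, the standard `-run` (regex on test names) and `-v` flags apply; the test name below is hypothetical:
```bash
# Run one test by name with verbose output
go test -v -run TestImportFile ./utils/
```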
**Test coverage: 91.5%** across 170+ tests