Troubleshooting
Diagnose and resolve common issues with BunnyDB replication. This guide covers error messages, debugging steps, and recovery procedures.
Common Errors and Solutions
1. Replication Slot Already Exists
Error Message:
ERROR: replication slot "bunny_slot_my_mirror" already exists
Cause: A replication slot from a previous run still exists on the source database.
Solution:
Use the retry endpoint to drop and recreate the slot:
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/retry \
-H "Authorization: Bearer <token>"
The retry operation automatically drops the slot and restarts the mirror.
Dropping a replication slot discards any WAL data that hasn’t been replicated yet. This may result in data loss if the mirror was mid-sync.
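Before dropping a slot, it can help to see how much WAL it still holds. The sketch below is a hypothetical helper, not part of BunnyDB: it assumes slot names follow the pattern seen in this guide (`bunny_slot_` plus the mirror name with hyphens replaced by underscores) and prints a query built from the same `pg_replication_slots` columns used elsewhere on this page.

```shell
# Hypothetical helper: derive the slot name from a mirror name, ASSUMING the
# "bunny_slot_" prefix and hyphen-to-underscore pattern used in this guide.
mirror_slot_name() {
  printf 'bunny_slot_%s\n' "$(printf '%s' "$1" | tr '-' '_')"
}

# Print a query that reports how much WAL the slot has not yet confirmed.
slot_backlog_sql() {
  cat <<EOF
SELECT slot_name, active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS unreplicated
FROM pg_replication_slots
WHERE slot_name = '$(mirror_slot_name "$1")';
EOF
}
```

For example, `slot_backlog_sql my-mirror` prints a query you can run against the source database with the SQL client of your choice before calling the retry endpoint.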
2. Slot is Active
Error Message:
ERROR: replication slot "bunny_slot_my_mirror" is active for PID 12345
Cause: Another connection (often from a previous worker or incomplete shutdown) is using the replication slot.
Solution:
Use retry endpoint
The RetryNow signal drops the slot before recreating it:
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/retry \
-H "Authorization: Bearer <token>"
If retry fails, terminate the connection manually
Terminate the backend that holds the slot (the query also reports its PID):
SELECT
slot_name,
active_pid,
pg_terminate_backend(active_pid) AS terminated
FROM pg_replication_slots
WHERE slot_name = 'bunny_slot_my_mirror' AND active;
Retry the mirror again
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/retry \
-H "Authorization: Bearer <token>"
3. Workflow Execution Already Completed
Error Message:
ERROR: workflow execution already completed
Cause: The Temporal workflow has finished (succeeded or failed), but BunnyDB is trying to send a signal or query to it.
Solution:
Use the retry endpoint to start a fresh workflow:
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/retry \
-H "Authorization: Bearer <token>"
This creates a new workflow execution from scratch and restarts the mirror.
This error often occurs after a mirror has been stopped or has failed.
4. Heartbeat Timeout
Error Message:
ERROR: activity heartbeat timeout
Cause: The worker activity stopped sending heartbeats to Temporal, often because:
- Worker process crashed or was killed
- Activity is stuck in a long-running operation
- Network issues between worker and Temporal
Solution:
Check worker logs
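docker compose logs bunny-worker | tail -100
Look for crash messages, panics, or OOM errors.
Verify worker is running
docker compose ps bunny-worker
If stopped, restart it:
docker compose up -d bunny-worker
Check Temporal connectivity
Verify worker can reach Temporal:
docker compose exec bunny-worker curl temporal:7233
Port 7233 serves gRPC rather than plain HTTP, so curl will typically report a protocol error even when the connection works. A "connection refused", timeout, or name resolution failure is what points to a connectivity problem.
Retry the mirror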
Once the worker is healthy:
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/retry \
-H "Authorization: Bearer <token>"
BunnyDB sends heartbeats during long-running operations like snapshot and batch apply. Timeouts indicate the worker isn’t processing activities.
5. Relation Does Not Exist (Destination)
Error Message:
ERROR: relation "public.users" does not exist
Cause: The table hasn’t been created on the destination database yet, typically because:
- Initial snapshot was skipped (do_initial_snapshot: false)
- Schema sync hasn’t run
- Table creation failed during snapshot
Solution:
Sync the schema to create missing tables:
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/sync-schema \
-H "Authorization: Bearer <token>"
This creates tables on the destination without re-copying data.
6. Operation Already in Progress (409)
Error Message:
HTTP 409: Another operation is already in progress
Cause: BunnyDB is processing another signal (pause, resume, resync, etc.) and cannot accept a new operation simultaneously.
Solution:
Wait for the current operation to complete, then retry:
# Check current status
curl http://localhost:8112/v1/mirrors/my-mirror \
-H "Authorization: Bearer <token>"
# Wait a few seconds
sleep 5
# Retry your operation
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/pause \
-H "Authorization: Bearer <token>"
Only one control operation (pause, resume, resync, retry, sync-schema) can be active at a time. This prevents conflicting state changes.
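The wait-then-retry advice can be wrapped in a small loop. This is a minimal sketch, not part of BunnyDB: `retry_on_conflict`, `BUNNY_API`, and `BUNNY_TOKEN` are hypothetical names, and it assumes the API signals a busy mirror only through the 409 status code shown above.

```shell
# Sketch: repeat a POST control operation while the API answers 409.
# BUNNY_API and BUNNY_TOKEN are hypothetical environment variables.
BUNNY_API="${BUNNY_API:-http://localhost:8112}"

# Usage: retry_on_conflict <endpoint-path> [max_attempts] [delay_seconds]
# Prints the final HTTP status code; returns 0 only for a 2xx response.
retry_on_conflict() {
  endpoint=$1; max=${2:-5}; delay=${3:-5}
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    # -o /dev/null discards the body; -w prints only the status code.
    code=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
      "${BUNNY_API}${endpoint}" -H "Authorization: Bearer ${BUNNY_TOKEN}")
    if [ "$code" != "409" ]; then
      echo "$code"
      case "$code" in 2??) return 0 ;; *) return 1 ;; esac
    fi
    attempt=$((attempt + 1))
    # Sleep only if another attempt will follow.
    if [ "$attempt" -le "$max" ]; then sleep "$delay"; fi
  done
  echo "409"
  return 1
}
```

For example, `retry_on_conflict /v1/mirrors/my-mirror/pause 5 5` tries up to five times, five seconds apart. Any status other than 409 ends the loop immediately, so real errors are not retried blindly.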
7. Mirror Must Be Paused to Update Tables
Error Message:
HTTP 409: Mirror must be paused to update tables
Cause: Attempted to update table mappings while the mirror is running.
Solution:
Pause the mirror
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/pause \
-H "Authorization: Bearer <token>"
Update table mappings
curl -X PUT http://localhost:8112/v1/mirrors/my-mirror/tables \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"table_mappings": [...]
}'
Resume the mirror
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/resume \
-H "Authorization: Bearer <token>"
8. LSN Not Advancing
Symptom: Mirror status shows last_lsn is not changing over time.
Causes:
- No new changes on source database
- Replication slot inactive
- Mirror paused or in error state
- Publication not configured correctly
Diagnosis:
Check mirror status
curl http://localhost:8112/v1/mirrors/my-mirror \
-H "Authorization: Bearer <token>" | jq '.status, .last_lsn, .error_message'
Verify replication slot on source
SELECT
slot_name,
active,
confirmed_flush_lsn,
pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_name = 'bunny_slot_my_mirror';
If active = false, the mirror isn’t connected.
Check publication
SELECT * FROM pg_publication_tables
WHERE pubname = 'bunny_pub_my_mirror';
Verify all expected tables are listed.
Generate test changes
Insert/update/delete rows in source tables:
INSERT INTO public.users (username) VALUES ('test-user');
Wait for cdc_sync_interval_seconds, then check if LSN advanced.
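Comparing two LSN readings by eye is error-prone, because plain string comparison gives the wrong order for values such as `9/0` and `10/0`. The sketch below assumes last_lsn is reported in PostgreSQL's usual `high/low` hexadecimal form; `lsn_to_int` and `lsn_advanced` are hypothetical helper names.

```shell
# Convert a PostgreSQL LSN such as "16/B374D848" to a single integer.
# The part before the slash is the high 32 bits, the part after is the low 32 bits.
lsn_to_int() {
  hi=${1%%/*}
  lo=${1##*/}
  echo $(( (0x$hi << 32) + 0x$lo ))
}

# Succeeds (exit status 0) when the second LSN is strictly ahead of the first.
lsn_advanced() {
  [ "$(lsn_to_int "$2")" -gt "$(lsn_to_int "$1")" ]
}
```

Record last_lsn from the status call before generating test changes, record it again after the sync interval, and pass both values to `lsn_advanced`. A non-zero exit status means the position did not move.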
Solution:
If the slot is inactive or the mirror is in an error state:
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/retry \
-H "Authorization: Bearer <token>"
If the publication is missing tables, use schema sync:
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/sync-schema \
-H "Authorization: Bearer <token>"
9. Foreign Key Violations
Error Message:
ERROR: insert or update on table "orders" violates foreign key constraint "fk_user_id"
Cause: BunnyDB applies batches of changes in parallel, which can temporarily violate foreign key constraints if parent and child rows arrive out of order.
Solution:
BunnyDB automatically handles this by using DEFERRABLE INITIALLY DEFERRED constraints on destination tables. This defers constraint checks until transaction commit.
Ensure constraints are deferrable:
-- Check existing constraints
SELECT
conname,
contype,
condeferrable,
condeferred
FROM pg_constraint
WHERE conrelid = 'public.orders'::regclass;
-- Make constraint deferrable
ALTER TABLE public.orders
DROP CONSTRAINT fk_user_id,
ADD CONSTRAINT fk_user_id
FOREIGN KEY (user_id)
REFERENCES public.users(id)
DEFERRABLE INITIALLY DEFERRED;
BunnyDB creates destination tables with deferrable constraints during initial snapshot. If you’ve created tables manually, ensure constraints are deferrable.
Debugging Steps
When encountering an issue, follow these steps systematically:
1. Check Mirror Status
curl http://localhost:8112/v1/mirrors/my-mirror \
-H "Authorization: Bearer <token>" | jq '.'
Look for:
- status field (should be running)
- error_message (describes current error)
- error_count (number of consecutive errors)
2. Review Logs
curl "http://localhost:8112/v1/mirrors/my-mirror/logs?level=ERROR&limit=20" \
-H "Authorization: Bearer <token>" | jq '.logs[] | {created_at, message, details}'
Error logs reveal:
- What operation failed
- Why it failed (connection, SQL error, timeout)
- When it started failing
3. Check Temporal UI
Navigate to http://localhost:8085 and:
- Find the workflow for your mirror
- Check workflow status (Running, Failed, Completed)
- Review activity failures and stack traces
- Examine workflow history for timing issues
4. Check Docker Logs
docker compose logs --tail=100 bunny-worker
Worker logs show:
- Low-level errors not captured in API logs
- Panics or crashes
- Connection issues
- Temporal workflow errors
5. Verify Source Database
-- Check replication slot
SELECT * FROM pg_replication_slots
WHERE slot_name LIKE 'bunny_%';
-- Check publication
SELECT * FROM pg_publication_tables
WHERE pubname LIKE 'bunny_%';
-- Check WAL level
SHOW wal_level;
-- Check slot lag
SELECT
slot_name,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS lag
FROM pg_replication_slots
WHERE slot_name LIKE 'bunny_%';
6. Verify Destination Database
-- Check if tables exist
SELECT schemaname, tablename
FROM pg_tables
WHERE schemaname = 'public';
-- Check write activity (cumulative insert/update/delete counters, not exact row counts)
SELECT
schemaname || '.' || tablename AS table,
n_tup_ins AS inserts,
n_tup_upd AS updates,
n_tup_del AS deletes
FROM pg_stat_user_tables
WHERE schemaname = 'public';
-- Check for locks
SELECT
relation::regclass AS table,
mode,
granted
FROM pg_locks
WHERE relation IS NOT NULL;
Recovery Procedures
Stuck Mirror
Symptoms: Mirror status is running, but the LSN is not advancing and there are no recent logs.
Recovery:
Pause the mirror
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/pause \
-H "Authorization: Bearer <token>"
Resume the mirror
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/resume \
-H "Authorization: Bearer <token>"
If still stuck, use retry
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/retry \
-H "Authorization: Bearer <token>"
Data Drift
Symptoms: Destination data doesn’t match source, row counts differ.
Recovery:
Resync the specific table:
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/resync/public.users \
-H "Authorization: Bearer <token>"
This truncates and re-snapshots only that table.
Resync truncates destination tables. Ensure you’re not losing data unique to the destination.
Schema Drift
Symptoms: Source and destination schemas don’t match (columns added/removed/changed).
Recovery:
curl -X POST http://localhost:8112/v1/mirrors/my-mirror/sync-schema \
-H "Authorization: Bearer <token>"
Schema sync:
- Detects schema differences
- Generates and applies DDL on destination
- Preserves existing data
Schema sync drops and recreates the replication slot to ensure schema changes are detected properly.
Complete Failure
Symptoms: Mirror cannot be recovered through retry/resync.
Recovery:
Delete the mirror
curl -X DELETE http://localhost:8112/v1/mirrors/my-mirror \
-H "Authorization: Bearer <token>"
This cleans up:
- Temporal workflow
- Replication slot on source
- Publication on source
- BunnyDB metadata
Verify cleanup on source
SELECT * FROM pg_replication_slots WHERE slot_name = 'bunny_slot_my_mirror';
-- Should return 0 rows
SELECT * FROM pg_publication WHERE pubname = 'bunny_pub_my_mirror';
-- Should return 0 rows
Recreate the mirror
curl -X POST http://localhost:8112/v1/mirrors \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"name": "my-mirror",
"source_peer": "source-db",
"destination_peer": "dest-db",
"table_mappings": [...]
}'
Performance Issues
Slow Snapshot
Symptoms: Initial snapshot takes too long.
Diagnosis:
# Check snapshot progress in logs
curl "http://localhost:8112/v1/mirrors/my-mirror/logs?search=snapshot" \
-H "Authorization: Bearer <token>"
Solutions:
- Increase parallelism:
{ "snapshot_max_parallel_workers": 8, "snapshot_num_tables_in_parallel": 4 }
- Larger partitions:
{ "snapshot_num_rows_per_partition": 1000000 }
- Check database resources: CPU, memory, I/O on source and destination
- Network bandwidth: Slow network between source and destination
High Replication Lag
Symptoms: LSN advancing but lagging far behind source.
Diagnosis:
Check lag on source:
SELECT
slot_name,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS lag
FROM pg_replication_slots;
Solutions:
- Increase batch size:
{ "cdc_batch_size": 50000 }
- Decrease sync interval:
{ "cdc_sync_interval_seconds": 30 }
- Scale worker resources: More CPU, RAM
- Optimize destination: Add indexes, tune PostgreSQL settings
- Check destination locks: Long-running transactions blocking inserts
Getting Help
If you’re still stuck after trying these troubleshooting steps:
- Gather diagnostic information:
  - Mirror status (GET /v1/mirrors/{name})
  - Recent error logs (GET /v1/mirrors/{name}/logs?level=ERROR)
  - Temporal workflow ID and status
  - Docker worker logs
  - Source database replication slot status
- Check existing issues: Search GitHub issues for similar problems
- Open an issue: Include all diagnostic information
- Community support: Join discussions on GitHub
When reporting issues, always include BunnyDB version, PostgreSQL versions (source and dest), mirror configuration, and complete error messages.
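Collecting the API and worker output listed above can be scripted. The following is a minimal sketch using only the endpoints and Docker command shown in this guide; `collect_diagnostics`, `BUNNY_API`, and `BUNNY_TOKEN` are hypothetical names, and the Temporal and source database details still need to be gathered separately.

```shell
# Sketch: save mirror status, recent error logs, and worker logs to a directory.
# BUNNY_API and BUNNY_TOKEN are hypothetical environment variables.
BUNNY_API="${BUNNY_API:-http://localhost:8112}"

# Usage: collect_diagnostics <mirror-name> <output-dir>
collect_diagnostics() {
  mirror=$1; out=$2
  mkdir -p "$out" || return 1
  auth="Authorization: Bearer ${BUNNY_TOKEN}"
  # Mirror status and the most recent error logs from the API.
  curl -s "${BUNNY_API}/v1/mirrors/${mirror}" -H "$auth" > "$out/status.json"
  curl -s "${BUNNY_API}/v1/mirrors/${mirror}/logs?level=ERROR&limit=20" \
    -H "$auth" > "$out/error-logs.json"
  # Last 100 lines of worker output from Docker Compose.
  docker compose logs --tail=100 bunny-worker > "$out/worker.log" 2>&1
  echo "Diagnostics written to $out"
}
```

Running `collect_diagnostics my-mirror ./bunny-diagnostics` leaves three files to attach to an issue. Review them first, since logs and status output may contain hostnames or other details you do not want to publish.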