Concepts
This page explains the core concepts and terminology used throughout BunnyDB.
Peers
Peers are database connection configurations that represent either a source or destination PostgreSQL database.
Fields
| Field | Type | Description |
|---|---|---|
| name | string | Unique identifier for the peer (e.g., “prod-primary”, “analytics-replica”) |
| host | string | Database hostname or IP address |
| port | integer | Database port (default: 5432) |
| user | string | PostgreSQL username |
| password | string | PostgreSQL password (encrypted at rest) |
| database | string | Target database name |
| ssl_mode | string | SSL mode: disable, require, verify-ca, verify-full |
Example
```json
{
  "name": "prod-primary",
  "host": "db.example.com",
  "port": 5432,
  "user": "replication_user",
  "password": "secure_password",
  "database": "production",
  "ssl_mode": "verify-full"
}
```

Peer credentials are stored in the catalog database. Use strong passwords and enable SSL for production deployments.
Requirements
Source Peer:
- Must have `wal_level = logical` in `postgresql.conf`
- User must have the `REPLICATION` privilege
- Sufficient `max_replication_slots` and `max_wal_senders`

Destination Peer:
- User must have `CREATE` privileges on target schemas
- Sufficient disk space for replicated data
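The source-peer checks above can be sketched as a small validation helper. This is illustrative only, not part of BunnyDB: the function name and the settings dict are assumptions, and in practice the values would come from queries such as `SHOW wal_level` against the source database.

```python
# Sketch: validate source-peer settings fetched from PostgreSQL.
# The dict shape and function name are illustrative, not BunnyDB API.

def check_source_peer(settings: dict) -> list[str]:
    """Return a list of problems; an empty list means the peer qualifies."""
    problems = []
    if settings.get("wal_level") != "logical":
        problems.append("wal_level must be 'logical'")
    # Each mirror consumes at least one replication slot and one WAL sender.
    if int(settings.get("max_replication_slots", 0)) < 1:
        problems.append("max_replication_slots must be >= 1")
    if int(settings.get("max_wal_senders", 0)) < 1:
        problems.append("max_wal_senders must be >= 1")
    return problems

print(check_source_peer({"wal_level": "replica",
                         "max_replication_slots": 10,
                         "max_wal_senders": 10}))
# A non-logical wal_level is reported as a problem.
```

Running such a check before creating a mirror surfaces misconfiguration early, instead of failing during slot creation.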
Mirrors
Mirrors are replication pipelines that continuously sync data from a source peer to a destination peer.
Fields
| Field | Type | Description |
|---|---|---|
| name | string | Unique mirror identifier |
| source_peer | string | Name of the source peer |
| destination_peer | string | Name of the destination peer |
| table_mappings | array | List of tables to replicate (see below) |
| do_initial_snapshot | boolean | Whether to copy existing data before CDC |
| max_batch_size | integer | Maximum changes per batch (default: 1000) |
| idle_timeout_seconds | integer | Apply batch after this many seconds of inactivity (default: 60) |
| replicate_indexes | boolean | Whether to replicate indexes (default: true) |
Example
```json
{
  "name": "prod_to_analytics",
  "source_peer": "prod-primary",
  "destination_peer": "analytics-replica",
  "table_mappings": [
    {
      "source_table": "public.users",
      "destination_table": "public.users"
    },
    {
      "source_table": "public.orders",
      "destination_table": "public.orders",
      "exclude_columns": ["credit_card_number"]
    }
  ],
  "do_initial_snapshot": true,
  "max_batch_size": 2000,
  "idle_timeout_seconds": 30,
  "replicate_indexes": true
}
```

Each mirror runs as a long-lived Temporal workflow. You can have multiple mirrors per peer, allowing selective replication of different table sets.
Table Mappings
Table Mappings define which tables to replicate and how they map from source to destination.
Fields
| Field | Type | Description |
|---|---|---|
| source_table | string | Fully-qualified source table name (e.g., public.users) |
| destination_table | string | Fully-qualified destination table name |
| partition_key | string | Column to use for partitioned replication (optional) |
| exclude_columns | array | List of columns to exclude from replication |
Examples
Simple mapping (same table name):
```json
{
  "source_table": "public.users",
  "destination_table": "public.users"
}
```

Different schema:

```json
{
  "source_table": "public.users",
  "destination_table": "analytics.dim_users"
}
```

Exclude sensitive columns:

```json
{
  "source_table": "public.users",
  "destination_table": "public.users",
  "exclude_columns": ["password_hash", "ssn"]
}
```

Excluded columns must be nullable or have default values on the destination. BunnyDB will not replicate data for these columns.
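Conceptually, applying `exclude_columns` amounts to dropping those keys from each replicated row before it is written to the destination. The following is an illustrative sketch, not BunnyDB's internal implementation; the function name and row representation are assumptions.

```python
# Sketch: applying a table mapping's exclude_columns to a replicated row.
# The destination must supply NULL or a default for the dropped columns.

def apply_mapping(row: dict, exclude_columns: list[str]) -> dict:
    """Return the row with excluded columns removed."""
    return {col: val for col, val in row.items() if col not in exclude_columns}

row = {"id": 1, "email": "alice@example.com", "ssn": "123-45-6789"}
print(apply_mapping(row, ["ssn", "password_hash"]))
# {'id': 1, 'email': 'alice@example.com'}
```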
LSN (Log Sequence Number)
LSN is PostgreSQL’s internal identifier for a position in the Write-Ahead Log (WAL). It’s a 64-bit integer that increases monotonically as transactions are written.
Format
LSNs are typically displayed in hex format: 0/1A2B3C4D
- First part: the high-order 32 bits of the LSN
- Second part: the low-order 32 bits (a byte offset within the WAL)
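The hex notation maps directly onto the underlying 64-bit integer, which also makes byte-level lag easy to compute client-side (the same arithmetic `pg_wal_lsn_diff()` performs on the server). A minimal sketch; the helper names are illustrative:

```python
# Sketch: converting between PostgreSQL's "X/Y" hex LSN notation and the
# underlying 64-bit integer, plus a byte-lag calculation.

def parse_lsn(text: str) -> int:
    """Parse 'X/Y' hex notation into a 64-bit integer position."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def format_lsn(value: int) -> str:
    """Format a 64-bit position back into 'X/Y' hex notation."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"

def lag_bytes(current: str, last: str) -> int:
    """Bytes of WAL between the source's current LSN and a consumed LSN."""
    return parse_lsn(current) - parse_lsn(last)

print(parse_lsn("0/1A2B3C4D"))                 # 439041101
print(format_lsn(439041101))                   # 0/1A2B3C4D
print(lag_bytes("0/1A2B3C4D", "0/1A2B0000"))   # 15437
```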
BunnyDB Usage
BunnyDB tracks the last_lsn for each mirror to remember where replication should resume from:
- Initial snapshot: Captures the LSN at the start of the snapshot transaction
- CDC sync: Updates `last_lsn` after each successful batch application
- Resume: Starts replication from `last_lsn` when resuming a paused mirror
Viewing LSN
Get the current LSN on source database:
```sql
SELECT pg_current_wal_lsn();
```

Get the mirror’s last processed LSN:

```bash
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:8112/v1/mirrors/my_mirror | jq .last_lsn
```

Replication lag can be calculated by comparing the source’s current LSN with the mirror’s `last_lsn`. Large differences indicate the mirror is falling behind.
Batches
Batches are groups of WAL changes that are applied to the destination database together in a single transaction.
Why Batching?
- Performance: Reduces transaction overhead compared to applying each change individually
- Consistency: Ensures related changes are applied atomically
- Ordering: Maintains transaction boundaries when possible
Batch Controls
max_batch_size:
- Maximum number of changes in a single batch
- Larger batches = higher throughput, more memory usage
- Default: 1000
idle_timeout_seconds:
- Apply batch after this many seconds of inactivity
- Prevents lag during low-traffic periods
- Default: 60 seconds
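The two controls combine into a simple flush rule: apply the batch when it is full, or when changes have been sitting idle too long. A minimal sketch of that decision, with illustrative names (not BunnyDB's internal API):

```python
# Sketch: the two flush triggers for a CDC batch.

def should_flush(pending: int, idle_seconds: float,
                 max_batch_size: int = 1000,
                 idle_timeout_seconds: int = 60) -> bool:
    """Decide whether the accumulated batch should be applied now."""
    if pending >= max_batch_size:
        return True   # size trigger: the batch is full
    # idle trigger: some changes are waiting and no new ones arrived
    return pending > 0 and idle_seconds >= idle_timeout_seconds

print(should_flush(pending=1000, idle_seconds=0))   # True  (size trigger)
print(should_flush(pending=3, idle_seconds=75))     # True  (idle trigger)
print(should_flush(pending=3, idle_seconds=10))     # False (keep accumulating)
```

The idle trigger is what bounds lag during low-traffic periods: even three changes are applied within `idle_timeout_seconds`, rather than waiting for a full batch.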
Example Batch Application
Suppose the source database receives these operations:
```sql
INSERT INTO users (id, email) VALUES (1, 'alice@example.com');
INSERT INTO users (id, email) VALUES (2, 'bob@example.com');
UPDATE users SET email = 'alice.new@example.com' WHERE id = 1;
DELETE FROM users WHERE id = 2;
```

BunnyDB will batch these into a single transaction on the destination:

```sql
BEGIN;
SET CONSTRAINTS ALL DEFERRED; -- Defer FK checks
INSERT INTO users (id, email) VALUES (1, 'alice@example.com');
INSERT INTO users (id, email) VALUES (2, 'bob@example.com');
UPDATE users SET email = 'alice.new@example.com' WHERE id = 1;
DELETE FROM users WHERE id = 2;
COMMIT; -- FK constraints validated here
```

Deferred constraints are critical for maintaining referential integrity when operations within a batch reference each other.
Replication Slots
Replication Slots are PostgreSQL’s mechanism for preventing WAL segments from being deleted before a replica has consumed them.
Naming Convention
BunnyDB creates slots with the format: `bunny_slot_{mirror_name}`
Example: For a mirror named prod_to_analytics, the slot is bunny_slot_prod_to_analytics.
How They Work
- When a mirror is created, BunnyDB creates a replication slot on the source
- PostgreSQL tracks the slot’s LSN position
- WAL segments from the slot’s position onward are retained, even if `wal_keep_size` is exceeded
- When the mirror consumes WAL, the slot position advances
- PostgreSQL can then recycle the consumed WAL segments
Viewing Slots
On the source database:
```sql
SELECT slot_name, active, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_name LIKE 'bunny_slot_%';
```

Inactive slots can cause WAL accumulation. If a mirror is paused for too long, disk space may fill up. Monitor slot lag regularly in production.
Cleanup
Slots are automatically dropped when:
- The mirror is terminated via `DELETE /v1/mirrors/:name`
- A `TERMINATE` signal is sent

If a mirror crashes and the slot remains, manually drop it:

```sql
SELECT pg_drop_replication_slot('bunny_slot_my_mirror');
```

Publications
Publications define which tables are available for logical replication on the source database.
Naming Convention
BunnyDB creates publications with the format: `bunny_pub_{mirror_name}`
Example: For a mirror named prod_to_analytics, the publication is bunny_pub_prod_to_analytics.
How They Work
- BunnyDB creates a publication containing the replicated tables
- The publication defines which DML operations are captured (INSERT, UPDATE, DELETE)
- Logical replication decoding uses the publication to filter WAL records
- Only changes to published tables are sent to the replication slot
Viewing Publications
On the source database:
```sql
-- List publications
SELECT pubname, puballtables
FROM pg_publication
WHERE pubname LIKE 'bunny_pub_%';

-- List tables in a publication
SELECT schemaname, tablename
FROM pg_publication_tables
WHERE pubname = 'bunny_pub_my_mirror';
```

Cleanup
Publications are automatically dropped when the mirror is terminated.
Mirror States
Mirrors transition through several states during their lifecycle:
| State | Description |
|---|---|
| CREATED | Mirror configuration created, workflow not yet started |
| SETTING_UP | Creating replication slot and publication on source |
| SNAPSHOT | Copying initial data (if `do_initial_snapshot` is true) |
| RUNNING | Actively replicating changes via CDC |
| PAUSING | Transitioning from RUNNING to PAUSED |
| PAUSED | Replication paused by user, slot retained |
| FAILED | Workflow encountered an unrecoverable error |
| RESYNCING | Full resync in progress (all tables) |
| SYNCING_SCHEMA | Applying schema changes (DDL sync) |
| TERMINATING | Cleaning up slot/publication before shutdown |
| TERMINATED | Mirror stopped, slot/publication dropped |
State Transitions
```
CREATED → SETTING_UP → SNAPSHOT → RUNNING
              ↓                      ↓
            FAILED                 PAUSED
                                     ↓
                                  RUNNING
                                     ↓
                          RESYNCING → RUNNING
                                     ↓
                     SYNCING_SCHEMA → RUNNING
                                     ↓
                       TERMINATING → TERMINATED
```

Use `GET /v1/mirrors/:name` to check the current state and any error messages.
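The lifecycle can be expressed as an adjacency map with a helper that validates a transition. This sketch mirrors only the documented diagram; the real workflow may allow additional edges (for example, entering FAILED from states other than SETTING_UP, or passing through PAUSING between RUNNING and PAUSED):

```python
# Sketch: the mirror lifecycle as a transition table.

TRANSITIONS = {
    "CREATED": {"SETTING_UP"},
    "SETTING_UP": {"SNAPSHOT", "FAILED"},
    "SNAPSHOT": {"RUNNING"},
    "RUNNING": {"PAUSED", "RESYNCING", "SYNCING_SCHEMA", "TERMINATING"},
    "PAUSED": {"RUNNING"},
    "RESYNCING": {"RUNNING"},
    "SYNCING_SCHEMA": {"RUNNING"},
    "TERMINATING": {"TERMINATED"},
}

def can_transition(current: str, target: str) -> bool:
    """True if the diagram permits moving from current to target."""
    return target in TRANSITIONS.get(current, set())

print(can_transition("RUNNING", "PAUSED"))    # True
print(can_transition("PAUSED", "SNAPSHOT"))   # False
```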
Signals
Signals are Temporal’s mechanism for sending commands to running workflows without restarting them.
Available Signals
| Signal | Description | Endpoint |
|---|---|---|
| PAUSE | Pause replication, retain slot | POST /v1/mirrors/:name/pause |
| RESUME | Resume replication from last LSN | POST /v1/mirrors/:name/resume |
| TERMINATE | Stop mirror and drop slot/publication | DELETE /v1/mirrors/:name |
| RESYNC | Full resync of all tables | POST /v1/mirrors/:name/resync |
| RESYNC_TABLE | Resync a single table | POST /v1/mirrors/:name/resync (with table_name) |
| RETRY_NOW | Bypass Temporal backoff, retry immediately | POST /v1/mirrors/:name/retry |
| SYNC_SCHEMA | Apply pending schema changes | POST /v1/mirrors/:name/sync-schema |
Signal Semantics
- Signals are idempotent: Sending the same signal multiple times is safe
- Signals are asynchronous: The API returns immediately; check the mirror state for completion
- Signals are exclusive: If a signal is already in progress, new signals are rejected with HTTP 409
Example: Pause and Resume
```bash
# Pause mirror
curl -X POST http://localhost:8112/v1/mirrors/my_mirror/pause \
  -H "Authorization: Bearer $TOKEN"

# Verify state
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:8112/v1/mirrors/my_mirror | jq .status
# Output: "PAUSED"

# Resume mirror
curl -X POST http://localhost:8112/v1/mirrors/my_mirror/resume \
  -H "Authorization: Bearer $TOKEN"

# Verify state
curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:8112/v1/mirrors/my_mirror | jq .status
# Output: "RUNNING"
```

Signals are handled within the workflow loop, so there may be a short delay (up to `idle_timeout_seconds`) before a signal takes effect.
Resync Strategies
When resyncing tables, BunnyDB supports two strategies:
Truncate Strategy
Pros:
- Simpler implementation
- Lower disk space usage
- Faster for small tables
Cons:
- Destination table is unavailable during resync (downtime)
- Queries fail until resync completes
Process:
- Drop all foreign keys referencing the table
- `TRUNCATE` the destination table
- `COPY` data from the source
- Rebuild indexes on the destination
- Recreate foreign keys
- Validate foreign keys
When to use:
- Non-production environments
- Maintenance windows are acceptable
- Small tables (resync completes quickly)
Swap Strategy
Pros:
- Zero downtime (queries continue during resync)
- Atomic cutover
- Safe rollback if resync fails
Cons:
- Requires ~2x disk space temporarily
- Slightly more complex
- Longer index build time (on shadow table)
Process:
- Create a shadow table: `{table_name}_resync`
- `COPY` data from the source into the shadow table
- Build indexes on the shadow table
- Atomic rename in a transaction:
  `BEGIN; ALTER TABLE {table_name} RENAME TO {table_name}_old; ALTER TABLE {table_name}_resync RENAME TO {table_name}; COMMIT;`
- Drop the old table: `{table_name}_old`
When to use:
- Production environments
- Zero-downtime requirement
- Large tables (downtime would be significant)
During swap resync, ensure sufficient disk space for the shadow table plus indexes. Monitor with `df -h` on the destination host.
Choosing a Strategy
Specify the strategy in the resync request:
```bash
# Truncate strategy (simpler, has downtime)
curl -X POST http://localhost:8112/v1/mirrors/my_mirror/resync \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"strategy": "truncate"}'

# Swap strategy (zero-downtime)
curl -X POST http://localhost:8112/v1/mirrors/my_mirror/resync \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"strategy": "swap"}'
```

Default: If not specified, BunnyDB uses `swap` for production safety.
Next Steps
Now that you understand BunnyDB’s core concepts:
- Review the Architecture page for how these concepts work together
- Explore the API Reference for detailed endpoint documentation
- Check out Guides for common operational scenarios