Background (partition migration with logical replication)
Large PostgreSQL tables eventually run into performance problems. As rows accumulate, queries slow down, indexes bloat, and retention policies become harder to enforce.
In one of our projects, we had a table called analytics.user_activity that logged every user action across an application. It had grown to hundreds of millions of rows, and queries filtering by time (event_time) or organization (org_id) were becoming painfully slow.
Instead of archiving or purging data, we chose to partition the table by event time. But how could we restructure such a massive table without major downtime?
The solution: partition migration with logical replication.
Why Partition?
Partitioning makes sense for append-only or time-series workloads.
Benefits:

- Faster queries: PostgreSQL prunes irrelevant partitions automatically (see the EXPLAIN sketch after this list).
- Simplified retention: Drop or detach old partitions instantly.
- Efficient indexing: Smaller indexes per partition.
- Scalability: Maintenance (VACUUM, ANALYZE, index rebuilds) stays lightweight.
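To make the pruning claim concrete, here is a minimal sketch; the partition name in the comment assumes the daily `_pYYYYMMDD` suffix pg_partman uses later in this post. A query with a tight event_time range should only touch the matching partitions:

```sql
EXPLAIN (COSTS OFF)
SELECT count(*)
FROM analytics.user_activity
WHERE event_time >= '2025-08-20 00:00:00+00'
  AND event_time <  '2025-08-21 00:00:00+00';

-- Expected plan shape: a scan on the single matching daily partition
-- (e.g. user_activity_p20250820); every other partition is pruned
-- away at plan time.
```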
Challenge: Migrating with Minimal Downtime
The biggest problem when retrofitting partitioning onto a large production table is how to move existing data.
- You can't turn an existing table into a partitioned one in place; PARTITION BY is only available at CREATE TABLE time.
- Copying all rows (INSERT INTO new SELECT * FROM old) can take hours and block writes.
- If partitions are created incorrectly, new data can flow into the default partition, breaking retention and performance.
That’s where logical replication comes in. It allows you to replicate data from the old table into the new partitioned one, while keeping them in sync until cutover.
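One prerequisite the rest of this walkthrough takes for granted: the source database must permit logical decoding. A minimal configuration sketch:

```sql
-- On the source: logical replication requires wal_level = logical
-- (changing it requires a server restart).
ALTER SYSTEM SET wal_level = 'logical';

-- Also confirm there is headroom for the new replication slot and sender:
SHOW max_replication_slots;
SHOW max_wal_senders;
```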
Migration Strategy
- Keep the old monolithic table running while building a new partitioned version.
- Set up logical replication to stream changes from old to new.
- Backfill existing data into the partitioned structure.
- Cut over applications to use the new table with minimal downtime (a cutover sketch follows this list).
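The steps below cover everything except the cutover itself, so here is a minimal sketch of how we sequenced it. Comparing LSNs is the standard way to confirm the subscriber has caught up; the pause-writes mechanics depend on your application:

```sql
-- 1. Pause writes to old_user_activity at the application level,
--    then capture the source's current WAL position:
SELECT pg_current_wal_lsn();

-- 2. On the target, wait until the subscription has replayed past that LSN:
SELECT subname, received_lsn, latest_end_lsn
FROM pg_stat_subscription;

-- 3. Once caught up, drop the subscription and point the application
--    at the partitioned table:
DROP SUBSCRIPTION sub_user_activity;
```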
Step-by-Step Migration
1. Install pg_partman
We use pg_partman to automate partition creation and retention policies.
```sql
CREATE SCHEMA partman;
CREATE EXTENSION pg_partman SCHEMA partman;
```
2. Create Partitioned Table
```sql
CREATE TABLE analytics.user_activity (
    activity_id bigserial,
    user_id     bigint NOT NULL,
    org_id      bigint NOT NULL,
    action      text NOT NULL,
    event_time  timestamptz NOT NULL,
    source      text,
    payload     jsonb,
    inserted_at timestamptz DEFAULT now(),
    -- a unique constraint on a partitioned table must include the
    -- partition key, so the primary key is composite
    PRIMARY KEY (activity_id, event_time)
) PARTITION BY RANGE (event_time);
```
3. Configure pg_partman
Find the earliest timestamp in the old table:
```sql
SELECT MIN(event_time) FROM old_user_activity;
```
Start partitions at that timestamp:
```sql
SELECT partman.create_parent(
    p_parent_table    := 'analytics.user_activity',
    p_control         := 'event_time',
    p_type            := 'range',
    p_interval        := '1 day',
    p_premake         := 14,
    p_start_partition := '2025-06-01'
);
```
Pre-generate historical partitions:
```sql
-- create_partition_time expects an array of timestamps, one per partition
SELECT partman.create_partition_time(
    'analytics.user_activity',
    ARRAY(
        SELECT generate_series(
            '2025-06-01'::timestamptz,
            now() + interval '30 days',
            interval '1 day'
        )
    )
);
```
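To verify the partitions exist before wiring up replication, pg_partman can list the children it manages:

```sql
SELECT * FROM partman.show_partitions('analytics.user_activity');
```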
4. Set Up Logical Replication
On source (old table DB):
```sql
CREATE PUBLICATION pub_user_activity FOR TABLE old_user_activity;
```
On target (new partitioned DB):
```sql
CREATE SUBSCRIPTION sub_user_activity
    CONNECTION 'host=source-db port=5432 dbname=mydb user=replicator password=xxxx'
    PUBLICATION pub_user_activity;
```
Now inserts, updates, and deletes on old_user_activity stream into the partitioned table, and the subscription's initial table synchronization (copy_data = true, the default) backfills the existing rows. Two subtleties: updates and deletes require the source table to have a primary key or a REPLICA IDENTITY, and logical replication matches tables by schema-qualified name, so the published table and the partitioned target must actually share a name on their respective databases; we use different names in this post only to keep the two roles easy to tell apart.
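While the tables stay in sync, you can watch replication progress; a quick pair of checks using the standard statistics views:

```sql
-- on the target: per-subscription replay progress
SELECT subname, received_lsn, latest_end_lsn, last_msg_receipt_time
FROM pg_stat_subscription;

-- on the source: how far the slot lags behind current WAL, in bytes
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots;
```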
5. Recreate Indexes
Create indexes at the parent level; with declarative partitioning, PostgreSQL propagates them to every partition, existing and future, including the ones pg_partman premakes:
```sql
CREATE INDEX idx_activity_org_time ON analytics.user_activity (org_id, event_time);
CREATE INDEX idx_activity_user_action ON analytics.user_activity (user_id, action);
CREATE INDEX idx_activity_payload_attr ON analytics.user_activity ((payload->>'session_id'));
```
Because a parent-level CREATE INDEX cascades to existing partitions automatically, no separate step is needed for the historical ones. The trade-off is locking: a single statement builds the index on every partition while holding locks on each, so on very large tables we prefer the lock-friendly pattern sketched below.
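A minimal sketch of that pattern, shown for the (org_id, event_time) index; you would use it instead of the plain CREATE INDEX above, and the partition and index names here are illustrative:

```sql
-- create the index on the parent only; it starts out invalid
CREATE INDEX idx_activity_org_time ON ONLY analytics.user_activity (org_id, event_time);

-- build the matching index on each partition without blocking writes
CREATE INDEX CONCURRENTLY user_activity_p20250601_org_time
    ON analytics.user_activity_p20250601 (org_id, event_time);

-- attach it to the parent index; once every partition is attached,
-- the parent index becomes valid
ALTER INDEX idx_activity_org_time
    ATTACH PARTITION user_activity_p20250601_org_time;
```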
6. Validate
- Compare row counts:
```sql
-- the old and new tables live in different databases here, so run
-- each count where its table lives and compare the results
SELECT count(*) FROM old_user_activity;        -- on the source
SELECT count(*) FROM analytics.user_activity;  -- on the target
```
- Check partition distribution (see also the catalog query after this list):
```sql
\dt+ analytics.user_activity*
```
- Test future inserts:
```sql
INSERT INTO analytics.user_activity (user_id, org_id, action, event_time, source)
VALUES (12345, 77, 'login', '2025-08-21 00:05:00+00', 'web');

-- the row should land in that day's partition
SELECT * FROM analytics.user_activity_p20250821;
```
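For a finer-grained view than \dt+, a catalog query reports approximate row counts per partition (reltuples is a planner estimate, refreshed by ANALYZE):

```sql
SELECT c.oid::regclass     AS partition,
       c.reltuples::bigint AS approx_rows
FROM pg_inherits i
JOIN pg_class c ON c.oid = i.inhrelid
WHERE i.inhparent = 'analytics.user_activity'::regclass
ORDER BY c.relname;
```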
Retention with Detached Partitions
Instead of dropping old data, we configured pg_partman to detach partitions older than 90 days. Detached partitions remain queryable as standalone tables for audits, but normal queries against the parent no longer touch them:
```sql
UPDATE partman.part_config
SET retention = '90 days',
    retention_keep_table = true
WHERE parent_table = 'analytics.user_activity';
```
```sql
SELECT partman.run_maintenance('analytics.user_activity');
```
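run_maintenance must be invoked regularly for retention and partition premaking to keep working; we call it manually above. A sketch of scheduling it, assuming the pg_cron extension is available (pg_partman's background worker is an alternative):

```sql
-- run pg_partman maintenance for all configured parent tables hourly
SELECT cron.schedule(
    'partman-maintenance',
    '0 * * * *',
    $$SELECT partman.run_maintenance()$$
);
```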
Lessons Learned
- Always check the earliest event_time before creating partitions.
- Use logical replication to keep the old and new tables in sync and avoid downtime during migration.
- Create indexes on the parent table and let PostgreSQL propagate them to partitions.
- Detached partitions provide flexible retention without data loss.
Conclusion
By combining logical replication with pg_partman, we migrated analytics.user_activity — a massive time-series style table — into a partitioned structure with near-zero downtime.
The result: queries sped up, retention became trivial, and operational overhead dropped significantly.
Partitioning isn’t just for greenfield projects — with the right approach, you can retrofit it onto existing production workloads safely.