MySQL to PostgreSQL Database Migration: A Practical Case Study
In this case study, learn how migrating from MySQL to PostgreSQL boosted query speed, improved JSON and JOIN performance, and increased scalability for an e-commerce platform.
Migrating a MySQL database to PostgreSQL is a complex process that involves schema conversion, data transfer, query optimization, and performance tuning. In this case study, we document a real-world migration of an e-commerce platform's database from MySQL 8.0 to PostgreSQL 15. The study includes performance benchmarks before and after migration.
Project Overview
| Parameter | Value |
| --- | --- |
| Industry | E-commerce |
| Database Size | ~120 GB |
| Number of Tables | 200+ |
| Queries per Second (QPS) | ~350 |
| Replication Setup | MySQL master-slave, switched to PostgreSQL streaming replication |
| Primary Use Case | OLTP (Online Transaction Processing) |
| Goal | Improve read-heavy performance, complex query execution, and JSON processing |
Pre-Migration Performance Benchmarks
Before migration, we collected key performance metrics on MySQL 8.0 using sysbench and EXPLAIN ANALYZE.
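For instance, individual queries were timed along these lines with EXPLAIN ANALYZE (the table and column names here are illustrative, not the platform's actual schema):
EXPLAIN ANALYZE
SELECT o.id, o.total_price
FROM orders o
JOIN users u ON u.id = o.user_id
WHERE o.total_price > 100;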
1. Read Performance (SELECT Queries)
- Simple SELECT (1M rows) = 120 ms
- JOIN Query (10M rows) = 450 ms
- Aggregate Query (SUM) = 320 ms
- JSON Query (JSON_EXTRACT) = 600 ms
2. Write Performance (INSERT Queries)
- Single-row INSERT = 10 ms
- Bulk INSERT (10K rows) = 850 ms
Migration Steps
Schema Conversion
Schema conversion from MySQL to PostgreSQL includes safe type mapping and the transformation of default values and other column attributes, accounting for differences in SQL syntax between the two database management systems (DBMS).
In MySQL:
CREATE TABLE orders (
id INT AUTO_INCREMENT PRIMARY KEY,
user_id INT NOT NULL,
total_price DECIMAL(10,2) NOT NULL,
created DATETIME DEFAULT CURRENT_TIMESTAMP,
updated DATETIME DEFAULT '0000-00-00 00:00:00',
comment LONGTEXT,
...
rawdata JSON
);
In PostgreSQL:
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INTEGER NOT NULL,
total_price NUMERIC(10,2) NOT NULL,
created TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated TIMESTAMP DEFAULT NULL,
comment TEXT,
...
rawdata JSONB
);
Key Changes
- INT AUTO_INCREMENT is converted to SERIAL
- DATETIME is converted to TIMESTAMP
- DECIMAL is converted to NUMERIC
- DEFAULT '0000-00-00 00:00:00' is converted to DEFAULT NULL
- JSON is converted to JSONB
- LONGTEXT is converted to TEXT
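As a side note on the first mapping: on PostgreSQL 10 and later, a standard identity column is often preferred over SERIAL. A minimal sketch (orders_v2 is a hypothetical table name, not part of the migrated schema):
-- Identity column: the SQL-standard alternative to SERIAL.
CREATE TABLE orders_v2 (
    id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id INTEGER NOT NULL
);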
Data Migration
We used a MySQL-to-PostgreSQL converter for both schema and data migration. It is an efficient option for data transfer because it reads and writes data in bulk using high-speed multithreading.
Migration Time: ~3.5 hours for 120 GB of data
Data Loss: 0% (Validated row counts post-migration)
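For comparison, teams scripting the transfer themselves typically rely on PostgreSQL's COPY for the bulk-load step. A minimal sketch, assuming the MySQL data was first exported to a tab-separated file (the path is an example):
-- Bulk-load a tab-separated export; far faster than row-by-row INSERTs.
COPY orders (id, user_id, total_price, created, updated, comment, rawdata)
FROM '/tmp/orders.tsv' WITH (FORMAT text, NULL '\N');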
Migration of Triggers
MySQL and PostgreSQL support triggers but differ in syntax and structure: PostgreSQL requires a separate trigger function instead of directly embedded trigger logic.
MySQL Version:
DELIMITER //
CREATE TRIGGER before_order_insert
BEFORE INSERT ON orders
FOR EACH ROW
BEGIN
SET NEW.created_at = NOW();
END;
//
DELIMITER ;
PostgreSQL Equivalent:
CREATE OR REPLACE FUNCTION before_order_insert_function() RETURNS TRIGGER AS $$
BEGIN
NEW.created_at := NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER before_order_insert
BEFORE INSERT ON orders
FOR EACH ROW
EXECUTE FUNCTION before_order_insert_function();
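A quick way to verify that the migrated trigger fires as expected (this assumes orders has the created_at column referenced by the trigger):
-- created_at should be populated by the trigger, not by the client.
INSERT INTO orders (user_id, total_price) VALUES (1, 99.90);
SELECT id, created_at FROM orders WHERE user_id = 1;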
Validating MySQL to PostgreSQL Migration
Once the MySQL database is migrated to PostgreSQL, validation ensures data integrity, query accuracy, and application compatibility.
1. Schema Validation. Verify that the schema has been correctly migrated, ensuring:
- Tables, columns, and data types match.
In MySQL:
SHOW CREATE TABLE my_table;
In PostgreSQL:
SELECT table_name, column_name, data_type
FROM information_schema.columns
WHERE table_name = 'my_table';
- Indexes, constraints, and relationships are intact.
In MySQL:
SHOW INDEX FROM my_table;
In PostgreSQL:
SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'my_table';
2. Data Integrity Validation. Ensure all records have been migrated correctly without corruption:
- Row Count Comparison: run the query
SELECT COUNT(*) FROM my_table;
against the source and target tables.
- Checksum Validation:
In MySQL:
SELECT MD5(GROUP_CONCAT(column1, column2 ORDER BY id)) FROM my_table;
In PostgreSQL:
SELECT MD5(string_agg(column1 || column2, ',' ORDER BY id)) FROM my_table;
3. Queries/Views Validation. The only way to verify views after a MySQL to PostgreSQL migration is to compare the SELECT statement of each view in the source and target databases, taking into account the differences between the SQL dialects of the two DBMS. The task requires deep knowledge of database programming; some of the rules are explored in this article.
MySQL exposes the list of all views in a database using the query:
SHOW FULL TABLES IN database_name WHERE TABLE_TYPE LIKE 'VIEW';
PostgreSQL can do the same via the query:
SELECT table_name FROM information_schema.views
WHERE table_schema NOT IN ('pg_catalog', 'information_schema');
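To pull an individual view's definition for side-by-side comparison, both systems can print the underlying SELECT statement (my_view is a placeholder name):
-- MySQL
SHOW CREATE VIEW my_view;
-- PostgreSQL (second argument enables pretty-printing)
SELECT pg_get_viewdef('my_view'::regclass, true);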
Post-Migration Optimizations
The next crucial step is optimizing PostgreSQL for performance. PostgreSQL and MySQL have different optimization techniques, so tuning PostgreSQL properly ensures that your system runs efficiently.
- Created a GIN index for JSON search, which improved related query speeds by 4x compared to MySQL:
CREATE INDEX idx_orders_json ON orders USING GIN(rawdata);
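For reference, this is the kind of lookup the GIN index accelerates, using the JSONB containment operator (the status key is an assumed example of the payload's shape):
-- The @> containment operator can use the GIN index on rawdata.
SELECT id FROM orders WHERE rawdata @> '{"status": "shipped"}';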
- Enabled Parallel Query Execution in PostgreSQL:
SET max_parallel_workers_per_gather = 4;
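SET only affects the current session; to persist the setting across restarts, the standard route is ALTER SYSTEM:
ALTER SYSTEM SET max_parallel_workers_per_gather = 4;
SELECT pg_reload_conf(); -- apply without a server restart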
- Tweaked Memory and Cache Settings:
shared_buffers = 4GB
work_mem = 64MB
effective_cache_size = 12GB
- Partitioned Large Tables for Better Query Performance:
CREATE TABLE orders_2023 PARTITION OF orders FOR VALUES FROM ('2023-01-01') TO ('2024-01-01'); -- the upper bound is exclusive
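Note that PARTITION OF only works if the parent was declared as a partitioned table. A minimal sketch of the assumed parent definition (on a partitioned table, the primary key must include the partition key):
CREATE TABLE orders (
    id BIGSERIAL,
    user_id INTEGER NOT NULL,
    total_price NUMERIC(10,2) NOT NULL,
    created TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id, created)  -- must contain the partition key "created"
) PARTITION BY RANGE (created);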
- Optimized VACUUM and ANALYZE.
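For example, autovacuum can be made more aggressive on the hottest tables; the threshold below is illustrative, not the study's actual value:
-- Vacuum after ~5% of rows change instead of the 20% default.
ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.05);
-- Refresh planner statistics immediately after the bulk load.
ANALYZE orders;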
Post-Migration Performance Benchmarking
1. SELECT Query Performance
| Query Type | MySQL (ms) | PostgreSQL (ms) | Performance Gain (%) |
| --- | --- | --- | --- |
| Single SELECT | 120 | 105 | +12.5% |
| JOIN Query (10M rows) | 450 | 280 | +37.8% |
| Aggregate Query (SUM) | 320 | 210 | +34.4% |
| JSON Query (JSON_EXTRACT) | 600 | 150 | +75% |
2. INSERT Query Performance
| Query Type | MySQL (ms) | PostgreSQL (ms) | Performance Gain (%) |
| --- | --- | --- | --- |
| Single-row INSERT | 10 | 12 | -20% (slightly slower) |
| Bulk INSERT (10K rows) | 850 | 600 | +29.4% |
Summary
- PostgreSQL significantly outperformed MySQL in JOINs, aggregations, and JSON queries.
- Write performance for bulk inserts improved, but single-row inserts were slightly slower due to Multi-Version Concurrency Control (MVCC).