Pg Chameleon

MySQL to PostgreSQL replica system

v2.0.19

1 year ago

This maintenance release adds the following bug fixes and improvements.

Merges pull request #144: mysql-replication support for PyMySQL>0.10.0 was introduced in v0.22.

Adds support for fillfactor when running init_replica. It is now possible to specify the fillfactor for the tables when running init_replica, which is useful to mitigate bloat in advance when replicating or migrating from MySQL.
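As a rough illustration of what a table-level fillfactor looks like on the PostgreSQL side, the sketch below builds an ALTER TABLE ... SET (fillfactor = N) statement. The helper name and the schema and table names are hypothetical, and this is not pg_chameleon's own code; it only relies on the fact that PostgreSQL accepts table fillfactor values from 10 to 100.

```python
def fillfactor_statement(schema: str, table: str, fillfactor: int) -> str:
    """Build an ALTER TABLE statement that sets the table fillfactor."""
    # PostgreSQL accepts table fillfactor values between 10 and 100.
    if not 10 <= fillfactor <= 100:
        raise ValueError("fillfactor must be between 10 and 100")
    # Double any embedded quote so the identifiers are safely quoted.
    quoted = ".".join('"%s"' % part.replace('"', '""') for part in (schema, table))
    return "ALTER TABLE %s SET (fillfactor = %d);" % (quoted, fillfactor)


# Hypothetical schema and table names, for illustration only.
print(fillfactor_statement("my_schema", "my_table", 80))
```

A value below 100 leaves free space in each page, so that updated row versions can stay on the same page.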

Improves logging on discarded rows: the discarded row image is now displayed in the log.

Adds DISTINCT to the group concat used when collecting foreign key metadata, to avoid duplicate fields in the foreign key definition.
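The effect of DISTINCT inside a group concat can be seen with a small stand-in example. The sketch below uses SQLite's group_concat (from the Python standard library) in place of MySQL's GROUP_CONCAT, and the table and column names are invented for illustration; it is not the query pg_chameleon runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fk_columns (constraint_name TEXT, column_name TEXT)")
# The same column is reported twice for the same constraint.
conn.executemany(
    "INSERT INTO fk_columns VALUES (?, ?)",
    [("fk_order", "customer_id"), ("fk_order", "customer_id"), ("fk_order", "shop_id")],
)


def collect(distinct: bool) -> list:
    """Return the sorted column list for the constraint, with or without DISTINCT."""
    keyword = "DISTINCT " if distinct else ""
    sql = (
        "SELECT group_concat(%scolumn_name) FROM fk_columns "
        "GROUP BY constraint_name" % keyword
    )
    (joined,) = conn.execute(sql).fetchone()
    return sorted(joined.split(","))


print(collect(distinct=False))  # customer_id appears twice
print(collect(distinct=True))   # each column appears once
```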

Uses mysql-replication>=0.31, which fixes the crash when replicating from MariaDB that was introduced in mysql-replication 0.27.

Changelog from v2.0.18

  • Merge pull request #144, mysql-replication support for PyMySQL>0.10.0 was introduced in v0.22
  • add support for fillfactor when running init_replica
  • improve logging on discarded rows
  • add distinct on group concat when collecting foreign keys
  • use mysql-replication>=0.31, fix for crash when replicating from MariaDB

v2.0.18

2 years ago

This maintenance release adds the following bug fixes and improvements.

Adds a new method copy_schema to copy only the schema without the data (EXPERIMENTAL).

Adds the support for the ON DELETE and ON UPDATE clause when creating the foreign keys in PostgreSQL with detach_replica and copy_schema.

When running init_replica or copy_schema the names of the indices and foreign keys are preserved. Only when a name is duplicated does pg_chameleon change it, to ensure that the names on PostgreSQL are unique within the same schema.
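One possible way to preserve names and touch only the duplicates is sketched below. The suffix scheme and the function name are assumptions made for illustration; the release notes do not describe how pg_chameleon actually renames duplicates.

```python
def unique_names(names):
    """Keep each name as is unless it repeats; repeated names get a numeric suffix."""
    seen = set()
    result = []
    for name in names:
        candidate = name
        counter = 0
        # Only a duplicate triggers renaming; the first occurrence is preserved.
        while candidate in seen:
            counter += 1
            candidate = "%s_%d" % (name, counter)
        seen.add(candidate)
        result.append(candidate)
    return result


# Index names collected for one schema (hypothetical values).
print(unique_names(["idx_name", "idx_date", "idx_name"]))
```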

Adds a workaround for a regression introduced in mysql-replication by forcing the version to be lower than 0.27.

Changes the data type for the identifiers stored in the replica schema to varchar(64).

This release requires a replica catalogue upgrade, therefore it is very important to follow the upgrade instructions provided below.

  • If working via ssh, it is suggested to use screen or tmux for the upgrade
  • Stop all the replica processes with chameleon stop_all_replicas --config <your_config>
  • Take a backup of the schema sch_chameleon with pg_dump for good measure.
  • Install the upgrade with pip install pg_chameleon --upgrade
  • Check if the version is upgraded with chameleon --version
  • Upgrade the replica schema with the command chameleon upgrade_replica_schema --config <your_config>
  • Start all the replicas.
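The steps above can be sketched as an ordered list of commands. The helper below only builds the list and runs nothing. The database name, the backup file name, the source name and the use of start_replica with --source for the final step are assumptions for illustration, so adapt them to your environment.

```python
def upgrade_plan(config: str, source: str, dbname: str, backup_file: str):
    """Return the upgrade commands in the order listed above, without running them."""
    return [
        ["chameleon", "stop_all_replicas", "--config", config],
        # dbname and backup_file are placeholders for your own environment.
        ["pg_dump", "--schema=sch_chameleon", "--file=" + backup_file, dbname],
        ["pip", "install", "pg_chameleon", "--upgrade"],
        ["chameleon", "--version"],
        ["chameleon", "upgrade_replica_schema", "--config", config],
        # Assumed restart command, to be repeated for each configured source.
        ["chameleon", "start_replica", "--config", config, "--source", source],
    ]


# Hypothetical values, for illustration only.
for command in upgrade_plan("default", "mysql", "db_replica", "sch_chameleon.sql"):
    print(" ".join(command))
```

Each entry can be passed to subprocess.run(command, check=True) so that the sequence stops at the first failing step.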

Changelog from v2.0.17

  • Support the ON DELETE and ON UPDATE clause when creating the foreign keys in PostgreSQL
  • change logic for index and foreign key names by managing only duplicates within same schema
  • use mysql-replication<0.27 as new versions crash when receiving queries
  • add copy_schema method for copying only the schema without data (EXPERIMENTAL)
  • change type for identifiers in replica schema to varchar(64)

v2.0.17

2 years ago

This maintenance release adds the following bug fixes.

Fix the wrong order in copy data/create indices when keep_existing_schema is No.

Previously the indices were created before the data was loaded into the target schema, which caused severe performance degradation.

This fix applies only if the parameter keep_existing_schema is set to No.
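The corrected ordering (load first, index afterwards) can be illustrated with a small stand-in. The sketch below uses SQLite from the Python standard library instead of PostgreSQL, with invented table and index names; it shows only the order of operations, not pg_chameleon's implementation.

```python
import sqlite3


def load_then_index(rows):
    """Create the table, bulk load the rows, and only then build the index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t_copy (id INTEGER, payload TEXT)")
    # Step 1: load the data while the table has no secondary index to maintain.
    conn.executemany("INSERT INTO t_copy VALUES (?, ?)", rows)
    # Step 2: build the index once, over the fully loaded table.
    conn.execute("CREATE INDEX idx_t_copy_id ON t_copy (id)")
    conn.commit()
    return conn


conn = load_then_index([(i, "row %d" % i) for i in range(1000)])
print(conn.execute("SELECT count(*) FROM t_copy").fetchone()[0])
```

Building the index after the load avoids updating it for every inserted row.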

Adds the collection of unique constraints when keep_existing_schema is Yes.

Previously the unique constraints were not collected or dropped if they were defined as constraints instead of indices.

This fix applies only if the parameter keep_existing_schema is set to Yes.

This release adds the following changes:

  • Remove argparse from the requirements as now it's part of the python3 core dist
  • Remove check for log_bin when we replicate from Aurora MySQL
  • Manage the different behaviour in pyyaml to allow pg_chameleon to be installed as rpm in centos 7 via pgdg repository

This release works with Aurora MySQL. However, Aurora MySQL 5.6 segfaults when FLUSH TABLES WITH READ LOCK is issued.

The replica is tested on Aurora MySQL 5.7.

This release requires a replica catalogue upgrade, therefore it is very important to follow the upgrade instructions provided below.

  • If working via ssh, it is suggested to use screen or tmux for the upgrade
  • Stop all the replica processes with chameleon stop_all_replicas --config <your_config>
  • Take a backup of the schema sch_chameleon with pg_dump for good measure.
  • Install the upgrade with pip install pg_chameleon --upgrade
  • Check if the version is upgraded with chameleon --version
  • Upgrade the replica schema with the command chameleon upgrade_replica_schema --config <your_config>
  • Start all the replicas.

Changelog from v2.0.16

  • Remove argparse from the requirements
  • Add the collect for unique constraints when keep_existing_schema is Yes
  • Fix wrong order in copy data/create indices when keep_existing_schema is No
  • Remove check for log_bin when we are replicating from Aurora MySQL
  • Manage the different behaviour in pyyaml to allow pg_chameleon to be installed as rpm in centos 7

v2.0.16

3 years ago

This maintenance release fixes a crash in init_replica caused by an early disconnection during the fallback on insert. This caused the end of transaction to crash, aborting the init_replica entirely.

Changelog from v2.0.15

  • Fix for issue #126 init_replica failure with tables on transactional engine and invalid data

v2.0.15

3 years ago

This maintenance release adds support for a reduced lock when the MySQL engine is transactional, thanks to @rascalDan.

The init_replica process checks whether the engine for the table is transactional and runs the initial copy within a transaction. The process still requires a FLUSH TABLES WITH READ LOCK, but the lock is released as soon as the transaction snapshot is acquired. This improvement allows pg_chameleon to run against primary databases with minimal impact during the init_replica process.
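The locking sequence described above can be sketched as follows. The statements are standard MySQL, but the function, the recording cursor and the exact order are an illustration under the assumption of a transactional engine, not pg_chameleon's actual code.

```python
class RecordingCursor:
    """Stand-in for a MySQL cursor; it only records the statements it receives."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)


def copy_with_reduced_lock(cursor, copy_tables):
    """Hold the global read lock only until the consistent snapshot exists."""
    cursor.execute("FLUSH TABLES WITH READ LOCK")
    try:
        cursor.execute("SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ")
        cursor.execute("START TRANSACTION WITH CONSISTENT SNAPSHOT")
        # A real tool would typically also record the binlog coordinates here.
    finally:
        # The lock is released as soon as the snapshot is acquired.
        cursor.execute("UNLOCK TABLES")
    # The copy reads from the snapshot while writers continue on the primary.
    copy_tables(cursor)
    cursor.execute("COMMIT")


cursor = RecordingCursor()
copy_with_reduced_lock(cursor, lambda cur: cur.execute("SELECT * FROM my_table"))
print("\n".join(cursor.statements))
```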

The python-mysql-replication requirement is now changed to version >=0.22, which adds support for PyMySQL >=0.10.0. The requirement pinning PyMySQL to version <0.10.0 is therefore removed from setup.py.

Changelog from v2.0.14

  • Support for reduced lock if MySQL engine is transactional, thanks to @rascalDan
  • setup.py now requires python-mysql-replication to version 0.22 which adds support for PyMySQL >=0.10.0
  • removed PyMySQL requirement <0.10.0 from setup.py
  • prevent pg_chameleon from running as root
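A guard against running as root is commonly written as a check on the effective user id. The sketch below is one such check on Unix-like systems. The function names and the error message are hypothetical, and the release notes do not say how pg_chameleon performs the check.

```python
import os


def running_as_root(euid=None) -> bool:
    """Return True when the effective user id is 0 (root)."""
    if euid is None:
        # os.geteuid is available on Unix-like systems only.
        euid = os.geteuid()
    return euid == 0


def ensure_not_root(euid=None):
    """Exit early instead of letting the program run with root privileges."""
    if running_as_root(euid):
        raise SystemExit("refusing to run as root")


# An explicit, non-root uid is passed here so the example behaves the same everywhere.
ensure_not_root(euid=1000)
```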

v2.0.14

3 years ago

This maintenance release improves the support for spatial datatypes. When postgis is installed on the target database, the spatial data types point, geometry, linestring, polygon, multipoint, multilinestring and geometrycollection are converted to geometry and the data is replicated using the Well-Known Binary (WKB) format. As the MySQL implementation of WKB is not standard, pg_chameleon removes the first 4 bytes from the decoded binary data before sending it to PostgreSQL.
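To make the 4-byte removal concrete: in MySQL's internal geometry storage format the value starts with a 4-byte SRID, followed by the standard WKB payload. The sketch below builds such a value for a point, strips the prefix and decodes the remaining WKB. The function names are hypothetical and the code is an illustration, not pg_chameleon's own.

```python
import struct


def strip_mysql_srid(raw: bytes) -> bytes:
    """Drop the leading 4 bytes so that only the WKB payload remains."""
    return raw[4:]


def parse_wkb_point(wkb: bytes):
    """Decode a WKB point: byte-order flag, geometry type, then x and y."""
    order = "<" if wkb[0] == 1 else ">"
    (geom_type,) = struct.unpack(order + "I", wkb[1:5])
    if geom_type != 1:
        raise ValueError("not a WKB point")
    return struct.unpack(order + "dd", wkb[5:21])


# Build a value shaped like MySQL's storage format: 4-byte SRID, then WKB.
mysql_value = struct.pack("<I", 0) + struct.pack("<BIdd", 1, 1, 1.5, 2.5)
print(parse_wkb_point(strip_mysql_srid(mysql_value)))  # (1.5, 2.5)
```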

When keep_existing_schema is set to yes, pg_chameleon now drops and recreates indices and primary keys during the init_replica process. The foreign keys are dropped as well and recreated when the replica reaches the consistent status. This way the init_replica may complete successfully even when there are foreign keys in place, and at the same speed as the usual init_replica.

The setup.py now forces PyMySQL to version <0.10.0 because it breaks the python-mysql-replication library (issue #117).

Thanks to @porshkevich, who fixed issue #115 by trimming the space from the PK index name.

This release requires a replica catalogue upgrade, therefore it is very important to follow the upgrade instructions provided below.

  • If working via ssh, it is suggested to use screen or tmux for the upgrade
  • Stop all the replica processes with chameleon stop_all_replicas --config <your_config>
  • Take a backup of the schema sch_chameleon with pg_dump for good measure.
  • Install the upgrade with pip install pg_chameleon --upgrade
  • Check if the version is upgraded with chameleon --version
  • Upgrade the replica schema with the command chameleon upgrade_replica_schema --config <your_config>
  • Start all the replicas.

If the upgrade procedure can't upgrade the replica catalogue because of running or errored replicas, it is possible to reset the statuses by using the command chameleon enable_replica --source <source_name>.

If the catalogue upgrade is still not possible, you can downgrade pg_chameleon to the previous version. Please note that you may need to manually install PyMySQL to work around the issue with version 0.10.0.

pip install pg_chameleon==2.0.13

pip install "PyMySQL<0.10.0"

Changelog from v2.0.13

  • Add support for spatial data types (requires postgis installed on the target database)
  • When keep_existing_schema is set to yes, pg_chameleon now drops and recreates indices and constraints during the init_replica process
  • Fix for issue #115 thanks to @porshkevich
  • setup.py now forces PyMySQL to version <0.10.0 because it breaks the python-mysql-replication library (issue #117)

v2.0.13

3 years ago

This maintenance release adds EXPERIMENTAL support for the Point datatype, thanks to the contribution by @jovankricka-everon.

The support is currently limited to the POINT datatype only, with hardcoded logic to keep the init_replica and the replica working. However, as this feature is related to PostGIS, the next point release will rewrite this part of the code using a more general approach.

The release adds the keep_existing_schema parameter in the MySQL source type. When set to Yes, init_replica, refresh_schema and sync_tables do not recreate the affected tables using the data from the MySQL source. Instead the existing tables are truncated and the data is reloaded.

A REINDEX TABLE is executed in order to have the indices in good shape after the reload. The next point release will very likely improve the approach on the reload and reindexing.
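The truncate, reload and reindex sequence can be sketched as the PostgreSQL statements that bracket the reload. The helper name and the schema and table names are hypothetical, and the sketch does not reproduce pg_chameleon's implementation.

```python
def reload_statements(schema: str, table: str):
    """Return the statements that bracket a reload of an existing table."""
    qualified = '"%s"."%s"' % (schema, table)
    return [
        # The table definition is kept; only its rows are removed.
        "TRUNCATE TABLE %s;" % qualified,
        # The data reload from MySQL happens between these two statements.
        "REINDEX TABLE %s;" % qualified,
    ]


# Hypothetical schema and table names, for illustration only.
for statement in reload_statements("my_schema", "my_table"):
    print(statement)
```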

When keep_existing_schema is set to Yes, the parameter grant_select_to has no effect.

From this release the codebase switched from tabs to spaces, following the guidelines in PEP-8.

Changelog from v2.0.12

  • EXPERIMENTAL support for Point datatype - @jovankricka-everon
  • Add keep_existing_schema in MySQL source type to keep the existing schema in place instead of rebuilding it from the mysql source
  • Change tabs to spaces in code

v2.0.12

4 years ago

This maintenance release fixes issue #96, where the replica initialisation failed on MySQL 8 because of the wrong field names pulled out from the information_schema. Thanks to @daniel-qcode for contributing the fix.

The configuration and SQL files are now moved into the directory pg_chameleon. This change simplifies the setup.py file and allows pg_chameleon to be built as source and wheel package.

As Python 3.4 has now reached its end-of-life and has been retired, the minimum requirement for pg_chameleon has been updated to Python 3.5.

Changelog from v2.0.11

  • Fixes for issue #96 thanks to @daniel-qcode
  • Change for configuration and SQL files location
  • Package can now be built as source and wheel
  • The minimum Python requirement is now 3.5

v2.0.11

4 years ago

This maintenance release fixes a few things. As reported in #95, the yaml files were not completely valid. @rebtoor fixed them.

@clifff made a pull request to have start_replica run in the foreground when log_file is set to stdout. Previously the process remained in the background with the log set to stdout.

As Travis seems to break down constantly, the CI configuration is disabled until a fix or a different CI is found.

Finally the method which loads the yaml file is now using an explicit loader as required by the new PyYAML version.

Previously, with newer versions of PyYAML, the library emitted a warning because the default loader is unsafe.

Changelog from v2.0.10

  • Fix wrong formatting for yaml example files. @rebtoor
  • Make start_replica run in foreground when log_file == stdout. @clifff
  • Travis seems to break down constantly. Disable the CI until a fix is found. Evaluate using a different CI.
  • Add the loader to yaml.load as required by the new PyYAML version.

v2.0.10

5 years ago

This maintenance release fixes a regression caused by the new replay function with PostgreSQL 10. The unnested primary key was put in a cartesian product with the json elements, generating NULL identifiers which made the subsequent format function fail.

This release adds a workaround for decoding the keys in MySQL's json fields. This allows the system to replicate the json data type as well.
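A workaround of this kind can be sketched as a recursive conversion of the decoded json structure to plain strings. The sketch assumes that keys and values arrive as bytes from the replication library; that assumption, the function name and the sample data are illustrative and not taken from pg_chameleon.

```python
import json


def decode_json_value(value):
    """Recursively turn bytes keys and values into str so json.dumps accepts them."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    if isinstance(value, dict):
        return {decode_json_value(k): decode_json_value(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [decode_json_value(item) for item in value]
    return value


# Hypothetical json field content as it might arrive with bytes keys and values.
row_image = {b"name": b"chameleon", b"tags": [b"mysql", b"postgresql"], b"version": 2}
print(json.dumps(decode_json_value(row_image), sort_keys=True))
```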

The command enable_replica fixes a race condition when the maintenance flag is not returned to false (e.g. an application crash during the maintenance run) allowing the replica to start again.

The tokeniser for the CHANGE statement now parses the tables in the form of schema.table. However, the tokenised schema is not used to determine the query's schema because the __read_replica_stream method uses the schema name pulled out from MySQL's binlog.
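Capturing an optional schema qualifier can be sketched with a regular expression. The pattern below is a simplified illustration for ALTER TABLE ... CHANGE statements and is not the expression used by pg_chameleon's tokeniser; the function name is hypothetical.

```python
import re

# Optional `schema`. prefix, then the table name, then the CHANGE keyword.
CHANGE_TABLE = re.compile(
    r"ALTER\s+TABLE\s+(?:`?(?P<schema>\w+)`?\.)?`?(?P<table>\w+)`?\s+CHANGE\b",
    re.IGNORECASE,
)


def parse_change(statement: str):
    """Return (schema, table); schema is None when the name is unqualified."""
    match = CHANGE_TABLE.search(statement)
    if match is None:
        return None
    return match.group("schema"), match.group("table")


print(parse_change("ALTER TABLE my_schema.my_table CHANGE old_col new_col INT"))
print(parse_change("ALTER TABLE `my_table` CHANGE old_col new_col INT"))
```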

As this change requires a replica catalogue upgrade, it is very important to follow the upgrade instructions provided below.

  • If working via ssh, it is suggested to use screen or tmux for the upgrade
  • Stop all the replica processes with chameleon stop_all_replicas --config <your_config>
  • Take a backup of the schema sch_chameleon with pg_dump for good measure.
  • Install the upgrade with pip install pg_chameleon --upgrade
  • Check if the version is upgraded with chameleon --version
  • Upgrade the replica schema with the command chameleon upgrade_replica_schema --config <your_config>
  • Start all the replicas.

If the upgrade procedure refuses to upgrade the catalogue because of running or errored replicas, it is possible to reset the statuses using the command chameleon enable_replica --source <source_name>.

If the catalogue upgrade is still not possible, downgrading pg_chameleon to the previous version, e.g. with pip install pg_chameleon==2.0.9, will make the replica startable again.

Changelog from v2.0.9

  • Fix regression in new replay function with PostgreSQL 10
  • Convert to string the dictionary entries pulled from a json field
  • Let enable_replica disable any leftover maintenance flag
  • Add capture in CHANGE for tables in the form schema.table