Galaxycdc Versions

polardbx-cdc is a core component of PolarDB-X. It is responsible for generating, publishing, and subscribing to global binary logs.

polardbx-cdc-5.4.19

1 month ago

New Features & Optimizations

  1. Master-slave DDL replication now supports function DDL.
  2. Master-slave DDL replication now supports sequence DDL.
  3. Master-slave DDL replication now supports view DDL.
  4. Master-slave DDL replication now supports procedure DDL.
  5. Master-slave DDL replication now supports DDL for accounts, roles, and privileges.
  6. Master-slave DDL replication now supports alter tablegroup statements.
  7. Master-slave DDL replication now supports alter index statements.
  8. Master-slave DDL replication now supports consistent, coordinated alignment across streams in multi-stream binlog scenarios.
  9. Master-slave replication now supports creating a replication link from a timestamp.
  10. Master-slave replication now supports verifying data consistency between upstream and downstream through SQL commands.
  11. Master-slave replication adds a slave instance role, which gives slave clusters instance-level read-only capability.
  12. Master-slave replication performance is improved: single-link rps increased from 30k/s to 50k/s.
  13. Master-slave replication improves server_id-based bidirectional replication, adding filter rules that can be specified on the command line and a self-check for server_id anomalies.
  14. DML writes in master-slave replication now support matching on the full row image in the where condition.
  15. Multi-stream binlog supports binding a binlog stream name to a user name; when the bound account executes binlog-related SQL, the with option is no longer required.

Bug Fixes

  1. Fixed an issue where multi alter add/drop column statements produced an incorrect column order in the metadata, which interrupted the binlog link.
  2. Fixed an issue where multi alter add/drop/rename index statements caused indexes to be lost from the metadata, so that incorrect DDL SQL was recorded in the binlog.
  3. Fixed a configuration defect in the meta_build_physical_ddl_sql_blacklist_regex parameter that caused alter DDL SQL containing the user keyword to be filtered out.
  4. Fixed an issue where drop index DDL SQL could not be output to the binlog.
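
The server_id-based filtering used for bidirectional replication in this release can be illustrated with a minimal Python sketch. It is not the polardbx-cdc implementation: `ChangeEvent`, `check_server_ids`, and `should_apply` are hypothetical names, and the sketch only assumes that each replicated event carries the server_id of the instance where the change was first written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical stand-in for a replicated binlog event."""
    server_id: int  # instance on which the change was originally written
    sql: str


def check_server_ids(local_server_id: int, upstream_server_id: int) -> None:
    """Self-check: two peers sharing a server_id cannot tell their writes apart."""
    if local_server_id == upstream_server_id:
        raise ValueError(
            f"server_id conflict: both instances use {local_server_id}"
        )


def should_apply(event: ChangeEvent, local_server_id: int,
                 ignored_server_ids: frozenset = frozenset()) -> bool:
    """Skip events that originated locally or match an explicit filter rule."""
    if event.server_id == local_server_id:
        return False  # our own write coming back: applying it would loop
    return event.server_id not in ignored_server_ids


events = [
    ChangeEvent(server_id=1, sql="INSERT INTO t VALUES (1)"),
    ChangeEvent(server_id=2, sql="INSERT INTO t VALUES (2)"),
    ChangeEvent(server_id=3, sql="INSERT INTO t VALUES (3)"),
]
# Instance 2 applies only events from instance 1: its own writes are dropped
# to break the loop, and instance 3 is excluded by a filter rule.
applied = [
    e.sql for e in events
    if should_apply(e, local_server_id=2, ignored_server_ids=frozenset({3}))
]
```

Running the self-check before a link is created means a misconfigured pair of instances fails fast instead of silently discarding each other's changes.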

galaxycdc-5.4.15-16792770

1 year ago

RELEASE NOTE

Important New Feature

  • Added multi-stream binlog capability: multiple logical log streams are generated in real time, providing stronger distributed scalability.
  • Added one-click MySQL import: users can use SQL to quickly import the existing schema and data of a primary MySQL instance and establish a synchronization link that keeps primary and secondary fully consistent.

Feature enhancement

  • Added multi-level merging, so that performance does not degrade noticeably as the number of DN nodes grows linearly.
  • Added scheduled building of full metadata snapshots, to solve the excessive growth of historical DDL marking records.
  • Added recovery based on Recover TSO, to solve cluster recovery when no binlog files exist in either local or remote storage.
  • Added transparent consumption: downstream consumers can directly consume binlog files archived to OSS through the dump protocol.
  • Optimized the transaction persistence mechanism: transactions are not forced to disk when enough memory is available.
  • Optimized the way binlog files are downloaded from OSS: multi-threaded parallel download is supported, which improves instance recovery speed.

Bugfix

  • Fixed the incorrect delay time calculated when the Dumper process starts for the first time.
  • Fixed possible dump timeouts for downstream subscriptions after a CDC cluster restart.
  • Fixed compatibility and correctness issues triggered during binlog event data reformatting, further improving the stability of the CDC link during DDL changes.
  • Fixed CDC actively disconnecting when a consumer program did not send COM_REGISTER_SLAVE before starting the binlog dump.
  • Fixed CDC link interruption caused by consecutively adding and dropping a primary key.
  • Added data type validation to metadata, to solve incorrect binlog data caused by reformatting being triggered when physical table metadata is inconsistent.
  • Optimized the Dumper master election strategy, so that a dumper whose progress lags behind is not elected master when primary-secondary synchronization is delayed.
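
To illustrate the idea behind multi-level merging, here is a minimal Python sketch. It is an assumption-laden illustration rather than the actual merger: it assumes each DN produces a stream already sorted by TSO, models events as hypothetical `(tso, payload)` tuples, and uses a hypothetical `fan_in` limit on how many inputs one merger handles.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Event = Tuple[int, str]  # (tso, payload); tso is assumed to order events globally


def merge_group(streams: Iterable[Iterable[Event]]) -> Iterator[Event]:
    """Merge already-sorted streams into one stream sorted by tso."""
    return heapq.merge(*streams, key=lambda e: e[0])


def multi_level_merge(streams: List[List[Event]], fan_in: int) -> List[Event]:
    """Merge in levels so that no single merger handles more than fan_in inputs."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level: List[Iterable[Event]] = list(streams)
    # Each pass replaces groups of up to fan_in streams with one merged stream.
    while len(level) > fan_in:
        level = [merge_group(level[i:i + fan_in])
                 for i in range(0, len(level), fan_in)]
    return list(merge_group(level))


dn_streams = [
    [(1, "a"), (5, "e")],
    [(2, "b"), (6, "f")],
    [(3, "c")],
    [(4, "d"), (7, "g")],
]
merged = multi_level_merge(dn_streams, fan_in=2)
```

With a bounded fan-in, adding DN nodes adds merge levels instead of widening a single merger, which is one plausible reason a layered design scales better than a flat one.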

galaxycdc-5.4.15

1 year ago

galaxycdc-5.4.15-alpha

1 year ago

RELEASE NOTE

Important New Feature

  • Added binlog backup and restore capability: global binlog files can be uploaded to OSS in real time and restored from OSS to local disk for recovery
  • Greatly improved synchronization performance: the maximum EPS capacity is increased from 1 million/s to 2 million/s, and the maximum write throughput is increased from 250M/s to 500M/s

Feature enhancement

  • Added built-in automatic cleaning capability: binlog files on the local disk can be cleared automatically on a schedule when their total size exceeds a specified threshold
  • Reduced memory usage by 40%: achieved through metadata persistence, buffer size optimization, memory/disk swap policy optimization, and serialization/deserialization optimization
  • Stronger data consistency guarantee: during DDL changes, the data reformat module now supports reformatting the data type and precision of binlog events
  • Added historical record clearing capability: supports periodic clearing of DDL marking information and cluster scheduling history information
  • Optimized log clearing policy: optimized the clearing policy for memory dump files (*.hprof), lowering the threshold for the number of retained files to avoid the risk of filling the disk
  • Optimized the health checking policy for CDC processes: changed the health checking method from the 'jps' command to 'ps -ef' to solve the host CPU jitter caused by jps in high-load scenarios
  • Optimized CDC recovery policy: fixed an issue where, in extreme cases, data loss could occur during link recovery when a 2PC transaction spans multiple DN binlog files
  • Added hot reload capability for parameters: automatically scans the binlog_system_config configuration table and the config.properties file, and automatically resets in-memory parameter values
  • Optimized performance of binlog data filtering: optimized the TxnBuffer data structure to solve the performance bottleneck caused by high-frequency remove operations on the refList
  • Added blacklist configuration for databases and tables: the global binlog supports filtering out data for the databases or tables configured in the blacklist
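
The automatic cleaning rule described in this release (clear local binlog files once their total size exceeds a threshold) can be sketched as follows. This is a minimal illustration under stated assumptions, not the shipped cleaner: `files_to_purge` and `max_total_bytes` are hypothetical names, file names are assumed to sort in creation order, and the newest file is always kept on the assumption that it is still being written.

```python
from typing import List, Tuple

BinlogFile = Tuple[str, int]  # (file name, size in bytes)


def files_to_purge(files: List[BinlogFile], max_total_bytes: int) -> List[str]:
    """Pick the oldest files to delete until the total size fits the threshold.

    Assumes binlog file names sort in creation order (binlog.000001, ...).
    The newest file is never selected, on the assumption that it is in use.
    """
    ordered = sorted(files)
    total = sum(size for _, size in ordered)
    purge = []
    for name, size in ordered[:-1]:
        if total <= max_total_bytes:
            break
        purge.append(name)
        total -= size
    return purge


local_files = [
    ("binlog.000003", 400),
    ("binlog.000001", 500),
    ("binlog.000002", 500),
]
# Total is 1400 bytes; removing the oldest file brings it to 900, under the limit.
to_delete = files_to_purge(local_files, max_total_bytes=1000)
```

Separating the selection step from the actual deletion keeps the policy easy to test; a periodic task would call it and then remove the returned files.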

galaxycdc-5.4.13

2 years ago

RELEASE NOTE

Important New Feature

  • Introduced a new component named Replica, which lets PolarDB-X act as a MySQL slave and consume data by using SQL such as 'change master …'
  • Support recording Rows_query_event in the global binlog; as a precondition, binlog_rows_query_log_events must be set to ON on the DN nodes

Feature enhancement

  • Optimized the TableMapId generation algorithm to guarantee a consistent ID for the same logical table
  • Optimized the rollback algorithm for CDC metadata to resolve possible metadata inconsistency in extreme cases, and introduced a metadata consistency checking module
  • Support a configuration parameter that controls whether disordered TraceIds are skipped
  • Support a configuration parameter that controls whether the PolarDB-X implicit primary key is recorded in the global binlog
  • Support the utf8mb3 character encoding
  • Support special processing for rotate events that appear in the middle of a DN binlog file

Bugfix

  • Fixed TableMapId truncation caused by a data type with insufficient precision
  • Removed two status variables from the Query Event (MODE_NO_ZERO_IN_DATE and MODE_NO_ZERO_DATE) to resolve errors in the downstream MySQL slave when a timestamp column value is empty
  • Fixed the heartbeat window integrity check in LogEventMerger becoming ineffective in Scale Out/In scenarios
  • Fixed a type definition defect in the FastSql Repository, where insufficiently fine-grained types caused the metadata of tables and indexes with the same name to overwrite each other
  • Fixed the incorrect binlog file size recorded in the binlog_oss_record table
  • Fixed several rule matching failures caused by missing handling of backquotes
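
Rule matching failures of the kind fixed here typically arise when a quoted identifier such as `` `db1` `` is compared directly against an unquoted rule entry. The following Python sketch shows one way to normalize identifiers before matching. It is illustrative only: `unquote_identifier`, `is_filtered`, and the blacklist format are hypothetical, and the comparison is deliberately case-sensitive.

```python
def unquote_identifier(identifier: str) -> str:
    """Strip MySQL backquotes from an identifier and unescape doubled backquotes."""
    if len(identifier) >= 2 and identifier[0] == "`" and identifier[-1] == "`":
        return identifier[1:-1].replace("``", "`")
    return identifier


def is_filtered(schema: str, table: str, blacklist: set) -> bool:
    """Match against 'schema.table' rule entries using normalized names."""
    key = f"{unquote_identifier(schema)}.{unquote_identifier(table)}"
    return key in blacklist


blacklist = {"db1.orders"}
quoted_match = is_filtered("`db1`", "`orders`", blacklist)
plain_match = is_filtered("db1", "orders", blacklist)
other_table = is_filtered("`db1`", "`users`", blacklist)
```

Normalizing once at the boundary means every rule (blacklist, filter, or routing) sees the same form of the name regardless of how the SQL statement quoted it.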

galaxycdc-5.4.12

2 years ago

RELEASE NOTE

Feature enhancement

  • Improved DDL compatibility with MySQL: the handling of PolarDB-X private DDL SQL is optimized so that private syntax is removed, improving compatibility with the MySQL ecosystem
  • Support dynamically changing the frequency of heartbeat events recorded in the global binlog
  • Support suspending cluster scheduling
  • Support recording server_id in trace_id to break circular synchronization in bidirectional synchronization between instances
  • Adapted to the PolarDB-X instant add column feature, ensuring correct binlog content when the column order differs between the logical table and the physical table
  • Adapted to the way newer DN versions record TSO (a separate GCN event)

Bugfix

  • Fixed the log cleanup script clearing only the newest logs and failing to clear historical logs
  • Fixed data loss for database and table names containing Chinese characters, where garbled parsing caused the data of those databases and tables to be filtered out
  • Fixed the data link blocking after a ScaleOut/In marking instruction was received: if the storage list was unchanged before and after the instruction, cluster rescheduling could not be triggered
  • Fixed a parse failure in the polardbx-meta-cdc module when a table name contains backquotes, which triggered a syntax error and interrupted the data link
  • Fixed the meta module not removing topology information for drop database and drop table SQL, which caused memory leaks or dirty metadata
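
The column-order handling needed for instant add column can be pictured with a small Python sketch. It is a simplified illustration, not the CDC reformatting code: `reorder_row` is a hypothetical helper, and it assumes the logical and physical tables contain the same column names in different orders.

```python
from typing import Any, Dict, List, Sequence


def reorder_row(physical_columns: Sequence[str], physical_values: Sequence[Any],
                logical_columns: Sequence[str]) -> List[Any]:
    """Return the row values arranged in logical column order."""
    if len(physical_columns) != len(physical_values):
        raise ValueError("column/value count mismatch")
    by_name: Dict[str, Any] = dict(zip(physical_columns, physical_values))
    missing = [c for c in logical_columns if c not in by_name]
    if missing:
        raise KeyError(f"columns missing from physical row: {missing}")
    return [by_name[c] for c in logical_columns]


# The physical table stores 'age' before 'name'; the logical schema does not.
logical_row = reorder_row(
    physical_columns=["id", "age", "name"],
    physical_values=[1, 30, "alice"],
    logical_columns=["id", "name", "age"],
)
```

Mapping by column name rather than by position is what keeps the emitted row aligned with the logical schema that downstream consumers see.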