Audit trail with schema versioning for PostgreSQL using transaction-based logging
This is a bugfix release to v0.7
. To upgrade, use the UPGRADE_v07_to_v074.sql
script or run ALTER EXTENSION pgmemento UPDATE TO '0.7.4'
within the database after installing the extension. Have fun with this release. Please, provide feedback. Stay healthy and confident. Thank you!
This is a bugfix release to v0.7
. To upgrade, use the UPGRADE_v07_to_v073.sql
script or run ALTER EXTENSION pgmemento UPDATE TO '0.7.3'
within the database after installing the extension. Have fun with this release. Please, provide feedback. Happy new year and stay healthy. Thank you!
pgMemento uses custom session variables to help logging DDL events. In Postgres 14 these must start with a letter. Thanks to @ol-teuto for reflecting this change in the pgMemento code base (#91). Older releases will not work with Postgres 14.
pgmemento.start
will call reinit
if log params differ from audit_schema_log
(#93)This is a bugfix release to v0.7
. To upgrade, use the UPGRADE_v07_to_v072.sql
script or run ALTER EXTENSION pgmemento UPDATE TO '0.7.2'
within the database after installing the extension. Have fun with this release. Please, provide feedback. Stay healthy. Thank you!
audit_column_log
for RESTORE and REVERT (#86)This is a bugfix release to v0.7
. To upgrade, use the UPGRADE_v07_to_v071.sql
script or run ALTER EXTENSION pgmemento UPDATE TO '0.7.1'
within the database after installing the extension. Have fun with this release. Please, provide feedback. Stay healthy. Thank you!
v0.7 introduced SECURITY DEFINER on many logging functions to protect pgMemento's tables better. However, when calling log_transaction it is executed with the owner role hence logging the owner, not the calling user which we want to audit in the user_name
column of transaction_log
.
Finally, pgMemento can be installed as a PostgreSQL extension. The required files were implemented by Serge Latyntsev already back in beginning of 2019. The extension will also be published at PGXN.
Until now, a table event type (e.g. UPDATE) could only be logged once per transaction per table because of the way row-level changes were referenced to the event. v0.7 allows at least one event type per statement timestamp, i.e. multiple UPDATES at different times during one txid will become two entries in the table_event_log
.
Both table_event_log
and row_log
now store an event_key
consisting of the transaction and statement timestamps (in epoch format) as well as txid, table operation, name and schema. This allows for easy temporal partitioning of those tables. Note, the foreign key column event_id
in row_log
is gone as well as the foreign key constraint. Combinations of event_key
and audit_id
are unique there.
Understanding which values got introduced on INSERTs or UPDATEs required using the restore API, so far. But, getting old and new data deltas straight from one entry in the row_log
would be more convenient and cheaper. Therefore, logging new values can now be enabled optionally. The configured logging behavior can be queried from the audit_table_log
and audit_tables
view. This feature was a request by Lenz Weber from Mayflower who use pgMemento for auditing a GraphQL API.
So far, pgMemento was using the table's OID in table_event_log
and audit_table_log
to track events and changes. But, when data is restored from backup, table OIDs are not the same anymore. With this release only the table and schema names are used. The audit_table_log
uses a new log_id
column to trace tables in case they get renamed or recreated.
This is a new log table to show for which database schemas pgMemento is currently enabled. This is essential for newly created tables to check if auditing shall be started, which audit_id name to use and if new deltas shall be logged. It can be seen as a global configuration table for pgMemento.
Admittedly, audit_id
is quite a generic name for a tracer column. This release will switch to the more specific default pgmemento_audit_id
. But, you can now choose how this column should be called. The configured name is stored in the audit_schema_log
and audit_table_log
and it's also queryable from the audit_tables
view. It can be different for every table, but using existing columns, like primary keys, is not yet supported, though. Besides, the logging triggers were renamed, too. The log_*
prefix has been replaced with pgmemento_*
.
With the audit_schema_log
in place, it is also time to allow more table creation event tags.
Interactive scripts with prompts are fine, but the world of CI/CD wants something easily scriptable. New simple endpoints are provided to init
, pause (stop
, start
) and drop
pgMemento in schemas, bundled under the CTL.sql file. It's also possible to reconfigure pgMemento's behavior for an already audited schema calling the renint
endpoint.
INSERTs into these two log tables were previously done by the log_table_event
function. Now this is split up to allow for only logging transactions without events. This is for example done by the endpoints mentioned above to track when a schema has been initialized for auditing or changed. The chosen options are then stored in the session_info
field of the transaction_log
.
DROP AUDIT_ID events were treated like TRUNCATE events before, because the drop_table_audit
function let users only choose between logging the existing state or delete everything. But, in some scenarios like reinit
you neither want to produce logs nor want to drop the audit trail. Reverting such cases required to differentiate between recreating the audit_id column and inserting the data.
On the one hand, this is to show in the logs that this table has existed before and was brought back to life. On the other hand, it also helps to reassign the same log_id
value in the audit_table_log
to connect this table with its former history.
All functions writing to the log tables will be created with SECURITY DEFINER privileges so that normal users can work with the pgMemento API but cannot change the content of the log tables if this wasn't granted to them.
Many code changes to support quoted tables and columns could be reverted back by introducing a simple get_table_oid
function in the right places. Now the code is a lot cleaner again.
As the adsrc
column was removed from the pg_attrdef
system catalog in PostgreSQL 12, it had to be replaced with pg_get_expr(adbin, adrelid, TRUE)
. Plus, the -o
option has been removed from pg_dump which was used in extension tests.
The restore_record
function now simply returns the result from the restore_query
and doesn't check if it's NOT NULL. This lead to returning rows only if all columns are NOT NULL which is likely to be not true.
It could happen that the txid_range
in the audit_table_log
was not closed on RENAME TABLE events. This also lead problems of setup scripts using the audit_table_log
, e.g. to drop auditing from tables which do not exist anymore. Note, that the UPGRADE_v061_to_v07.sql
script fixes those open ranges before doing the upgrade.
Reverting ALTER COLUMN events did not log data because the DDL event parsing stripped out the word that triggers the data logging.
As you can guess from reading this changelog with all it's schema changes on the log tables many breaking changes come with this release. Please, use the provided UPGRADE_v061_to_v07.sql
script to move your pgMemento installation to the newest version. I'd recommend to test it first on a backup before applying it against a production database. Please, refer to the wiki page about upgrading to learn about the breaking changes in more detail. Have a look at the script as well, to see which functions are dropped explicitly because their signatures have changed.
Have fun with this release. Have fun with the pgMemento extension. Please, provide feedback. Stay healthy. Thank you!
This is for the most parts a bugfix release to v0.6
ADD/DROP AUDIT_ID
events. #18audit_id
column with a query is now prohibited #18audit_table_log
(appears after dump and restore) #23Breaking changes:
restore_record
function now returns SETOF INTEGER instead of INTEGER (too return 0 rows if given audit_id
is not in the logs) #16ADD AUDIT_ID
events will get op_id
21 instead of 2 in table_event_log
#18Dedicated to my grandmother, who turns 79 today :)
Highlights are marked:
transaction_id
column in table_event_log
now references transaction_log.id
instead of txid
column. Thus, cycling txids don't matter anymore at all!
transaction_log
tabletransaction_log
columns txid
and stmt_date
. Removed constraint transaction_log_unique_txid
.stmt_date
column in transaction_log
if now filled by transaction_timestamp()
instead of statement_timestamp()
because of new UNIQUE indextxid_range
columns in audit_table_log
and audit_column_log
from '[)'
to '(]'
SETUP:
transaction_log
tabletransaction_log
allowing clients to pass more session parameters to the logs
log_table_event
to log transactions and all types of table eventsaudit_id
columncolumn_array_to_column_list
to produce a string for jsonb_build_object
log_table_state
to log values of given columns or the whole tablelog_table_state
and log_schema_state
to log_table_baseline
and log_schema_baseline
log_table_baseline
get_txid_bounds_to_table
from table_name, schema_name
to table_oid
create_table_audit
creates an ADD AUDIT_ID
event which behaves like ADD COLUMN
(op_id
= 2)drop_table_audit
now provides a flag to decide if the table's audit trail should be kept or dropped. Additionally it creates a DROP AUDIT_ID
event which behaves like a TRUNCATE event (op_id
= 8)DDL_LOG:
op_id
= 12) and RENAME COLUMN (op_id
= 22) events
table_alter_pre_trigger
, so BEFORE the change instead of AFTERmodify_ddl_log_tables
is only called if a DDL event has been registered in advanceaudit_id
log_ddl_event
because of new log_table_event
functionRESTORE:
generate_log_entry/ies
function to restore_record/set
. They now return records instead of JSONB, but output can also be a JSONB record.restore_record_definition
to provide the column definition list as TEXT needed for the restore functionsrestore_value
and restore_change
to retrieve single values from JSONB logs with correct data typeREVERT:
restore_change
function to retrieve the previous version
LOG_UTIL:
update_key
and delete_key
get_column_list_by_txid
and get_column_list_by_txid_range
to get historic table layoutsjsonb_unroll_for_update
to create syntactically correct reverse UPDATE statementsaudit_table_check
function and moved it from RESTORE to LOG_UTILdelete_table_event_log
from table_name, schema_name
to table_oid
Highlights are marked:
audit_id
column of audited tables is no longer logged in audit_column_log
information_schema
SETUP:
audit_id
column is created with unique constraint. Additional audit_idx
BTREE index is no longer needed.op_id
values in table_event_log
to enable reverts of DDL statementsaudit_column_log
audit_table_log
audit_column_log
DDL_LOG:
modify_ddl_log_tables
does not need a ddl_action argument anymoreop_id
= 8) without executing a TRUNCATE commandREVERT:
revert_distinct_transaction(s)
functions
row_log
ID - better for self-referencing tablesLOG_UTIL:
delete_key
to remove keys from JSONB logs - useful for data correctionsVERSIONING:
SCHEMA_MANAGEMENT:
default_values_*
functions as this is satisfied with audit_column_log
This is a BIG release to me which includes many very important improvements (highlights are marked):
SETUP:
table_event_log
table (transaction_id, table_relid, op_id)audit_table_log
in order to log renamed versionstable_event_log
table: ALTER COLUMN
and DROP COLUMN
(see DDL_LOG changes)create_table_audit
to log existing data as inserted (log_table_state
)log_table_state
function from VERSIONING.sql to SETUP.sqllog_schema_state
is using the dependency depth from audit_tables_dependency
to mark INSERTs in correct orderLOG_UTIL:
audit_tables_dependency
to make reverts and other schema-based procedures more robust
get_txid_bounds_to_table
function to create a correct audit_tables
viewget_txid_bounds_to_table
to SETUP.sqljsonb_merge
to merge JSONB objects (very helpful for revert_distinct_transaction
)VERSIONING:
generate_log_entries
to look at a past table state without creating a TABLE or VIEW
jsonb_populate_record
can now be created with create_restore_template
to be used for on-the-fly restoresCREATE TABLE AS
command to avoid firing the event triggerREVERT:
audit_tables_dependency
view what makes them more robust against foreign key violationsrecover_audit_version
.jsonb_merge
trickDDL_LOG:
DROP [SCHEMA, TABLE, COLUMN]
or ALTER COLUMN ... TYPE ... USING
events occur
CREATE TABLE
events will not fire for temprary tablesCREATE TABLE
, DROP TABLE
and ADD COLUMN
events in transaction_log
and table_event_log
SETUP:
VERSIONING:
SCHEMA_MANAGEMENT:
REVERT:
LOG_UTIL:
DDL_LOG: