What is Change Data Capture (CDC) and how is it enabled in SQL Server

What is Change Data Capture (CDC) and how is it enabled in SQL Server
What is Change Data Capture (CDC) and how is it enabled in SQL Server

Change Data Capture known as CDC records that DML (insert, update, delete) action is performed on a table of a database. These changes are recorded in a corresponding table that is created. Through system functions we can find the changes made to records.

Which is used

The CDC it is very useful for ETL (Extraction, Transformation and Load) processes where we want a gradual loading of the data in a Data Warehouse.

In a Data Warehouse we are interested in depicting the changes over time in a different format, so a complete copy of the data would be delayed and would not meet the needs of the application.

However, it could also be used for necessities auditing but with the main disadvantage that it does not record SELECT which may take place.

Step by step activation with example

We have a table with few entries:

select * from customer;
What is Change Data Capture (CDC) and how is it enabled in SQL Server

To enable CDC on the database:

USE db_test
GO  
EXEC sp_changedbowner 'sqlowner' --Προαιρετικά
GO
EXEC sys.sp_cdc_enable_db;
GO 

We enable CDC in the panel we saw before:

USE db_test  
GO  
  
EXEC sys.sp_cdc_enable_table  
@source_schema = N'dbo',  
@source_name   = N'customer',  
@role_name     = N'cdc_viewer',  
@filegroup_name = null,  
@supports_net_changes = 1  ;
GO  

As soon as we activate it we will see that some have been created system boards (01) in the base, two new Jobs in SQL Server Agent (02) and finally one or two functions (03) (depending on the parameter you will read below).

What is Change Data Capture (CDC) and how is it enabled in SQL Server
01

What is Change Data Capture (CDC) and how is it enabled in SQL Server
02

What is Change Data Capture (CDC) and how is it enabled in SQL Server
03

In the parameter @role_name we define the role * that we want to give to users to have access to the data (of course they must also have right SELECT in the source table). If this parameter is not set then the data will be readable only by whoever has the role sysadmin or is db_owner on the basis.

*If the role does not exist, it is created automatically.

With the parameter @supports_net_changes = 1 we declare that we want the function to be created as well cdc.fn_cdc_get_net_changes. This allows us to see the records that changed in a certain time period (it returns a unique change for each change in that time period).

Let's do some DML actions (insert, update, delete) on the table:

insert into db_test..customer values
('Marios','Georgopoulos','Male','209285915');

update  customer set sex = 'Female' where id = 1;

delete from customer where id = 3;

To find the changes

To see the changes that have occurred in the table, Microsoft suggests the use of functions cdc.fn_cdc_get_all_changes and cdc.fn_cdc_get_net_changes that are created.

With the cdc.fn_cdc_get_all_changes we can see all the changes that have happened:

DECLARE @from_lsn binary(10), @to_lsn binary(10);
SET @from_lsn = sys.fn_cdc_get_min_lsn('dbo_customer') 
SET @to_lsn   = sys.fn_cdc_get_max_lsn(); 

SELECT * FROM [cdc].[fn_cdc_get_all_changes_dbo_customer](@from_lsn, @to_lsn, N' all update old');  
GO  
What is Change Data Capture (CDC) and how is it enabled in SQL Server
04

We see the changes we made before along with some extra columns with metadata. With the column $start_lsn we are told the commit LSN that performed the transaction. With the $seqval we can enqueue changes made in the same transaction. THE $update_mask reports in bits the columns that changed in case of update.

In $ operation report on the type of change made. More specifically:

  • 1 = delete
  • 2 = insert
  • 3 = the registration before the update
  • 4 = the registration after the update

But we can see with the function cdc.fn_cdc_get_net_changes that we mentioned before the changes that happened in happened in a certain period of time e.g. from the previous day to a certain time:

DECLARE @from_lsn binary(10), @to_lsn binary(10), @end_time datetime,@begin_time  datetime;
SET @begin_time = GETDATE()-1
SET @end_time = '2020-10-25 15:10:00.000'
SET @from_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than or equal', @begin_time);  
SET @to_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', @end_time);  

SELECT * FROM [cdc].[fn_cdc_get_net_changes_dbo_customer](@from_lsn, @to_lsn, N' all');  
GO  
What is Change Data Capture (CDC) and how is it enabled in SQL Server
05

We see that up to this time only the first insert of the changes we made to the table had been made.

To disable CDC on the panel

We can find which tables have CDC enabled with the field is_tracked_by_cdc in the system view sys.tables:

select * from sys.tables where is_tracked_by_cdc = 1

To disable it corresponding to enabling it:

USE db_test  
GO  
  
EXEC sys.sp_cdc_disable_table    
@source_schema = N'dbo',  
@source_name   = N'customer',  
@capture_instance = N'dbo_customer' ;

GO  

Sources:

Share it

Leave a reply