Change Data Capture in SQL Server

Change Data Capture in SQL Server

Change data capture in SQL Server is a feature that is introduced in SQL Server 2008. And is called CDC (Change Data Capture). CDC helps SQL Server developers to archiving and capturing data changes without any additional programming. CDC only tracks changes in user-created tables. Because captured data is then stored in relational tables, it can be easily accessed and retrieved.

When you enable Change Data Capture (CDC) features on a database table, a mirror of the original table is created with the same column structure, but with additional columns that include the metadata used to summarize the nature of the change in the database table row.

Enabling Change Data Capture on a Database

CDC first has to be enabled for the database. Because CDC is a table-level feature, it then has to be enabled for each table to be tracked. You can run following query and check whether it is enabled for any database.

–Check for which database CDC Enabled or not–
USE master 
SELECT [name], database_id, is_cdc_enabled  
FROM sys.databases       

DC- Change Data Capture

To enable CDC on database level you have to run following TSQL:

USE Hightechnology 
EXEC sys.sp_cdc_enable_db 

Additionally, in the database Hightechnology, you will see that a schema with the name ‘cdc’ has now been created.

Some System Tables will have been created within the Hightechnology database as part of the cdc schema.

le Change Data Capture in SQL Server

The table which have been created are listed here.

  1. cdc.captured_columns – This table returns result for list of captured column.
  2. cdc.change_tables – This table returns list of all the tables which are enabled for capture.
  3. cdc.ddl_history – This table contains history of all the DDL changes since capture data enabled.
  4. cdc.index_columns – This table contains indexes associated with change table.
  5. cdc.lsn_time_mapping – This table maps LSN number (for which we will learn later) and time.

Enabling Change Data Capture on one or more Database Tables

USE Hightechnology 
EXEC sys.sp_cdc_enable_table 
@source_schema = N’dbo’, 
@source_name   = N’EMP’, 
@role_name     = NULL 

Enable CDC on Table Level

On execute of above query it will create two jobs:

Job ‘cdc.Hightechnology_capture’ started successfully: When this job is executed it runs the system stored proceduresys.sp_MScdc_capture_job.  The procedure sys.sp_cdc_scan  is called internally bysys.sp_MScdc_capture_job. This procedure cannot be executed explicitly when a change data capture log scan operation is already active or when the database is enabled for transactional replication. This system SP enables SQL Server Agent, which in facts enable Change Data Capture feature.

Job ‘cdc.Hightechnology_cleanup’ started successfully: When this job is executed it runs the system stored procedure sys.sp_MScdc_cleanup_job. This system SP cleans up database changes tables.

On completion of CDC configuration, any change in data has been captured and can be monitored easily as shown below:

We have inserted a values in table EMP, that is tracked in CDC easily.

As you will see in below image there are five additional columns to the mirrored table:

  1. __$start_lsn
  2. __$end_lsn
  3. __$seqval
  4. __$operation
  5. __$update_mask

There are two values which are very important to us is __$operation and __$update_mask.

Column _$operation contains value which corresponds to DML Operations. Following is quick list of value and its corresponding meaning.

  • Delete Statement = 1
  • Insert Statement = 2
  • Value before Update Statement = 3
  • Value after Update Statement = 4

The column _$update_mask shows, via a bitmap,   which columns were updated in the DML operation that was specified by _$operation.  If this was  a DELETE or INSERT operation,   all columns are updated and so the mask contains value which has all 1’s in it. This mask is contains value which is formed with Bit values.

Change Data Capture in SQL Server