A Technique for Auditing User Initiated Batch Updates on SQL Server

Overview

Auditing of user actions is a common business requirement. Typically users change data on a record-at-a-time basis using an interface that allows maintenance of a single record. In this case each change results in a single entry in an audit log. Sometimes, however, a user will initiate a batch job that results in a change to many records at a time. This document describes a design pattern for auditing in the latter situation.

To distinguish this from other auditing techniques I will call this snapshot auditing.

High Level Design

The snapshot audit design pattern uses a single generic table to hold a record of the user action and a snapshot table for each table that is being audited.

In the examples that follow the single table is called Ctrl.LogSnapshot and the individual snapshot tables are called Ctrl.LogSnapshot<tableName> where <tableName> is the name of the table being audited.

A single generic stored procedure call can be made to create the required log entries in both tables.

The same stored procedure also implements a simple archiving. Currently snapshots are removed when they are over one month and at least three more recent snapshots exist for the same table.

Example

For example, suppose a user is making changes to a product mapping table dbo.MyProductMapping. The table maps an external product pack code to an internal product code, which then allows processing and matching external transactions with the internal product database. The batch job first validates all codes, and if everything is OK it updates MyProductMapping and creates the logs.

We have 2 tables to hold the logging:

  • Ctrl.LogSnapshot – holds a record for each user action
  • Ctrl.LogSnapshotMyProductMapping – holds a copy of the entire dbo.MyProductMapping table for each snapshot

The relationship between these tables is shown in the following diagram – (although these relationships should not be checked on the database because of the necessary archiving of the snapshot logs described later).

Data from LogSnapshot

Snapshot ID Snapshot Time User Table Message
12 2011-06-17 17:02 USER1 ProductMarketHierachy Loaded file ProductMarket.xls
13 2011-06-17 17:34 USER1 MyProductMapping loaded file ExportMappedProductsFixed.xls
14 2011-06-20 11:21 USER2 PolicyTolerance loaded file ExportTolerances.xls
15 2011-06-20 11:30 USER2 PolicyTolerance Loaded file NewTolerances.xls
16 2011-06-20 11:33 USER3 PolicyTolerance Loaded file NewTolerances .xls
17 2011-06-20 11:33 USER2 PolicyTolerance loaded file NewTolerances.xls
18 2011-06-20 11:51 USER4 CustomerMapping loaded file ExportMappedCustomers.xls
19 2011-06-20 11:55 USER1 MyProductMapping loaded file ExportMappedProductsFixed.xls

Data from LogSnapshotMyProductMapping

This table only contains data from snapshots #13 and #19.

Snapshot ID Rown Number Pack Code Internal Code
13 1 AAX11 100391
13 2 AAY11 100392
13 3 BBC22 100393
….
19 1 AAX11 100391
19 2 AAY11 100392
19 3 BBC22 100395
19 4 CCZ22 100396

An advantage of snapshot logging is that it is decoupled from the actual updates themselves.

Other logging methods to which this can be compared include Triggers and Envers.

Here, Triggers, means the creation of database triggers on the target table so that INSERT/UPDATE or DELETE statements result in the creation of an audit entry.

Envers (http://docs.jboss.org/hibernate/envers/3.6/reference/en-US/html_single/) is an elegant auditing solution that is part of the Hibernate ORM tool.

Triggers auditing suffers from several problems:

  1. Unwanted log entries are created if the table being audited is the target of a bulk update that should not be logged (for instance replication or data synchronisation).
  2. The business user is often not available to the trigger because the connection made using some generic principal.
  3. It is not usually possible to have a separate central logging table that holds a record per transaction or batch

Envers only works for updates that occur through Hibernate, hence is not applicable to batch updates that use stored procedures.

The Stored Procedure

Here are the highlights from the generic stored procedure that does the work.

First how is it called?

A simpler test from T-SQL is:

EXEC usp_CreateLogSnapshot

‘MYDOM\USER01’, ‘Loaded file MyProductMappingTest.xls’, dbo.’, ‘MyProductMapping’, null, null;

SELECT
*
FROM Ctrl.LogSnapshot

SELECT
*
FROM Ctrl.LogSnapshotMyProductMapping

So, what is in the procedure?

ALTER
PROCEDURE [dbo].[usp_CreateLogSnapshot](

    @LoginName as
varchar(50),

    @message varchar(100),

    @SourcePrefix varchar(100)=,

    @SourceTable varchar(100)=null,

    @ExcludeColumns varchar(1000)=null,    — comma separated list of cols to exclude from copy

    @WhereClause varchar(8000)=null

)

AS

/******************************************************************************

**    Creates a log record in Ctrl.LogSnapshot

** Copies a snapshot of records from table @SourcePrefix.@SourceTable to LogSnapshot+@SourceTable

**        i.e. if @SourceTable is MyProductMap then the records are copied to Ctrl.LogSnapshotMyProductMap

**        Return values: none

*************************************************************************/

The first part of the procedure sets up some variable to hold the dynamic SQL and checks for the existence of the target logging table.

    SET
NOCOUNT
ON;

    DECLARE @SnapshotID int;

    DECLARE @tablename sysname, @fulltablename sysname, @objectid int;

    SELECT @tablename =
‘LogSnapshot’+@SourceTable;    — This is the table that is populated with a new snapshot

    SELECT @fulltablename =
‘Ctrl.[‘+@tablename + ‘]’;        — This is the table that is populated with a new snapshot

    SELECT @objectid =
object_id(@fulltablename);

    IF @objectid IS
NULL

    BEGIN

        RAISERROR(‘The table %s does not exist. It needs to be created in order for the log snapshot to work’, 16, 1, @fulltablename);

        RETURN;

    END

The main part of the procedure is within a try catch statement whose purpose is to abort and roll back everything and re-raise the error.

    DECLARE    @SQLString varchar(max),

            @ColList varchar(max);

    BEGIN
TRY

        — Wrap extra commas at start and end of exclude column list

        SELECT @ExcludeColumns =
‘,’
+
coalesce(@ExcludeColumns,)
+ ‘,’;

        BEGIN
TRANSACTION;

— The main insert in Ctrl.LogSnapshot is ordinary SQL

        INSERT
INTO Ctrl.LogSnapshot(SnapshotDateTime,SnapshotUser,SnapshotTable,[Message])

        SELECT
getdate(), @LoginName, @SourceTable, @message;

— @SnapshotID will be used in the dynamic insert

SELECT @SnapshotID=SCOPE_IDENTITY();

— Dynamic SQL generation

— Use dynamic SQL to generate INSERT INTO @fulltablename

        SET @ColList =

        — Get column list and create insert statement

        SELECT @ColList = @ColList +
‘[‘
+ syscolumns.name +
‘],’

        FROM dbo.sysobjects

        JOIN dbo.syscolumns

        ON sysobjects.id = syscolumns.id

        WHERE sysobjects.xtype =
‘U’
and sysobjects.name = @tablename

            and syscolumns.iscomputed<>1 and syscolumns.status<>128

            and @ExcludeColumns NOT
LIKE
‘%,’
+ syscolumns.name +
‘,%’

            and syscolumns.name <>
‘SnapshotID’

        ORDER
BY syscolumns.colorder;

        — Strip last ‘,’

        SET @ColList =
left(@ColList,len(@ColList)-1);


        SET @SQLString = N‘INSERT INTO ‘
+ @fulltablename

            +
char(13)
+
‘ (SnapshotID,’
+ @ColList +
‘)’

            +
char(13)
+
‘ SELECT ‘
+
convert(varchar(10),@SnapshotID)
+
‘,’

            +
char(13)
+ @ColList

            +
char(13)
+
‘ FROM ‘
+ @SourcePrefix +
‘[‘+ @SourceTable + ‘] ‘
+
coalesce(@WhereClause,);

        — Execute the dynamic Insert statement

        EXEC
(@SQLString);

–Archiving

— This removes old entries from @fulltablename

— At the moment the policy is thus

— We want to keep least 3 newer snapshots

— We must keep all snapshots for at least 1 month

        DECLARE @NewestSnapshotIdToArchive int;

        SELECT @NewestSnapshotIdToArchive =
max(Ranked.LogSnapshotID)

        FROM

            (

                SELECT LogSnapshotID,
Rank()
over(ORDER
BY LogSnapshotID desc) as
Rank, SnapshotDateTime

                FROM Ctrl.LogSnapshot

                WHERE SnapshotTable = @SourceTable

            ) Ranked


        WHERE

            — must be at least 3 newer snapshots

            Ranked.Rank > 3

            — and must be at least 1 month old

            AND Ranked.SnapshotDateTime <=
dateadd(month,
1, getdate());

        IF @NewestSnapshotIdToArchive IS
NOT
NULL

        BEGIN

            SELECT     @SQLString=‘DELETE FROM ‘ + @fulltablename +
‘ WHERE SnapshotID <= ‘
+
convert(varchar(10),@NewestSnapshotIdToArchive);

            PRINT
‘Delete Statement:’;

            PRINT @SQLString;

            — Execute Delete statement

            EXEC
(@SQLString);

        END

        COMMIT;

        RETURN 0;

    END
TRY

    BEGIN
CATCH

        IF
(XACT_STATE())
IN
(-1,1)

            ROLLBACK
TRAN;


     EXEC dbo.usp_RethrowError;

     RETURN 1;

    END
CATCH

About renatobuda

I am a software developer and sometime dba, business intelligence analyst and data wrangler.
This entry was posted in Data Focus and tagged , , , , . Bookmark the permalink.

Leave a comment