T-SQL Tuesday #031 – Logging

by Greg M Lucas on 12 June 2012

TSQL Tuesday 12-Jun-2012

This month’s T-SQL Tuesday is hosted by Aaron Nelson (blog | twitter) with a theme of “logging”.  Although Aaron’s definition of this topic is deliberately broad, I thought I would go with a more typical definition and talk about the approach I use for logging the actions of large multi-step stored procedures.

Some of the work I do revolves around supporting and/or refactoring some pretty large ETL routines where often much of the work is carried out or at least managed by a single stored procedure which can have many, many individual steps.  In the course of this type of work I’ve developed an approach to logging that has culminated in Log4TSql, an open source logging and exception handling library for SQL2005+.  I’ve blogged previously about the exception handling part of this framework for T-SQL Tuesday #024 and this month’s T-SQL Tuesday theme leads nicely into the other half of Log4TSql.

One of the challenges in logging the actions of a procedure that has so many steps is what to do with the information.  Storing each individual log entry in a permanent table means increasing the pressure on disk resource during what might already be a very I/O intensive process.  And if some or all of the load steps are wrapped in an explicit transaction and the code hits an error and needs to rollback, then you also lose the information that might help diagnose the very reason that the problem occurred in the first place.  One solution to the rollback problem lies in using a table variable to store the log items which can then be permanently stored after any transaction handling has completed but that approach can still have an I/O impact whilst the main code is processing.

So the approach I prefer is to use a varchar(max) variable within the procedure being logged and append each log entry to that, separated by some sort of line feed. This has the advantage of not being affected by any explicit transaction handling and also under normal circumstances has no disk I/O overhead.  I like to separate each line with a carriage return to make it easier to read later.  We will see how Log4TSql makes it simple to retrieve that information in an easy to read format later in this article.

Probably the best way to demonstrate the approach I am proposing is by example.  Most of the local variables used in the sample procedure below should be fairly self-explanatory but I’ve added a few thoughts about some of them here.

  • @_Step – I always name each individual step in a multi-step procedure. This facilitates simpler logging using copy & paste code blocks and also makes it easy to identify the point of failure in case of error without having to use multiple TRY…CATCH blocks.
  • @_Message – This is a succinct completion message indicating either that the process failed and why or that the process completed successfully.
  • @_ErrorContext – This is used within the CATCH block to state what the code was doing at the time an exception occurred and is passed to Log4TSql’s ExceptionHandler procedure
  • @_SprocStartTime – The time the stored procedure started executing (this is then used at the end of processing to record the total run time)
  • @_StepStartTime – Set at the beginning of each individual step within the procedure and is used to record the run time of that step
  • @_ProgressText – This is where we store the outcome of each individual step, number of rows, duration etc. Think of this like a log file, where each outcome is appended as a line of text.

Also, a quick style note: I really like the fact that in SQL2008 we gained the ability to declare and populate a variable in a single statement thus DECLARE @_Error int = 0; so now I tend to use a similar single-line style when writing backwards compatible code for SQL2005 which goes like this: DECLARE @_Error int; SET @_Error = 0;.

The aim of this type of logging is to record as much information as possible with minimal impact on performance. I like to include the number of rows affected and the duration of each step and will sometimes even include information from each iteration of a loop or cursor. This can help in identifying what was happening in the build up to any failure as well as being able to quickly tell historically which step is the bottleneck if performance suddenly falls off a cliff.

One thing to be really careful of when appending new lines to @_ProgressText is that you protect against NULL values as one NULL may wipe out all your logging information (depending on your “concat null yields null” setting).  Although, the ability to turn CONCAT_NULL_YIELDS_NULL off is more or less deprecated and in future versions trying to do so may generate an error so the sooner we learn to write defensively for NULLs, the better.

This might seem like a lot of extra code to write so whenever possible, I try to use the following format for each step, which means that I can copy & paste the same block as a starting point then just change the value of @_Step and write the actual code.

SET @_Step = 'Do something meaningful';
SET @_StepStartTime = GETDATE();

--!
--! Do the work here
--!

SET @_RowCount      = @@ROWCOUNT;
SET @_ProgressText  = @_ProgressText + @NEW_LINE
        + 'Step: [' + @_Step + '] processed ' + COALESCE(CAST(@_RowCount AS varchar), 'NULL')
        + ' row(s) in ' + log4.FormatElapsedTime(@_StepStartTime, NULL, 3)

If you want to actually run this code, you can download the code for the [TSqlTuesday031LoggingExample] procedure here.  You will also need to install the current version of Log4TSql into the database you are using for this demonstration.

CREATE PROCEDURE [dbo].[PointlessStoredProcedure]
(
  @SomeDate      datetime
, @RowsAffected  int = NULL  OUT
)
AS
BEGIN
    WAITFOR DELAY '00:00:00.636';
    SET @RowsAffected = 999;
    RETURN (0);
END
GO

IF OBJECTPROPERTY(OBJECT_ID(N'[dbo].[TSqlTuesday031LoggingExample]'), N'IsProcedure') = 1
    DROP PROCEDURE [dbo].[TSqlTuesday031LoggingExample];
GO

CREATE PROCEDURE [dbo].[TSqlTuesday031LoggingExample]
(
  @JobName                       nvarchar(128)  = NULL
, @SomeDate                      datetime       = NULL
, @NumberOfPointlessInsertsToDo  int            = 250
)

AS

BEGIN
    SET NOCOUNT ON

    --! Standard/ExceptionHandler variables (use SETs to keep it SQL2005 compatible)
    DECLARE @_Error int; SET @_Error = 0;
    DECLARE @_RowCount int;
    DECLARE @_Step varchar(128);
    DECLARE @_Message nvarchar(512);
    DECLARE @_ErrorContext nvarchar(512);

    --! JournalWriter variables (use SETs to keep it SQL2005 compatible)
    DECLARE @_FunctionName varchar(255); SET @_FunctionName = OBJECT_NAME(@@PROCID);
    DECLARE @_SprocStartTime datetime; SET @_SprocStartTime = GETDATE();
    DECLARE @_StepStartTime datetime;
    --!
    --! We will use @_ProgressText like a log file, writing lines of text
    --! separated by a line feed
    --!
    --! NB: INSTANTIATE @_ProgressText PROPERLY ON FIRST USE
    --! If this variable ends up as NULL at any point, you will lose all your
    --! logged info
    --!
    DECLARE @_ProgressText nvarchar(max); SET @_ProgressText = '';
    DECLARE @NEW_LINE char(1); SET @NEW_LINE = CHAR(10);

    --!
    --! If some of the input parameters for this procedure are time-sensitive,
    --! it can often be helpful to record what the inputs were at run time
    --!
    --! NOTE: With the exception of @_FunctionName and @_SprocStartTime which
    --! we've set within our code and have absolute control over, we use
    --! COALESCE() or ISNULL() for everything else.  We don't want one
    --! accidental NULL wiping out our carefully collected log detail.
    --!
    SET @_ProgressText = @_FunctionName + ' starting at ' + CONVERT(char(23), @_SprocStartTime, 121) + ' with inputs: '
            + @NEW_LINE + '    @JobName                      : ' + COALESCE(@JobName, 'NULL')
            + @NEW_LINE + '    @SomeDate                     : ' + COALESCE(CONVERT(char(11), @SomeDate, 113), 'NULL')
            + @NEW_LINE + '    @NumberOfPointlessInsertsToDo : ' + COALESCE(CAST(@NumberOfPointlessInsertsToDo AS varchar), 'NULL')
            + @NEW_LINE

    --!
    --! Within the TRY...CATCH... block below, we have a series of steps to
    --! simulate the multiple steps in a data load or other complex process.
    --!
    BEGIN TRY
        --! Use an explicit txn as we have multiple steps which must
        --! either all complete or all fail
        BEGIN TRAN;

        --!
        --! As we have multiple steps, we need to identify each step
        --! starting by checking that our inputs are valid. The value
        --! of @_Step can be used to simplify log collection and is
        --! also used in case of an error so we can record exactly
        --! where the exception occurred.
        --!
        SET @_Step = 'Validate inputs';

        IF @JobName IS NULL
            RAISERROR('Found invalid Job Name: %s', 16, 1, @JobName);

        IF @SomeDate IS NULL
            BEGIN
                SET @_Message    = 'Found some invalid date: ' + COALESCE(CONVERT(char(11), @SomeDate, 113), 'NULL');
                RAISERROR(@_Message, 16, 2);
            END

        IF ISNULL(@NumberOfPointlessInsertsToDo, 0) NOT BETWEEN 10 AND 250
            SET @NumberOfPointlessInsertsToDo = 250;

        SET @_Step = 'Define working values';

        DECLARE @pointlessDividend float; SET @pointlessDividend = 999.000999;

        --!
        --! By the time we get here we have validated our critical inputs and
        --! defined some working variables.  As these may be time-sensitive,
        --! we can record them here in the same way that we did for the inputs
        --!
        --! NOTE: That from this point on we must append new information to the
        --! existing value of @_ProgressText and always remember to use ISNULL()
        --! or COALESCE() against every variable to ensure we don't wipe out
        --! the log record with a NULL.
        --!
        SET @_ProgressText  = @_ProgressText
                            + @NEW_LINE + 'Working values: '
                            + @NEW_LINE + '    @NumberOfPointlessInsertsToDo : ' + COALESCE(CAST(@NumberOfPointlessInsertsToDo AS varchar), 'NULL')
                            + @NEW_LINE + '    @pointlessDividend            : ' + COALESCE(CAST(@pointlessDividend AS varchar), 'NULL')
                            + @NEW_LINE

        --!
        --! Name the step and also the time the step started
        --! so we can record the duration of this step later
        --!
        SET @_Step = 'Pointless Inserts Loop';
        SET @_StepStartTime = GETDATE();

        --! Put some dummy time wasting in here
        WAITFOR DELAY '00:00:04.045';

        DECLARE @PointlessTable table (CycleNumber int, Narrative varchar(64));
        DECLARE @CycleNumber int, @Narrative varchar(64);
        DECLARE @LoopCount int, @InsertCount int;

        SELECT @CycleNumber = 1, @LoopCount = 0, @InsertCount = 0;

        SET @_ProgressText  = @_ProgressText + @NEW_LINE + 'Starting Step: [' + @_Step + ']... ';

        WHILE @CycleNumber <= @NumberOfPointlessInsertsToDo
            BEGIN
                SET @LoopCount = @LoopCount + 1;

                IF @CycleNumber % 2 = 0
                    BEGIN
                        SET @Narrative = 'Now processing cycle number ' + CAST(@CycleNumber AS varchar(16)) + '...';

                        INSERT @PointlessTable VALUES(@CycleNumber, @Narrative);

                        SET @InsertCount = @InsertCount + 1;

                        --! We can even log each step with a loop or cursor
                        SET @_ProgressText  = @_ProgressText + @NEW_LINE
                                + ' - Inserting row: ' + COALESCE(@Narrative, 'NULL');
                    END

                SET @CycleNumber = @CycleNumber + 1;
            END

        --!
        --! Append more information about what we did as a new line, remembering to
        --! guard against any NULLs
        --!
        --! We use log4.FormatElapsedTime() to get a nicely formatted duration string
        --! Inputs are: start time and end time (which will default to GETDATE())
        --! plus the number of seconds duration at which to show whole seconds.
        --! So if we pass in 3 anything up to 3999 ms will appear as "1345 milliseconds"
        --! and if 4 secs or mor as "1 min(s) and 22 sec(s)"
        --!
        SET @_ProgressText  = @_ProgressText + @NEW_LINE + @NEW_LINE
                            + 'Inserted ' + COALESCE(CAST(@InsertCount AS varchar), 'NULL') + ' pointless row(s)'
                            + ' from ' + COALESCE(CAST(@LoopCount AS varchar), 'NULL') + ' even more pointless iterations'
                            + ' in ' + log4.FormatElapsedTime(@_StepStartTime, NULL, 3)
        --!
        --!
        --!
        SET @_Step = 'Copy pointless rows';
        SET @_StepStartTime = GETDATE();

        --! Put some more time wasting in here
        WAITFOR DELAY '00:00:02.450';

        SELECT CycleNumber, Narrative INTO #pointless FROM @PointlessTable

        SET @_RowCount = @@ROWCOUNT;

        --!
        --! This is just a simple recording of rowcount and duration
        --! If you had a lot of steps in turn, you could use this generic format
        --! over and over again. Because we define the value of @_Step for each
        --! step, the logging is just a copy and paste of this boiler plate code
        --!
        SET @_ProgressText  = @_ProgressText + @NEW_LINE
                            + 'Step: [' + @_Step + '] processed ' + COALESCE(CAST(@_RowCount AS varchar), 'NULL')
                            + ' row(s) in ' + log4.FormatElapsedTime(@_StepStartTime, NULL, 3)

        --!
        --! Use Itzik Ben-Gan's virtual numbers table to generate some activity
        --! http://www.sqlmag.com/article/sql-server/virtual-auxiliary-table-of-numbers
        --!
       SET @_Step = 'Even more pointless activity';
        SET @_StepStartTime = GETDATE();
        WITH
              N0 AS (SELECT 1 AS n UNION ALL SELECT 1)
            , N1 AS (SELECT 1 AS n FROM N0 t1 CROSS JOIN N0 t2)
            , N2 AS (SELECT 1 AS n FROM N1 t1 CROSS JOIN N1 t2)
            , N3 AS (SELECT 1 AS n FROM N2 t1 CROSS JOIN N2 t2)
            , N4 AS (SELECT 1 AS n FROM N3 t1 CROSS JOIN N3 t2)
            , N5 AS (SELECT 1 AS n FROM N4 t1 CROSS JOIN N4 t2)
            , N6 AS (SELECT 1 AS n FROM N5 t1 CROSS JOIN N5 t2)
            , nums AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS num FROM N6)
        SELECT
            num AS [Number]
        INTO
            #morePointless
        FROM
            nums
        WHERE
            num <= 1000000

        SET @_RowCount        = @@ROWCOUNT;
        SET @_ProgressText  = @_ProgressText + @NEW_LINE
                            + 'Step: [' + @_Step + '] processed ' + COALESCE(CAST(@_RowCount AS varchar), 'NULL')
                            + ' row(s) in ' + log4.FormatElapsedTime(@_StepStartTime, NULL, 3)

        --!
        --! Final pointless step
        --! An example of how a procedure might be called to do something and
        --! report back the number of rows processed which can be used by the
        --! same boiler-plate block of code
        --!
        SET @_Step = 'Call a pointless sproc';
        SET @_StepStartTime = GETDATE();

        EXEC dbo.PointlessStoredProcedure @SomeDate, @_RowCount OUT;

        SET @_ProgressText  = @_ProgressText + @NEW_LINE
                            + 'Step: [' + @_Step + '] processed ' + COALESCE(CAST(@_RowCount AS varchar), 'NULL')
                            + ' row(s) in ' + log4.FormatElapsedTime(@_StepStartTime, NULL, 3)

        IF @@TRANCOUNT > 0 COMMIT TRAN;
        --!
        --! If we get here everything was successful so generate a useful
        --! success message.  We have to populate @_Message as a minimum
        --! for logging to be at all useful
        --!
        SET @_Message   = 'Successfully completed all steps, processing '
                        + COALESCE(CAST(@_RowCount AS varchar(16)), 'NULL') + ' pointless row(s)'
                        + ' for Some Date: ' + COALESCE(CONVERT(char(11), @SomeDate, 113), 'NULL');
    END TRY
    BEGIN CATCH
        --!
        --! Capture the value of @_Step at the point of failure and any other
        --! valuable info
        --! NB: Always use ISNULL() or COALESCE() to ensure that any unexpectedly
        --! missing values to do not prevent the rest of the info being collected
        --!
        SET @_ErrorContext    = 'Failed to demonstrate logging for t-SQL Tuesday #031 at step: ' + COALESCE('[' + @_Step + ']', 'NULL')
                                + ' for Some Date: ' + COALESCE(CONVERT(char(11), @SomeDate, 113), 'NULL');

        IF ABS(XACT_STATE()) = 1
            BEGIN
                ROLLBACK TRAN;
                SET @_ErrorContext = @_ErrorContext + ' (Rolled back all changes)';
            END

        --!
        --! Call ExceptionHandler supplying the context under which the error
        --! ocurred and collecting the error number and formatted error message.
        --!
        EXEC log4.ExceptionHandler
                  @ErrorContext  = @_ErrorContext
                , @ErrorNumber   = @_Error OUT
                , @ReturnMessage = @_Message OUT;
    END CATCH

--/////////////////////////////////////////////////////////////////////////////////////////////////
OnComplete:
--/////////////////////////////////////////////////////////////////////////////////////////////////

    --! Clean up
    IF OBJECT_ID(N'tempdb..#pointless') > 0 DROP TABLE #pointless;
    IF OBJECT_ID(N'tempdb..#morePointless') > 0 DROP TABLE #morePointless;

    --!
    --! Use log4.FormatElapsedTime() to get a nicely formatted complete run time
    --! (as opposed to step duration) in a nicely formatted string
    --!
    IF @_Error = 0
        BEGIN
            SET @_Step            = 'OnComplete'
            SET @_Message        = COALESCE(@_Message, @_Step)
                                + ' in a total run time of ' + log4.FormatElapsedTime(@_SprocStartTime, NULL, 3)
            SET @_ProgressText  = @_ProgressText + @NEW_LINE + @_Message;
        END
    ELSE
        BEGIN
            SET @_Step            = COALESCE(@_Step, 'OnError')
            SET @_Message        = COALESCE(@_Message, @_Step)
                                + ' after a total run time of ' + log4.FormatElapsedTime(@_SprocStartTime, NULL, 3)
            SET @_ProgressText  = @_ProgressText + @NEW_LINE + @_Message;
        END

    --!
    --! We call JournalWriter outside of the scope of any transaction handling
    --! to try and ensure that we persist any useful info (this is naturally
    --! dependant on whether the caller has wrapped us in it's own transaction)
    --!
    EXEC log4.JournalWriter
              @Task                = @JobName
            , @FunctionName        = @_FunctionName
            , @StepInFunction    = @_Step
            , @MessageText        = @_Message
            --! Supply all the progress info after we've gone to such trouble to collect it
            , @ExtraInfo        = @_ProgressText

    --! Finally, throw any exception that will be detected by the caller
    IF @_Error > 0 RAISERROR(@_Message, 16, 99);

    SET NOCOUNT OFF;

    --! Return the value of @@ERROR (which will be zero on success)
    RETURN (@_Error);
END

After running multiple steps and appending the outcome of each to @_ProgressText, we must always populate @_Message with some final completion state (be that success or failure) and optionally, the duration of the entire process. Once we have all that information we pass it to log4.JournalWriter where @_Message along with most of the other values get stored in a table called log4.Journal and the value of @_ProgressText gets stored in log4.JournalDetail.

If you’ve got as far as adding all this code into a test database, you can try running the procedure a couple of times like this.  The first execution will generate an error but both attempts will log detailed information.

EXEC [dbo].[TSqlTuesday031LoggingExample]
  @JobName   = 'TSQLTuesday #031'
, @SomeDate  = NULL
GO

DECLARE @date datetime; SET @date = '20020202 22:22:22'
EXEC [dbo].[TSqlTuesday031LoggingExample]
  @JobName                       = 'TSQLTuesday #031'
, @SomeDate                      = @date
, @NumberOfPointlessInsertsToDo  = 12
GO

You can use EXEC log4.JournalReader; to review the most recent Journal entries which might look like this…

Results for log4.JournalReader call

This gives us an overview of the tasks that have been logged, the procedures involved in those tasks, the outcome and duration of each procedure call. If we want to look at a particular run in more detail we can query the [JournalDetail] table to access the progress information we captured at each step within the above procedure. For example…

SELECT * FROM log4.JournalDetail WHERE JournalId = 13;

If, like me, you normally view SSMS results in grid mode, you will end up with a result like this…

SELECT from JournalDetail with results to GRID

This is not very helpful as any carriage returns we added to make the log easy to read are not visible in grid mode.  Running the same query with “results to text” would solve this but then I have to remember to keep switching between grid and text mode. A cleaner alternative is to use another Log4TSql module: EXEC log4.JournalPrinter 13; which prints the results out correctly formatted like this:

Output from log4.JournalPrinter call

We can now review the activity and run times for each step within our single procedure call in a nice, user-friendly format.

Incidentally [JournalPrinter] makes use of a stored procedure called [PrintString] which is part of Log4TSql and is designed to overcome the 8000 character limit in SSMS result sets and also the length limititions of the PRINT command in SQL Server. If you want to know more about this I’ve blogged about it here.

I know that this level of logging may be overkill in some situations but in others it can be a real time saver.

Given the choice, the way I prefer to write these monolithic procedures is to break out each logical unit of work into its own procedure – for example the DELETE from and INSERT into a single table.  I like to think of each of these as “workers” and then all the workers get called in sequence by a single “manager” procedure.  If using a pattern like this, you might ask why each worker isn’t responsible for its own logging.  There are two reasons why I don’t do it that way; 1) you again run the risk of losing log information if transactions are rolled back by the manager and 2) the simpler each worker procedure is, the easier it is to maintain and, if necessary, tune.

Just as a side note, if I am using the manager-worker pattern, each worker gets called like this.  Note that the worker sproc always provides the number of rows affected as an output parameter so that the manager sproc can utilise that information in the log record.

SET @_Step = 'Call a pointless sproc';
SET @_StepStartTime = GETDATE();

EXEC dbo.PointlessStoredProcedure @SomeDate, @_RowCount OUT;

SET @_ProgressText  = @_ProgressText + @NEW_LINE
        + 'Step: [' + @_Step + '] processed ' + COALESCE(CAST(@_RowCount AS varchar), 'NULL')
        + ' row(s) in ' + log4.FormatElapsedTime(@_StepStartTime, NULL, 3)

 

Log4TSql is an open source logging framework for SQL Server 2005+ written by Greg M Lucas on behalf of data-centric solutions ltd and licensed for public use under the GNU Lesser General Public License. You can download the latest version of this library along with all the source code from sourceforge. Please feel free to use and modify the library as you wish under the terms of the license. I am always interested in hearing how other people or organisations have put this framework to use.

Comments on this entry are closed.

Previous post:

Next post: