Advanced Database Unit Testing with tSQLt – Testing cross-database tables

by Greg M Lucas on 20 December 2011

This is the first in an occasional series of articles covering more advanced ways of getting the most out of the tSQLt database unit-testing framework. One of the features of this framework I really like is the ability to fake or mock a table. FakeTable temporarily renames the “production” table, then creates an empty copy with the original name but without any of the encumbrances associated with NOT NULL columns, foreign keys and other constraints. This makes it very easy to test a module that depends on that table without doing any more setup than necessary. However, for perfectly good reasons, this feature doesn’t support creating a fake table in another database. Recently, I’ve been doing some fairly major refactoring on an archiving process for a customer. As part of this refactoring, I needed to test that data is moved from one database to another (on the same instance), and this post discusses the options available for faking tables in other databases.
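As a minimal illustration of why that matters (the [Customer] table here is purely hypothetical), once a table has been faked, a test can insert only the columns it actually cares about without tripping over NOT NULL columns, keys or foreign keys defined on the real table:

EXEC tSQLt.FakeTable '[dbo].[Customer]';

-- The faked copy keeps the column definitions but loses every constraint,
-- so this succeeds even if the real [Customer] table has other NOT NULL
-- columns, a primary key and foreign keys.
INSERT [dbo].[Customer] (CustomerId, Surname)
VALUES (1, 'Smith');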

In this particular scenario, the original component contains some two dozen steps to move data from a set of tables in one database to a different set of tables in another (archive) database on the same instance. As part of an overall optimisation strategy, I introduced a sliding window partitioning scheme, which meant that the existing archive process needed to be substantially re-written, along with its supporting unit tests.

If you want to follow along with this example, you will need to create two empty databases, “LocalDB” and “OtherDB”. You will also need to install tSQLt in both databases (instructions here).
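For reference, a rough sketch of the server-level preparation that was typically needed at the time is shown below; the exact steps depend on your tSQLt version, so check the installation instructions that ship with the framework:

-- tSQLt relies on a CLR assembly, so CLR integration must be enabled on the instance
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;

-- Older versions of tSQLt also required each test database to be marked trustworthy
ALTER DATABASE [LocalDB] SET TRUSTWORTHY ON;
ALTER DATABASE [OtherDB] SET TRUSTWORTHY ON;

-- Finally, run the tSQLt.class.sql script supplied with the framework
-- in both [LocalDB] and [OtherDB].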

Then apply this script. In [OtherDB], the script drops and re-creates a table called [PositionArchive] and adds some data that we expect to remain untouched by any tests. In [LocalDB], it drops and re-creates an empty table called [Position], a view called [PositionArchiveView] and two stored procedures called [ArchivePositionUsingRemoteTable] and [ArchivePositionUsingLocalView], all of which we will come back to later. The [Position] and [PositionArchive] tables look like this:

CREATE TABLE [dbo].[Position]
(
  ValuationDate  date           NOT NULL
, AccountId      int            NOT NULL
, Ticker         varchar(16)    NOT NULL
, Quantity       decimal(18,4)  NOT NULL
, Price          decimal(18,6)  NOT NULL
, Valuation      decimal(28,4)  NOT NULL
, Currency       char(3)        NOT NULL
, CONSTRAINT PK_Position PRIMARY KEY (ValuationDate, AccountId, Ticker)
)
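The [PositionArchive] table in [OtherDB] has the same columns; a sketch of its definition, assuming it simply mirrors [Position] (the constraints in the real setup script may differ):

CREATE TABLE [dbo].[PositionArchive]
(
  ValuationDate  date           NOT NULL
, AccountId      int            NOT NULL
, Ticker         varchar(16)    NOT NULL
, Quantity       decimal(18,4)  NOT NULL
, Price          decimal(18,6)  NOT NULL
, Valuation      decimal(28,4)  NOT NULL
, Currency       char(3)        NOT NULL
, CONSTRAINT PK_PositionArchive PRIMARY KEY (ValuationDate, AccountId, Ticker)
)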

In all the tests, we want to do the following:

  • Mock both tables to isolate any dependencies
  • Add some data to [Position], some of which will be archived (based on ValuationDate), some of which won’t
  • Add some data to [PositionArchive] for the archive date under test to prove that rows don’t get archived twice
  • Record what we expect the archived data to look like on completion (see below)
  • Exercise the code (calling either [ArchivePositionUsingRemoteTable] or [ArchivePositionUsingLocalView])
  • Collect the results from the archive table
  • Assert that the actual results match the expected results

Just a brief note on the code we use to populate the [expected] table: I came across this post from Rob Farley (blog | twitter), which demonstrates another great use for the SQL Server 2008 VALUES clause as a quick and easy way of inserting test data.

SELECT *
INTO [ArchiveTests].[expected]
FROM
(
    VALUES('17-NOV-2011', 1, 'BARC', 100, 1.7145, 'GBP', 171.4500)
        , ('17-NOV-2011', 1, 'BP.', 500, 4.4575, 'GBP', 2228.7500)
        , ('17-NOV-2011', 1, 'GSK', 450, 14.4300, 'GBP', 6493.5000)
        , ('17-NOV-2011', 2, 'BARC', 500, 1.7145, 'GBP', 857.2500)
        , ('17-NOV-2011', 2, 'IAP', 1000, 3.348, 'GBP', 3348.0000)
        , ('17-NOV-2011', 2, 'RBS', 2000, 0.2000, 'GBP', 400.0000)
        , ('17-NOV-2011', 2, 'PRU', 5000, 6.095, 'GBP', 30475.0000)
        , ('17-NOV-2011', 3, 'BP.', 100, 4.4575, 'GBP', 445.7500)
        , ('17-NOV-2011', 3, 'ITV', 250, 0.618, 'GBP', 154.5000)
        , ('17-NOV-2011', 3, 'MKS', 250, 3.077, 'GBP', 769.2500)
        , ('17-NOV-2011', 3, 'SBRY', 250, 2.877, 'GBP', 719.2500)
) AS [x] (ValuationDate, AccountId, Ticker, Quantity, Price, Currency, Valuation)
;
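The last two steps in the list above, collecting the actual results and asserting against [expected], use tSQLt’s AssertEqualsTable. A sketch is shown below, assuming the tests live in a test class called [ArchiveTests] (as the error output later suggests); exactly where the actual rows are read from depends on which of the faking approaches discussed below is used, and this version reads the archive table via its three-part name:

-- Collect the actual results from the archive table into the test schema
SELECT ValuationDate, AccountId, Ticker, Quantity, Price, Currency, Valuation
INTO [ArchiveTests].[actual]
FROM [OtherDB].[dbo].[PositionArchive]
;

-- Assert that the archived rows match what we expected
EXEC tSQLt.AssertEqualsTable '[ArchiveTests].[expected]', '[ArchiveTests].[actual]';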

Ordinarily, if both source and destination tables are in the same database, I would mock each table like this: EXEC tSQLt.FakeTable '[dbo].[Position]'. So, as a first attempt, I might just try faking the remote table using a three-part name:

EXEC tSQLt.FakeTable '[dbo].[Position]';
EXEC tSQLt.FakeTable '[OtherDB].[dbo].[PositionArchive]';

This first example calls [ArchivePositionUsingRemoteTable], which inserts data from [Position] directly into [OtherDB].[dbo].[PositionArchive].
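The procedure itself isn’t shown in this post, but it presumably looks something like the sketch below, which simply mirrors the view-based version shown later while targeting the remote table directly (the real definition is in the setup script):

CREATE PROCEDURE [dbo].[ArchivePositionUsingRemoteTable]
(
  @DateToArchive date
)
AS
BEGIN
    -- Copy rows for the requested date into the archive database,
    -- skipping any that have already been archived
    INSERT [OtherDB].[dbo].[PositionArchive]
        (ValuationDate, AccountId, Ticker, Quantity, Price, Valuation, Currency)
    SELECT
          src.ValuationDate
        , src.AccountId
        , src.Ticker
        , src.Quantity
        , src.Price
        , src.Valuation
        , src.Currency
    FROM [dbo].[Position] AS [src]
    LEFT JOIN [OtherDB].[dbo].[PositionArchive] AS [tgt]
        ON tgt.ValuationDate = src.ValuationDate
        AND tgt.AccountId = src.AccountId
        AND tgt.Ticker = src.Ticker
    WHERE src.ValuationDate = @DateToArchive
    AND tgt.Ticker IS NULL
    ;
END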
As expected (and this is the whole point of this post), faking the remote table via a three-part name doesn’t work, resulting in the following error:

[ArchiveTests].[Test_ArchivePositionUsingRemoteTable_fake_table_from_local_db] failed: FakeTable could not resolve the object name, '[OtherDB].[dbo].[PositionArchive]'. Be sure to call FakeTable and pass in a single parameter, such as: EXEC tSQLt.FakeTable 'MySchema.MyTable'{FakeTable,23}

+----------------------+
|Test Execution Summary|
+----------------------+

|No|Test Case Name                                                                |Result|
+--+------------------------------------------------------------------------------+------+
|1 |[ArchiveTests].[Test_ArchivePositionUsingRemoteTable_fake_table_from_local_db]|Error |
-----------------------------------------------------------------------------
Msg 50000, Level 16, State 10, Line 1
Test Case Summary: 1 test case(s) executed, 0 succeeded, 0 failed, 1 errored.
-----------------------------------------------------------------------------

One solution, which will work, is to use a view as an interface between the stored procedure in [LocalDB] that does the work and the destination table in [OtherDB]. The view, created in [LocalDB] might look like this:

CREATE VIEW [dbo].[PositionArchiveView]
AS
    SELECT
          ValuationDate
        , AccountId
        , Ticker
        , Quantity
        , Price
        , Valuation
        , Currency
    FROM
        [OtherDB].[dbo].[PositionArchive]

And the stored procedure that does the work would look something like this:

CREATE PROCEDURE [dbo].[ArchivePositionUsingLocalView]
(
  @DateToArchive date
)
AS
BEGIN
    INSERT [dbo].[PositionArchiveView]
    (
      ValuationDate
    , AccountId
    , Ticker
    , Quantity
    , Price
    , Valuation
    , Currency
    )
    SELECT
          src.ValuationDate
        , src.AccountId
        , src.Ticker
        , src.Quantity
        , src.Price
        , src.Valuation
        , src.Currency
    -----------------------------------------------------------
    FROM [dbo].[Position] AS [src]
    -----------------------------------------------------------
    LEFT JOIN [dbo].[PositionArchiveView] AS [tgt]
        ON tgt.ValuationDate = src.ValuationDate
        AND tgt.AccountId = src.AccountId
        AND tgt.Ticker = src.Ticker
    -----------------------------------------------------------
    WHERE src.ValuationDate = @DateToArchive
    AND tgt.Ticker IS NULL
    ;
END

This is a nice, elegant solution; interface-based programming is an established and well-used pattern. It isolates the procedure that does the work from any cross-database dependency and provides more flexibility for the future, for example if the other database or the table schema changes.

Workaround Number One

So now, in my next attempt, I can use FakeTable to mock the view. This is a great little tSQLt feature wherein the view is temporarily renamed, and a user table called [PositionArchiveView] is created which mirrors the structure of the original view. Anything that referenced the view now references the table almost transparently.

One thing to be aware of when faking more complex views is what happens when SQL Server deems that the view is not updateable. Fortunately, tSQLt can deal with this and Sebastian Meine (blog | twitter), one of the authors of tSQLt, describes here how to fake such non-updateable views.

This is what my fake table calls look like in that test:

EXEC tSQLt.FakeTable '[dbo].[Position]';
EXEC tSQLt.FakeTable '[dbo].[PositionArchiveView]';

Using a view as an interface is simple enough, but you may recall that I have some two dozen individual steps to write tests for. Although some of those steps reference the same tables, this pattern still means writing an awful lot of views, so I started thinking about a different approach.

Workaround Number Two

In my final attempt at faking the archive table, I call FakeTable in [OtherDB] itself, like this:

EXEC tSQLt.FakeTable '[dbo].[Position]';
EXEC OtherDB.tSQLt.FakeTable '[dbo].[PositionArchive]';

This appears to work in the same way as if the table was being faked in the current database context, all nicely wrapped up in a transaction. It also answers the question you might have had earlier about why we were installing tSQLt into the archive database when it had no testable code.
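Pulling this together, a complete test using the second workaround might look something like the sketch below. The test name and the inserted values are illustrative only (the real tests also seed [PositionArchive] first, to prove that rows don’t get archived twice), and it uses tSQLt’s AssertEqualsTable for the final comparison:

CREATE PROCEDURE [ArchiveTests].[Test_ArchivePositionUsingRemoteTable_archives_expected_rows]
AS
BEGIN
    -- Mock both tables to isolate the test from any dependencies
    EXEC tSQLt.FakeTable '[dbo].[Position]';
    EXEC OtherDB.tSQLt.FakeTable '[dbo].[PositionArchive]';

    -- Arrange: one row that should be archived and one that should not
    INSERT [dbo].[Position] (ValuationDate, AccountId, Ticker, Quantity, Price, Valuation, Currency)
    VALUES ('17-NOV-2011', 1, 'BARC', 100, 1.7145, 171.4500, 'GBP')
         , ('18-NOV-2011', 1, 'BARC', 100, 1.7200, 172.0000, 'GBP');

    -- Record what we expect the archive to contain on completion
    SELECT *
    INTO [ArchiveTests].[expected]
    FROM (VALUES (CAST('17-NOV-2011' AS date), 1, 'BARC', 100, 1.7145, 171.4500, 'GBP')
         ) AS [x] (ValuationDate, AccountId, Ticker, Quantity, Price, Valuation, Currency);

    -- Act: exercise the code under test
    EXEC [dbo].[ArchivePositionUsingRemoteTable] @DateToArchive = '17-NOV-2011';

    -- Collect the actual results from the (faked) archive table
    SELECT ValuationDate, AccountId, Ticker, Quantity, Price, Valuation, Currency
    INTO [ArchiveTests].[actual]
    FROM [OtherDB].[dbo].[PositionArchive];

    -- Assert that the actual results match the expected results
    EXEC tSQLt.AssertEqualsTable '[ArchiveTests].[expected]', '[ArchiveTests].[actual]';
END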

Conclusions

Using a view as an interface in this scenario does work, and is elegant, but for a large number of tables it may seem like a lot of work for little or no additional benefit. One could almost argue that this is a case of the unit-testing tail wagging the working-code dog. Yes, having the views in place as an interface might provide more flexibility in the future, but does this make us guilty of non-agile “just in case” design? We still have a hard-coded database dependency; we’ve just moved it from the sproc to the view.

Calling FakeTable in the other database means writing less code, which in turn means I can get the product to market more quickly whilst still maintaining good test coverage. Yes, I have now introduced a database dependency into my tests, but given the nature of T-SQL, if we are referencing two databases we always end up with some hard-coded dependencies somewhere.

In conclusion, I think that either pattern is valid. Like so many things in our industry, the correct answer is “it depends”. If you want the flexibility and de-coupling that views offer, go with the interface pattern. If there are lots of tables whose views would amount to little more than SELECT * FROM OtherDB.dbo.TableName, and you’re prepared to accept more hard-coded database dependencies in exchange for writing less code, go with the remote call to FakeTable.
