Poor Execution Plan While Ripping Large XML in SQL Server 2008

A few days ago, I ran into a very frustrating problem while trying to rip a large XML document into a table-structure in T-SQL on 2008.  I am setting up a new procedure to process large files (Download Sample File) that we get from the market into our databases.  These files are each about 92K lines of XML that I need to get into a table-structure, for further processing.  So, I start with my normal X-Path style SQL:

DECLARE @priceDataString		VARCHAR(MAX),
		@priceDataXML			XML
 
SELECT
	@priceDataString = BulkColumn
FROM
	OPENROWSET(BULK'C:\Temp\EnergyPrices92KLines.xml', SINGLE_BLOB) fileData
 
SET @priceDataXML = @priceDataString
 
IF OBJECT_ID('TempDB..#extractedData') IS NOT NULL
		DROP TABLE [#extractedData]
 
CREATE TABLE [#extractedData]
	(
	[deliveryDate]			CHAR(10),
	[hourEnding]			CHAR(5),
	[settlementPoint]		VARCHAR(50),
	[price]					DECIMAL(20,13),
	[dstFlag]				CHAR(1)
	)
 
INSERT INTO
	[#extractedData]
	(
	[deliveryDate],
	[hourEnding],
	[settlementPoint],
	[price],
	[dstFlag]
	)
SELECT
	ref.value('DeliveryDate[1]', 'CHAR(10)') AS deliveryDate,
	ref.value('HourEnding[1]', 'CHAR(5)') AS hourEnding,
	ref.value('SettlementPoint[1]', 'VARCHAR(50)') AS settlementPoint,
	ref.value('SettlementPointPrice[1]', 'DECIMAL(20,13)') AS price,
	ref.value('DSTFlag[1]', 'CHAR(1)') AS dstFlag
FROM
	@priceDataXML.nodes ('/EnergyPrices/EnergyPrice') R(ref)

Normally, in my experience, this type of code would execute in a few seconds, making it an efficient way to rip through XML.  This time was different though.  This time, the clock kept ticking, after playing with it a few times, assuming I had done something dreadfully wrong, I finally just let it go all the way until it finished.  It took 3 hours to complete this query on my work desktop.  I’m sure it is clear that this is unacceptable.  So, I start troubleshooting.

My first step in troubleshooting was to remove the “INSERT INTO [#extractedData]” clause, just to see how slowly it is returning each row.  To my surprise, the entire process returns in 3 seconds.  That is quite a change from 3 hours to 3 seconds, so I at least know that the problem is that a bad plan is being chosen due to the insert.  I’m confusing the optimizer somehow.

Now, I know I could have just went into SSIS, or had the developers rip the XML in C# before passing it on to me, but this SHOULD work, and I wanted to know why it wasn’t. So, I jump on Twitter and send out a “#sqlhelp” for anyone who can help me think this one through.  Right away, Paul Randal (Blog | Twitter), Jonathan Kehayias (Blog | Twitter), Robert Davis (Blog | Twitter) all jump in to try to help me sort out why this strange behavior is occurring.  They each gave me some great suggestions to try, but nothing was working.  Jonathan tried out my exact code, to try to replicate my problem, but it wasn’t working the same way for him, his was returning in just a few seconds.

At this point, I was really confused, we had tried every thing we could think of, and could only surmise that it was something with my configuration that was causing the problem.  But, I went ahead and tried my exact query on one of my 2008 R2 instances and it worked perfectly!  Just like on Jonathan’s server.  This made me a bit relieved, since it didn’t seem to be my code that was bad, but this left me even more baffled as to why the optimizer would interpret my code so poorly on 2008.  Jonathan is opening a case for this and I will post a link to that once he has done so.

Ultimately, Jonathan had another suggestion to try, one that made the query perform optimally on 2008 as well, I will post that here, in case anyone else runs into the same problem:

DECLARE @priceDataString		VARCHAR(MAX),
		@priceDataXML			XML
 
SELECT
	@priceDataString = BulkColumn
FROM
	OPENROWSET(BULK'C:\Temp\EnergyPrices92KLines.xml', SINGLE_BLOB) fileData
 
SET @priceDataXML = @priceDataString
 
IF OBJECT_ID('TempDB..#sourceData') IS NOT NULL
	DROP TABLE [#sourceData]
 
CREATE TABLE [#sourceData]
	(
	[xmlData] XML
	)
 
INSERT INTO
	[#sourceData]
	(
	[xmlData]
	)
VALUES
	(
	@priceDataXML
	)
 
IF OBJECT_ID('TempDB..#extractedData') IS NOT NULL
	DROP TABLE [#extractedData]
 
CREATE TABLE [#extractedData]
	(
	[deliveryDate]			CHAR(10),
	[hourEnding]			CHAR(5),
	[settlementPoint]		VARCHAR(50),
	[price]					DECIMAL(20,13),
	[dstFlag]				CHAR(1)
	)
 
INSERT INTO
	[#extractedData]
	(
	[deliveryDate],
	[hourEnding],
	[settlementPoint],
	[price],
	[dstFlag]
	)
SELECT
	ref.value('(EnergyPrice/DeliveryDate)[1]', 'CHAR(10)') AS deliveryDate,
	ref.value('(EnergyPrice/HourEnding)[1]', 'CHAR(5)') AS hourEnding,
	ref.value('(EnergyPrice/SettlementPoint)[1]', 'VARCHAR(50)') AS settlementPoint,
	ref.value('(EnergyPrice/SettlementPointPrice)[1]', 'DECIMAL(20,13)') AS price,
	ref.value('(EnergyPrice/DSTFlag)[1]', 'CHAR(1)') AS dstFlag
FROM
	(
	SELECT
		ref.query('.') AS ref
	FROM
		[#sourceData]
 
			CROSS APPLY [xmlData].nodes ('/EnergyPrices/EnergyPrice') R(ref)
	) AS SourceData

Ripping XML in T-SQL can be a huge chore sometimes, but it makes it even worse when the database engine is not behaving the way that you expect it to.  I hope this post can save someone else from the headaches I had with this sort of code.  Huge thanks to Jonathan for his continued help on troubleshooting this annoying problem.

Precision in Date/Time Data Types

There is a general misconception, that is common across many different programming/scripting languages, regarding Date/Time data types, their precision, and how this differs in theory versus practice.  For the sake of this article, I will be discussing the following data types: DATETIME in T-SQL, DATETIME2(7) in T-SQL, and DateTime in C#.NET.

Data Type Precision (Theoretical Precision)

Data Type Theoretical Precision
SQL DATETIME 3.33 ms
SQL DATETIME2(7) 100 ns
C# DateTime 100 ns

When viewing the information presented above, the natural assumption is that each of these data types will give you a unique value at intervals of their precision.  For example, if you were using a SQL DATETIME and queried the time every 1 ms, you would expect to see a unique value roughly once out of every three queries.  The same would be expected of the higher precision data types, should they be queried on relatively similar timelines.

This, unfortunately, is not how Date/Time data types work.  This theoretical precision is representative only of what the data type structure is physically capable of holding, not of what the variable will ever practically contain.

Operating System Precision (Practical Precision)

The important factor that most programmers forget is that every program/query/script is running in the context of some Operating System and these Operating Systems are further running in the context of a processor and RAM.  The processor, while very fast, is consumed with many different actions happening to handle multitasking loads, and thus cannot dedicate a large percentage of its cycles to updating a clock.  The way processors get around this issue is by updating a portion of memory with the current date and time, at some predetermined interval.  The Operating System exposes this data through any one of several counters.  When a program requests the current time from the OS, the OS gives the program the latest reading of the “current” time.  In this way, the counter can be said to be highly, but not perfectly accurate, and highly precise to within some threshold (which we will see next).

When a program requests the current date and time from the operating system, it does so through a built-in function/method.  SQL DATETIME can get its data from GETDATE(), SQL DATETIME2(7) should get its data from SYSTEMDATETIME(), and C# DateTime can get its data from DateTime.Now.  GETDATE() hits an OS counter that has a precision of approximately 15 ms, while SYSTEMDATETIME() and DateTime.Now hit an OS counter that has a precision of approximately 1 ms.

Q.E.D., when you call GETDATE(), you will receive a data value that, while capable of holding a time down to a precision of 3.33 ms, will actually only be precise to about 15 ms (in Windows 2000).  In our previous example of querying every 1 ms, you would only see a unique value roughly once out of every 15 queries, as opposed to the 3 we previously would have assumed.

Demonstration

In order to see the effect we have discussed, the following code will capture the number of times we saw unique values and the average time between unique values.

-- SQL GETDATE
DECLARE @TimeStart DATETIME
DECLARE @Time DATETIME
DECLARE @TimeEnd DATETIME
DECLARE @I INT
DECLARE @Count INT
 
SET @I = 0
SET @Count = 0
SET @TimeStart = GETDATE()
SET @Time = @TimeStart
 
WHILE @I < 10000000
      BEGIN
            SET @TimeEnd = GETDATE()
            IF @TimeEnd != @Time
                  BEGIN
                        SET @Count = @Count + 1
                        SET @Time = @TimeEnd
                  END
            SET @I = @I + 1
      END
PRINT CAST(@Count AS VARCHAR) + ' unique values'
PRINT CAST(DATEDIFF(millisecond, @TimeStart, @TimeEnd) / CAST(@Count AS REAL) AS VARCHAR) + ' milliseconds'
 -- SQL SYSTEMDATETIME
DECLARE @TimeStart DATETIME2
DECLARE @Time DATETIME2
DECLARE @TimeEnd DATETIME2
DECLARE @I INT
DECLARE @Count INT
 
SET @I = 0
SET @Count = 0
SET @TimeStart = SYSDATETIME()
SET @Time = @TimeStart
 
WHILE @I < 10000000
      BEGIN
            SET @TimeEnd = SYSDATETIME()
            IF @TimeEnd != @Time
                  BEGIN
                        SET @Count = @Count + 1
                        SET @Time = @TimeEnd
                  END
            SET @I = @I + 1
      END
 
PRINT CAST(@Count AS VARCHAR) + ' unique values'
PRINT CAST(DATEDIFF(microsecond, @TimeStart, @TimeEnd) / CAST(@Count AS REAL) AS VARCHAR) + ' microseconds'
// C# DateTime
DateTime timeStart;
DateTime time;
DateTime timeEnd = DateTime.Now;
 
int i = 0;
int count = 0;
timeStart = DateTime.Now;
time = timeStart;
 
while(i < 10000000)
{
       timeEnd = DateTime.Now;
       if(timeEnd != time)
       {
              count++;
              time = timeEnd;
       }
       i++;
}
 
Console.WriteLine("{0} unique values",count);
Console.WriteLine("{0} microseconds", ((timeEnd.Ticks - timeStart.Ticks) / count) / (TimeSpan.TicksPerMillisecond / 1000));

From executing these I found the following results:

Data Type Theoretical Precision Practical Precision
SQL DATETIME (on Win2000) 3.33 ms 15.5556 ms
SQL DATETIME (on Win2K3) 3.33 ms 3.33333 ms
SQL DATETIME2(7) 100 ns 1003.71 µs
C# DateTime 100 ns 1000.21 µs

From these results, we can discern a couple things.  First, the counter that SQL Server accesses in Windows 2000 has a practical precision of approximately 15 ms, while the counter that SQL Server accesses in Windows 2003 (and beyond) has a practical precision of approximately 3.33 ms, which is basically the same as the theoretical precision.  Secondly, In both the case of SQL DATETIME2(7) and C# Datetime, the practical precision is 1 ms, which falls woefully short of the 100 ns theoretical precision.

Conclusion

Developers should be aware of what their data types and functions are doing under the surface.  In the case of SQL DATETIME2 and C# DateTime, developers must keep in mind that the true precision of the system date and time retrieved through those data types only has a precision of 1 ms.

A Few Changes in SQL Server 2012 (DBCC LOGINFO and DBCC CHECKDB)

Last week I put my first SQL Server 2012 instance into production, and therefore, started applying some of my Production scripts and jobs to that server. Upon doing so, I quickly noticed that two of my jobs were getting errors, two jobs that had been solid for years on other servers, so I took a look.

DBCC LOGINFO

First, on DBCC LOGINFO, the SQL team added a single field: RecoveryUnitId. I have tried to discover what this field is for, but have had no luck as of yet. Regardless, it means that my Check VLF Density script will have to change. This is a script that I run across all of my instances occasionally, but specifically whenever we have a DR site outage, to make sure that my Transaction Logs are not getting too full (due to their SQL Mirror Failover Partners being inaccessible). I have included the script below, it is not fancy, but it gets the job done. Note: It has a threshold set to only display databases whose VLF “density” is above 60%, this is configurable at the top of the script.

-- Only shows databases with a VLF density greater than 60% (configurable via @minimumDensity)
 
DECLARE @minimumDensity INT
SET @minimumDensity = 60
 
DECLARE @databaseList TABLE
(
    [database]            VARCHAR(256),
    [executionOrder]    [int] IDENTITY(1,1) NOT NULL
)
 
DECLARE @vlfDensity    TABLE
(
    [server]        VARCHAR(256),
    [database]        VARCHAR(256),
    [density]        DECIMAL(7,2),
    [unusedVLF]        INT,
    [usedVLF]        INT,
    [totalVLF]        INT
)
 
-- Starting in SQL Server 2012, an extra column was added to LogInfo, named RecoveryUnitId
-- Need to accomodate SQL Server 2012 (version 11.0)
DECLARE @versionString            VARCHAR(20),
        @serverVersion            DECIMAL(10,5),
        @sqlServer2012Version    DECIMAL(10,5)
 
SET        @versionString    = CAST(SERVERPROPERTY('productversion') AS VARCHAR(20))
SET        @serverVersion = CAST(LEFT(@versionString,CHARINDEX('.', @versionString)) AS DECIMAL(10,5))
SET        @sqlServer2012Version = 11.0 -- SQL Server 2012
 
IF(@serverVersion &gt;= @sqlServer2012Version)
    BEGIN
        -- Use the new version of the table  
        DECLARE @logInfoResult2012 TABLE
            (
            [RecoveryUnitId]    INT NULL,
            [FileId]            INT NULL,
            [FileSize]            BIGINT NULL,
            [StartOffset]        BIGINT NULL,
            [FSeqNo]            INT NULL,
            [Status]            INT NULL,
            [Parity]            TINYINT NULL,
            [CreateLSN]            NUMERIC(25, 0) NULL
            )
    END  
ELSE  
    BEGIN
        -- Use the old version of the table
        DECLARE @logInfoResult2008 TABLE
            (
            [FileId]            INT NULL,
            [FileSize]            BIGINT NULL,
            [StartOffset]        BIGINT NULL,
            [FSeqNo]            INT NULL,
            [Status]            INT NULL,
            [Parity]            TINYINT NULL,
            [CreateLSN]            NUMERIC(25, 0) NULL
            )
 
    END
 
DECLARE
    @currentDatabaseID    INT,
    @maxDatabaseID        INT,
    @dbName                VARCHAR(256),
    @density            DECIMAL(7,2),
    @unusedVLF            INT,
    @usedVLF            INT,
    @totalVLF            INT
 
INSERT INTO
    @databaseList
    (
    [database]
    )
SELECT
    [name]
FROM
    [sys].[sysdatabases]
 
SELECT
    @currentDatabaseID = MIN([executionOrder]),
    @maxDatabaseID = MAX([executionOrder])
FROM
    @databaseList
 
WHILE @currentDatabaseID &lt;= @maxDatabaseID BEGIN SELECT @dbName = [database] FROM @databaseList WHERE [executionOrder] = @currentDatabaseID IF(@serverVersion &gt;= @sqlServer2012Version)
            BEGIN
                -- Use the new version of the table 
                DELETE
                    @logInfoResult2012
                FROM
                    @logInfoResult2012
 
                INSERT INTO
                    @logInfoResult2012
                EXEC('DBCC LOGINFO([' + @dbName + '])')
 
                SELECT
                    @unusedVLF = COUNT(*)
                FROM
                    @logInfoResult2012
                WHERE
                    [Status] = 0
 
                SELECT
                    @usedVLF = COUNT(*)
                FROM
                    @logInfoResult2012
                WHERE
                    [Status] = 2
 
                SELECT
                    @totalVLF = COUNT(*)
                FROM
                    @logInfoResult2012
            END          
        ELSE
            BEGIN
                -- Use the old version of the table 
                DELETE
                    @logInfoResult2008
                FROM
                    @logInfoResult2008
 
                INSERT INTO
                    @logInfoResult2008
                EXEC('DBCC LOGINFO([' + @dbName + '])')
 
                SELECT
                    @unusedVLF = COUNT(*)
                FROM
                    @logInfoResult2008
                WHERE
                    [Status] = 0
 
                SELECT
                    @usedVLF = COUNT(*)
                FROM
                    @logInfoResult2008
                WHERE
                    [Status] = 2
 
                SELECT
                    @totalVLF = COUNT(*)
                FROM
                    @logInfoResult2008
            END      
 
        SELECT
            @density = CONVERT(DECIMAL(7,2),@usedVLF) / CONVERT(DECIMAL(7,2),@totalVLF) * 100
 
        INSERT INTO
            @vlfDensity
            (
            [server],
            [database],
            [density],
            [unusedVLF],
            [usedVLF],
            [totalVLF]
            )
        VALUES
            (
            @@SERVERNAME,
            @dbName,
            @density,
            @unusedVLF,
            @usedVLF,
            @totalVLF
            )
 
        SET @currentDatabaseID = @currentDatabaseID + 1
    END
 
SELECT
    [server],
    [database],
    [density],
    [unusedVLF],
    [usedVLF],
    [totalVLF]
FROM
    @vlfDensity
WHERE
    [density] &gt; @minimumDensity
ORDER BY
    [density] DESC

DBCC CHECKDB

Secondly, on DBCC CHECKDB WITH TABLERESULTS, the SQL team added several columns and replaced a couple others. This required me to change a critical stored procedure that I use regularly to check those databases that, for a multitude of reasons, do not fit into my normal Backup –> Restore Elsewhere –> DBCC CHECKDB –> DELETE methodology (outlined here). This stored procedure is called by a SQL Agent Job and sends me e-mails with the results of its findings. I will also need to make this change to my other method, but I will be moving that process to a different server and will probably do that change at that time (since I cannot restore my 2012 databases to that 2008 R2 instance anyway).

Below, I have listed the stored procedure that I have on each database instance. Note: If you want to use this script, you would need to make sure you have DBMail Setup and Configured, and then supply the appropriate DBMail Profile and Recipient e-mail address(es) to the script via two variables declared at the top of the stored procedure. Furthermore, to narrow down the databases against which the procedure is ran, you can change the query that fills the @databasesToAnalyze table variable. Both of these sections are called out in comment lines at the beginning of the stored procedure.

Lastly, I would like to thank Robert Davis (Blog | Twitter) and Roji Thomas (Blog | Twitter) for helping me find a couple columns I left out, sometimes a second or third pair of eyes can really help. This is yet another reason why the SQL Twitter community (especially via the #sqlhelp hashtag) is so valuable.

USE [master]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- EXEC [administrator].[usp_DBCCCheckDB_SpecialDatabases]
-- =============================================
-- Author:        Brad Hoff
-- Create date: 2010_05_23
-- Change Log:    2012_08_29 - Brad Hoff - Changing sproc to accomodate changes in SQL Server 2012 DBCC CHECKDB command
--
-- Description:    Performs DBCC CheckDB against a subset of the databases
--                in the current instance.  This is to cover the databases
--                which cannot be checked via the normal weekly offload
--                process
-- =============================================
ALTER PROCEDURE [administrator].[usp_DBCCCheckDB_SpecialDatabases]
 
AS
BEGIN
    SET NOCOUNT ON;
 
    DECLARE    @currentDatabaseIdentifier    INT,
            @maxDatabaseIdentifier        INT,
            @currentDatabaseName        VARCHAR(256),
            @instanceName                VARCHAR(100),
            @statusMsg                    VARCHAR(MAX),
            @statusCode                    BIT,
            @msgSubject                    VARCHAR(200),
            @recipients                    NVARCHAR(MAX),
            @dbMailProfile                VARCHAR(64)
 
------------------------------------------------------------
 --=======================================================---
 --========================================================--
 ------------------------------------------------------------
    SET    @recipients = 'ENTER YOUR E-MAIL ADDRESSES HERE, SEPERATED BY SEMICOLONS'
    SET    @dbMailProfile = 'ENTER YOUR DBMail PROFILE HERE, DBMAIL IS REQUIRED FOR THIS SPROC'
------------------------------------------------------------
 --=======================================================---
 --========================================================--
 ------------------------------------------------------------
    SET @instanceName = @@SERVERNAME
    SET    @statusMsg = 'DBCC CheckDB Process is starting for select special databases on ' + @instanceName + ' at ' + CONVERT(VARCHAR,GETDATE(),120) + '
 
'
    SET    @statusCode = 0
 
    -- Starting in SQL Server 2012, several columns changed for the DBCC Check DB With Table Results command
    -- Need to accomodate SQL Server 2012 (version 11.0)
    DECLARE @versionString            VARCHAR(20),
            @serverVersion            DECIMAL(10,5),
            @sqlServer2012Version    DECIMAL(10,5)
 
    SET        @versionString    = CAST(SERVERPROPERTY('productversion') AS VARCHAR(20))
    SET        @serverVersion = CAST(LEFT(@versionString,CHARINDEX('.', @versionString)) AS DECIMAL(10,5))
    SET        @sqlServer2012Version = 11.0 -- SQL Server 2012
 
    DECLARE    @databasesToAnalyze    TABLE
    (
    [identifier]    INT    IDENTITY(1,1)    NOT NULL,
    [databaseName]    VARCHAR(256)        NOT NULL
    )
 
------------------------------------------------------------
--=======================================================---
--========================================================--
------------------------------------------------------------
-- REPLACE THIS SECTION WITH WHATEVER LOGIC YOU NEED FOR YOUR ENVIRONMENT
-- BY DEFAULT, THIS SCRIPT JUST CHECKS ALL DATABASES OTHER THAN TEMPDB
 
    INSERT INTO
        @databasesToAnalyze
        (
        [databaseName]
        )
    SELECT
        [name]
    FROM
        [sys].[databases]
    WHERE
        [name] NOT IN ('tempdb')
 
 -- END DATABASE SELECTION AREA
 ------------------------------------------------------------
 --=======================================================---
 --========================================================--
 ------------------------------------------------------------
 
    IF NOT EXISTS (
                    SELECT
                        *
                    FROM
                        [sys].[tables]
                    WHERE
                        [name] = 'tbl_CheckDBResults'
                    )
        BEGIN
            IF(@serverVersion &gt;= @sqlServer2012Version)
                BEGIN
                    CREATE TABLE [dbo].[tbl_CheckDBResults]
                    (
                    [ServerName] VARCHAR(100) NULL,
                    [DatabaseName] VARCHAR(256) NULL,
                    [Error] INT NULL,
                    [Level] INT NULL,
                    [State] INT NULL,
                    [MessageText] VARCHAR(7000) NULL,
                    [RepairLevel] INT NULL,
                    [Status] INT NULL,
                    [DbId] INT NULL,
                    [DbFragId] INT NULL,
                    [ObjectId] INT NULL,
                    [IndexId] INT NULL,
                    [PartitionId] INT NULL,
                    [AllocUnitId] INT NULL,
                    [RidDbld] INT NULL,
                    [RidPruId] INT NULL,
                    [File] INT NULL,
                    [Page] INT NULL,
                    [Slot] INT NULL,
                    [RefDBId] INT NULL,
                    [RefPruId] INT NULL,
                    [RefFile] INT NULL,
                    [RefPage] INT NULL,
                    [RefSlot] INT NULL,
                    [Allocation] INT NULL,
                    [insert_date] DATETIME NULL
                    ) ON [PRIMARY]    
                END
            ELSE
                BEGIN
                    CREATE TABLE [dbo].[tbl_CheckDBResults]
                    (
                    [ServerName] VARCHAR(100) NULL,
                    [DatabaseName] VARCHAR(256) NULL,
                    [Error] INT NULL,
                    [Level] INT NULL,
                    [State] INT NULL,
                    [MessageText] VARCHAR(7000) NULL,
                    [RepairLevel] INT NULL,
                    [Status] INT NULL,
                    [DbId] INT NULL,
                    [Id] INT NULL,
                    [IndId] INT NULL,
                    [PartitionId] INT NULL,
                    [AllocUnitId] INT NULL,
                    [File] INT NULL,
                    [Page] INT NULL,
                    [Slot] INT NULL,
                    [RefFile] INT NULL,
                    [RefPage] INT NULL,
                    [RefSlot] INT NULL,
                    [Allocation] INT NULL,
                    [insert_date] DATETIME NULL
                    ) ON [PRIMARY]    
                END    
        END
 
    -- Setup default for Insert_Date field
    IF NOT EXISTS (SELECT [name] FROM [sys].[default_constraints] WHERE [name] = 'DF_CheckDBResult_insert_date')
        ALTER TABLE [dbo].[tbl_CheckDBResults] ADD  CONSTRAINT [DF_CheckDBResult_insert_date]  DEFAULT (getdate()) FOR [insert_date]
 
    IF NOT EXISTS (
                    SELECT
                        *
                    FROM
                        [sys].[tables]
                    WHERE
                        [name] = 'tbl_CheckDBResultSummary'
                    )
        BEGIN
            IF(@serverVersion &gt;= @sqlServer2012Version)
                BEGIN
                    CREATE TABLE [dbo].[tbl_CheckDBResultSummary]
                    (
                    [ServerName] VARCHAR(100) NULL,
                    [DatabaseName] VARCHAR(256) NULL,
                    [Error] INT NULL,
                    [Level] INT NULL,
                    [State] INT NULL,
                    [MessageText] VARCHAR(7000) NULL,
                    [RepairLevel] INT NULL,
                    [Status] INT NULL,
                    [DbId] INT NULL,
                    [DbFragId] INT NULL,
                    [ObjectId] INT NULL,
                    [IndexId] INT NULL,
                    [PartitionId] INT NULL,
                    [AllocUnitId] INT NULL,
                    [RidDbld] INT NULL,
                    [RidPruId] INT NULL,
                    [File] INT NULL,
                    [Page] INT NULL,
                    [Slot] INT NULL,
                    [RefDBId] INT NULL,
                    [RefPruId] INT NULL,
                    [RefFile] INT NULL,
                    [RefPage] INT NULL,
                    [RefSlot] INT NULL,
                    [Allocation] INT NULL,
                    [insert_date] DATETIME NULL
                    ) ON [PRIMARY]
                END
            ELSE
                BEGIN
                    CREATE TABLE [dbo].[tbl_CheckDBResultSummary]
                    (
                    [ServerName] VARCHAR(100) NULL,
                    [DatabaseName] VARCHAR(256) NULL,
                    [Error] INT NULL,
                    [Level] INT NULL,
                    [State] INT NULL,
                    [MessageText] VARCHAR(7000) NULL,
                    [RepairLevel] INT NULL,
                    [Status] INT NULL,
                    [DbId] INT NULL,
                    [Id] INT NULL,
                    [IndId] INT NULL,
                    [PartitionId] INT NULL,
                    [AllocUnitId] INT NULL,
                    [File] INT NULL,
                    [Page] INT NULL,
                    [Slot] INT NULL,
                    [RefFile] INT NULL,
                    [RefPage] INT NULL,
                    [RefSlot] INT NULL,
                    [Allocation] INT NULL,
                    [insert_date] DATETIME NULL
                    ) ON [PRIMARY]
                END
        END
 
    SELECT
        @maxDatabaseIdentifier = MAX([identifier]),
        @currentDatabaseIdentifier = MIN ([identifier])
    FROM
        @databasesToAnalyze
 
    WHILE (@currentDatabaseIdentifier &lt;= @maxDatabaseIdentifier) BEGIN SELECT @currentDatabaseName = [databaseName] FROM @databasesToAnalyze WHERE [identifier] = @currentDatabaseIdentifier DELETE [dbo].[tbl_CheckDBResults] FROM [dbo].[tbl_CheckDBResults] WHERE [ServerName] = @instanceName AND [DatabaseName] = @currentDatabaseName DELETE [dbo].[tbl_CheckDBResultSummary] FROM [dbo].[tbl_CheckDBResultSummary] WHERE [ServerName] = @instanceName AND [DatabaseName] = @currentDatabaseName -- Start DBCC Check DB IF(@serverVersion &gt;= @sqlServer2012Version)
                BEGIN
                    EXEC('
                    INSERT INTO
                        [dbo].[tbl_CheckDBResults]
                        (
                        [Error],
                        [Level],
                        [State],
                        [MessageText],
                        [RepairLevel],
                        [Status],
                        [DbId],
                        [DbFragId],
                        [ObjectId],
                        [IndexId],
                        [PartitionId],
                        [AllocUnitId],
                        [RidDbld],
                        [RidPruId],
                        [File],
                        [Page],
                        [Slot],
                        [RefDBId],
                        [RefPruId],
                        [RefFile],
                        [RefPage],
                        [RefSlot],
                        [Allocation]
                        )
                    EXEC(''DBCC CHECKDB(''''' + @currentDatabaseName + ''''') WITH TABLERESULTS'')')
                END
            ELSE
                BEGIN
                    EXEC('
                    INSERT INTO
                        [dbo].[tbl_CheckDBResults]
                        (
                        [Error],
                        [Level],
                        [State],
                        [MessageText],
                        [RepairLevel],
                        [Status],
                        [DbId],
                        [Id],
                        [IndId],
                        [PartitionId],
                        [AllocUnitId],
                        [File],
                        [Page],
                        [Slot],
                        [RefFile],
                        [RefPage],
                        [RefSlot],
                        [Allocation]
                        )
                    EXEC(''DBCC CHECKDB(''''' + @currentDatabaseName + ''''') WITH TABLERESULTS'')')
                END
 
            UPDATE
                [dbo].[tbl_CheckDBResults]
            SET
                [ServerName] = @instanceName,
                [DatabaseName] = @currentDatabaseName
            WHERE
                [ServerName] IS NULL
 
            INSERT INTO
                [dbo].[tbl_CheckDBResultSummary]
            SELECT
                *
            FROM
                [dbo].[tbl_CheckDBResults]
            WHERE
                    [ServerName] = @instanceName
                AND    [DatabaseName] = @currentDatabaseName
                AND [MessageText] LIKE 'CHECKDB%'
 
            -- Start Analysis
            IF (@currentDatabaseName = 'master')
                BEGIN
                    IF EXISTS (
                                SELECT
                                    *
                                FROM
                                    [dbo].[tbl_CheckDBResultSummary]
                                WHERE
                                        [ServerName] = @instanceName
                                    AND    [DatabaseName] = @currentDatabaseName
                                    AND [MessageText] LIKE 'CHECKDB found 0 allocation errors and 0 consistency errors in database ''master''%'
 
                                )
                        BEGIN
                            SET    @statusMsg = @statusMsg + 'DBCC FOR ' + @currentDatabaseName + ' Passed.
'
                            -- if successful, we need to update the extended property at the actual database
                            EXEC('EXEC [master].[administrator].[usp_LastDBCCCheckDB_Set] [' + @currentDatabaseName + ']')
                        END    -- Condition: A passing entry for this DB in DBCC Summary
                    ELSE IF EXISTS (
                                    SELECT
                                        [ServerName]
                                    FROM
                                        [dbo].[tbl_CheckDBResultSummary]
                                    WHERE
                                            [ServerName] = @instanceName
                                        AND    [DatabaseName] = @currentDatabaseName
                                    )
                        BEGIN
                            SET    @statusMsg = @statusMsg + 'DBCC FOR ' + @currentDatabaseName + ' Failed! (Check the tbl_CheckDBResults table)
'
                            SET @statusCode = 1
                        END    -- Condition: No passing entry for this DB in DBCC Summary
                    ELSE
                        BEGIN
                            SET    @statusMsg = @statusMsg + 'DBCC FOR ' + @currentDatabaseName + ' was not properly performed (Check Configuration)
'
                            SET @statusCode = 1
                        END    -- Condition: No entry whatsoever for this DB in DBCC Summary        
 
                    IF EXISTS (
                                SELECT
                                    *
                                FROM
                                    [dbo].[tbl_CheckDBResultSummary]
                                WHERE
                                        [ServerName] = @instanceName
                                    AND    [DatabaseName] = @currentDatabaseName
                                    AND [MessageText] LIKE 'CHECKDB found 0 allocation errors and 0 consistency errors in database ''mssqlsystemresource''%'
 
                                )
                        BEGIN
                            SET    @statusMsg = @statusMsg + 'DBCC FOR ' + 'mssqlsystemresource' + ' Passed.
'
                            -- if successful, we need to update the extended property at the actual database
                            EXEC('EXEC [master].[administrator].[usp_LastDBCCCheckDB_Set] [' + @currentDatabaseName + ']')
                        END    -- Condition: A passing entry for this DB in DBCC Summary
                    ELSE IF EXISTS (
                                    SELECT
                                        [ServerName]
                                    FROM
                                        [dbo].[tbl_CheckDBResultSummary]
                                    WHERE
                                            [ServerName] = @instanceName
                                        AND    [DatabaseName] = @currentDatabaseName
                                    )
                        BEGIN
                            SET    @statusMsg = @statusMsg + 'DBCC FOR ' + 'mssqlsystemresource' + ' Failed! (Check the tbl_CheckDBResults table)
'
                            SET @statusCode = 1
                        END    -- Condition: No passing entry for this DB in DBCC Summary
                    ELSE
                        BEGIN
                            SET    @statusMsg = @statusMsg + 'DBCC FOR ' + 'mssqlsystemresource' + ' was not properly performed (Check Configuration)
'
                            SET @statusCode = 1
                        END    -- Condition: No entry whatsoever for this DB in DBCC Summary                            
                END
            ELSE
                BEGIN
                    IF EXISTS (
                                SELECT
                                    *
                                FROM
                                    [dbo].[tbl_CheckDBResultSummary]
                                WHERE
                                        [ServerName] = @instanceName
                                    AND    [DatabaseName] = @currentDatabaseName
                                    AND [MessageText] LIKE 'CHECKDB found 0 allocation errors and 0 consistency errors in database %'
                                )
                        BEGIN
                            SET    @statusMsg = @statusMsg + 'DBCC FOR ' + @currentDatabaseName + ' Passed.
'
                            -- if successful, we need to update the extended property at the actual database
                            -- I am removing this for use by outside DBAs
                            -- EXEC('EXEC [master].[administrator].[usp_LastDBCCCheckDB_Set] [' + @currentDatabaseName + ']')
                        END    -- Condition: A passing entry for this DB in DBCC Summary
                    ELSE IF EXISTS (
                                    SELECT
                                        [ServerName]
                                    FROM
                                        [dbo].[tbl_CheckDBResultSummary]
                                    WHERE
                                            [ServerName] = @instanceName
                                        AND    [DatabaseName] = @currentDatabaseName
                                    )
                        BEGIN
                            SET    @statusMsg = @statusMsg + 'DBCC FOR ' + @currentDatabaseName + ' Failed! (Check the tbl_CheckDBResults table)
'
                            SET @statusCode = 1
                        END    -- Condition: No passing entry for this DB in DBCC Summary
                    ELSE
                        BEGIN
                            SET    @statusMsg = @statusMsg + 'DBCC FOR ' + @currentDatabaseName + ' was not properly performed (Check Configuration)
'
                            SET @statusCode = 1
                        END    -- Condition: No entry whatsoever for this DB in DBCC Summary
                END
 
            SET @currentDatabaseIdentifier = @currentDatabaseIdentifier + 1
        END
 
    SET    @statusMsg = @statusMsg + '
DBCC CheckDB Process is starting for select special databases on ' + @instanceName + ' at ' + CONVERT(VARCHAR,GETDATE(),120) + '
'
 
    IF @statusCode = 0
        BEGIN
            SET @msgSubject = 'SUCCESS - DBCC CheckDB for special databases on ' + @instanceName
        END    -- Condition: There were no errors or failures in the consistency checking of this instance
    ELSE
        BEGIN
            SET @msgSubject = 'FAILURE - DBCC CheckDB for special databases on ' + @instanceName
        END    -- Condition: At least one consistency check either failed or resulted in an error
 
    EXEC [msdb].[dbo].[sp_send_dbmail]
        @profile_name = @dbMailProfile, 
        @recipients = @recipients,
        @body = @statusMsg,
        @subject = @msgSubject,
        @body_format = 'HTML';
 
END

I hope these scripts and information can be a help to someone.

Check Your BIOS Settings

In March, I had just received three new servers for a new cluster (Dell R910s) I was creating for our flagship application’s database.  I had just gotten the cluster setup and had moved a copy of our Production database to the cluster to test it out before we migrated to it officially.

The next week I was sitting in the SQLskills Immersion Event in Tampa when I get a page and e-mail with a Sev 24, Error 832.  This error can mean something truly bad has happened, that the memory has corrupted the data while in the Buffer Cache.

I kicked off a CheckDB to see if the corruption was anywhere else in the database, but since I was sitting in the classroom with Paul Randal (Blog | Twitter), the corruption guru himself, I shot him an e-mail letting him know what I was seeing and asking if he had any suggestions.  He suggested that the corruption probably wasn’t on disk, since the 832 specifically means that the corruption happened after the time the page was read into memory, so that was somewhat of a relief.  Since we suspected the box had bad memory, I flipped the instance to another node in the cluster and restarted the CheckDB.

Since this cluster was still in test, there was no real concern from a business perspective, but I wanted my infrastructure guys to test the box as soon as possible, in case there was a hardware issue that would delay the roll-out to production coming up.  So, they started running memory tests as well as all the other Dell diagnostics.  We checked BIOS versions, Windows patches, memory over and over again, and nothing looked wrong, at all.

Finally, after working with Dell Support for several hours, they found that the problem was two BIOS settings, that would normally be disabled for our servers, were enabled.  C-State (CPU States) and C1E (Processor Core Speed and Voltage) were both configured, which was causing problems with the stability of the RAM.  These had been enabled because the server in question had been bought off of the Dell Outlet, and the previous owner must have enabled them, because they are not on by default.

After they disabled those two settings in the BIOS, the server has been stable ever since.  It certainly gave me a scare, but I learned a lesson from it: double-check your BIOS settings, and make sure you know what those settings really mean.

Intellisense Space-Completion Change in SSMS 2012

Microsoft added Intellisense to SSMS in 2008 and many of us have come to love this feature.  However, in 2012, one of my co-workers noticed a problem (from his perspective) in the Intellisense completion action.  The default configuration of SSMS 2012 is such that you can no longer complete an Intellisense selection using a “Space”.  I never noticed this issue because I use “Tab” to do my completions, which still works correctly.  My co-worker was further annoyed to find that there is no way to change this in the SSMS “Tools–>Options” dialog, as you might expect.  However, there is a fix available.  As noted in this StackOverflow Article, if you hit CTRL+ALT+SPACE, the input mode is changed and space works once again.  I hope this helps someone who may have been frustrated by this.

Problems with TCP Chimney Offload (TOE)

Over the past few years, several of my servers have had odd intermittent network connection issues where user sessions would “Time-Out” immediately, even when their timeout was set to 30 seconds (or more).  After digging into the problem from multiple angles with my network guys, I stumbled onto an article that showed promise.  I am re-publishing this information here for quick reference and in case someone else has a need to investigate this as a possible issue/solution.  Please see this TechNet entry for more information.

The TCP Offload Engine (TOE) is intended to alleviate problems with high connections by offloading certain workloads onto hardware.  While this is a great idea in theory, in practice it can sometimes cause more problems than solutions.  It is fairly non-interruptive to try turning the TCP Chimney Offload off to test if this is a cause of a strange network connection issue you may be having, so it is worth a try.

First, check to see if you have TCP Chimney Offload enabled:

netsh int tcp show global

A list of global parameters will be returned, look for the one that says, “Chimney Offload State”.  If it is set to anything other than “disabled”, try setting the parameter to “disabled”.  You will need to be in an elevated command prompt to issue the following command:

netsh int tcp set global chimney=disabled

After issuing this command, observe your system to see if you have a change, either for better or worse, with regard to your network connections.  Again, please see this TechNet entry for more information.

Command Line and Quotes: Fun with C#

Most DBAs don’t get to jump into programming often, but I came from a development background, so I try to use some legitimate code every once in a while.  It helps me keep that skill-set fresh.  Recently, in my company, we had a need for a unified solution for dealing with archiving, deleting, compressing, and so forth of various types of files.  Having an application that can be passed a bunch of parameters and can be triggered by a simple scheduled task is very useful in many aspects of IT.

We used to use ABAKT for these sort of operations, but the project was abandoned a long time ago and we wanted something that we could add to as we needed.  So, I had a few hours of downtime one weekend and decided to write one.  Everything went very well with the project and it worked almost flawlessly.  There was just one problem I ran into that stumped me for a couple days.

The problem cropped up when I was trying to pass a UNC to the application that had a space in it.  Such as:

\\someserver.mydomain\theShare\folder\some folder with spaces\anotherFolder\

Initially, I thought, well that’s no problem, just encapsulate the string in double-quotes and we are fine, as so:

"\\someserver.mydomain\theShare\folder\some folder with spaces\anotherFolder\"

However, the command line interface interprets that last \” as an escaped double-quote and NOT a closing double-quote to the beginning of the UNC.  This, in turn, causes my C# argument parser to see the string as 4 parameters, rather than 1 long one.  The parameters passed to my C# application would be:

  • \\someserver.mydomain\theShare\folder\some
  • folder
  • with
  • spaces\anotherFolder”

The way this issue presented really threw me off track several times.  I couldn’t figure out why my parameters were so messed up.  My solution was to escape out the last backslash, this worked perfectly:

"\\someserver.mydomain\theShare\folder\some folder with spaces\anotherFolder\\"

A little deviation from SQL stuff, but some useful information I wish I could have found when i was working on this project.

Fundamentals Matter

I am currently interviewing candidates for a new Junior DBA at my company.  Even though the position for which I am hiring is a junior position, I do expect the candidates to have a base level of technology, know the basics of server administration, and have some strong critical thinking skills.  So, to test the way each potential employee thinks, I ask a lot of intermediate to sometimes advanced SQL questions.  My thinking is that, 1) I want to know what level the candidate is really at, and 2) I want to discover how they respond to questions they haven’t faced before.  Have they dug down deep into the material, trying to learn as much as they could?  Have they been casual observers of technology?  Have they tried to test out ideas they’ve heard?  I certainly don’t consider my questions pass/fail, they are just an indication of where the person sitting across from me is in their technical career.

I have had several candidates in so far and a few of them have come to me via a recruiter.  After I had wrapped up with the recruiter regarding some of the candidates, I had mentioned to him that some of them, while completely qualified for this position, may want to stretch themselves a bit, because I see a lot of potential there.  He asked if i could clarify, so I started trying to put it into a concise idea.  I realized that the core of what I was trying to get across was simple, people who are starting into their technical career need to learn how to think about technology, not just learn the syntax and the base skills.  So, I came up with these suggestions, mostly specific to SQL Server, that I thought would get the gears turning about how to start digging.  These are not an end unto themselves, but simple stepping stones into building a technology paradigm; in this case, using the vehicle of SQL Server.

  1. Dig into how the syntax works.  Start by opening up SSMS and going through the different wizards.  Select different options and parameters, and then click the “Script” button at the top of the dialog to produce a script that you can read through.  Start working through it and find which parts of the syntax are new to you.  If you find a parameter you are not familiar with, or if you come across an entire group of keywords you have never used before, start researching them.  Find out what the code is actually doing.  Books Online is a great resource to start reading up on the syntax, but then expand your research and test it out for yourself.  Books Online DOES have errors in it, as do hundreds of blogs, forums, and twitter-feeds.  All of this information is coming from some human, double-check their work.
  2. Learn the first layer of Internals.  Read up on how the Transaction Log functions and the role it plays in the database.  Learn what VLFs are and how they work within the Transaction Log.  Learn the process data goes through when queried, updated, and deleted.  There is a lot going on here and understanding the basic principles will help you immensely.
  3. Start building a paradigm.  I like to think that there is no technology that we posses today that was not built up through a quasi-logical process, placing one building block on another along the way.  This enables me to build a paradigm that I can start plugging new information into.  The more robust and thorough my model is, the better I can assimilate new ideas.  If I come across something that doesn’t fit, I either have to recognize the failing in my paradigm and adjust my model to be more correct, or I have to find out why that new information is wrong.  Those are the only two options I see, either I don’t understand something as well as I need to, or the new information is not completely accurate.  It is up to me to find out which is true.  Your brain may not work the exact same way that mine does, but having some type of model in your head of how different systems interact, helps you predict functionality and troubleshoot anomalies.
  4. Start to recognize false information.  A lot of ideas get passed around the SQL community, and some of them are just straight-out wrong.  There are some DBAs who believe everything they read or hear, this leads to a lot of myths being shoveled back and forth that are not true, or are not true in all cases.  Knowing this information, or at least being familiar with it, will give you a leg-up.  Start with Paul Randal’s “Common SQL Server Myths” document.  This is a great summary of some of the information out there that is just plain wrong, and it will give you some good tools to start watching out for bad information that doesn’t fit into your paradigm.

Applying this type of thought to different areas of technology can help you start to think critically and proactively about the technologies that interest you most.  So, when I speak of fundamentals, I am not talking about features and syntax, I am speaking of the fundamentals of being a technologist.  This is a reminder that I myself need to come back to again and again, and I hope it can help someone else as well.

SQLskills Immersion Event 3 Wrap-Up

I am back home after my 3rd SQLskills Immersion Event. This has been a very full week of deep learning,  hundreds of questions, some fun networking, and intelligent (and sometimes sophomoric) discussions ranging from deep dives on SQL Engine internals to “the answer to life, the universe, and everything”.

If you have never attended a SQLskills Immersion Event and you are serious about SQL Server, let me please encourage you to sign up and attend, starting with IE1 (internals and performance). While there is no requirement to take the classes in order, I can attest to the value of doing so. Paul (Blog | Twitter) and Kimberly (Blog | Twitter) lay a very strong framework for everything else in that first week and then they, along with Jonathan (Blog | Twitter) and Joe (Blog | Twitter), build on that framework for the next two classes. On at least 20 different occasions, someone who had skipped the first class would ask a question or be confused on a point that had been explained in detail at the previous event. I understand that the names of IE2 or IE3 may sound more appealing to you, because of something you are facing in your current job, but some of the lessons in these classes are at the 400/500 level, you will benefit from having that deep understanding before progressing.

There were several highlights from this week that I would like to share. Restore strategies, including Kimberly’s USB drive Partial Database Availability and Online Piece-Meal Restore (she goes into more depth in the class than she did at her PASS Summit session). Paul went through a ton of corruption situations and the most productive steps to take to wade through the process of digging out of them, each and every one of them starting with, “Do Not Panic”. Jonathan showed us some great tips and lessons learned regarding Clustering Strategies (my favorite quote from that module: “I carry at least 5 separate clusters with me on my laptop at all times”, what a bragger! 🙂 ). And Joe, SQLskills’ latest addition, did a fantastic job representing the internals of replication and mirroring. There was a lot more, but it would be unreasonable to try to list it all here, just go check them out for yourself.

After these three weeks I feel like I understand what I do on a daily basis even better and have garnered great tips to make my environment more robust and prepare me for new challenges as they arise.
In addition to all of the great learning, I also had a great time hanging out with the SQLskills group (if you don’t know them, they are genuinely awesome people). I also met a lot of great people from the class. Almost all of the attendees are folks who are deeply interested in SQL top-to-bottom, but also have a wide range of other interests, and love to chat about everything from the LHC at CERN to TopGear. I got to know several of them quite well and they were a lot of fun to chat with and bounce ideas off of.

To wrap up, I would highly recommend the Immersion classes. Don’t worry about your skill-set going into IE1, if you are eager to learn, can focus on the classes, and work on applying the knowledge both at work and in the many labs they give you to work through, you will be able to follow everything they are talking about and drastically round out your SQL knowledge.

The Day Before SQLskills Immersion Event 3 – High Availability and Disaster Recovery

Tomorrow starts my third SQLskills Immersion Event.  I took the first two in Chicago (May & Oct 2011), and they were both phenomenal!  They are led by my friends, Paul Randal (Blog | Twitter), Kimberly Tripp (Blog | Twitter), Jonathan Kehayias (Blog | Twitter), and Joe Sack (Blog | Twitter).  So, we have all gathered here in Tampa, FL to improve our knowledge.

Each class has been chocked full of great knowledge with tons of very deep dives and little-known facts.  I am expecting IE3 to be no different.  HA and DR are a huge part of my job and I am always trying to improve on these aspects of my databases.  I am hoping to walk away from this class with new tips, tricks, and ideas to not only improve my current environment, but grow my understanding of SQL and improve my “craft” in general.  One thing is certain:  if you are at a SQLskills Immersion Event, you are serious about deepening your knowledge of SQL.

One extra-curricular note about SQLskills Immersion Events: when they say “Immersion” they mean it!  You have the option to spend a lot of time with your fellow students as well as the teachers.  I love connecting with others in the community, visiting with those I’ve met before, and meeting new people as well.   It is very refreshing to spend time with others who not only love SQL, but also love talking about a wide range of other topics; sometimes we even get into a little philosophy.  😀

I plan to post an update later in the week or at the end of the event.  Until then, I’ll be fully “immersed” in SQL knowledge!