Friday, May 15, 2015

Change Data Capture (CDC) and the Simple Recovery Model

One of the misconceptions that some people have about change data capture (CDC) in SQL Server is that it requires the database to use the full recovery model. The TechNet article on change data capture mentions:

It should be noted that change data capture works with all recovery models. But when change data capture is enabled, operations that would normally be minimally logged in simple or bulk-logged recovery models are fully logged to enable change data capture to capture all changes.

So change data capture can be enabled on a database that uses the simple recovery model. This is helpful because development databases are often set to simple recovery. One thing to be careful about: if a database uses the simple recovery model and has CDC enabled, but the CDC capture job is not running, the transaction log will keep growing as the database processes transactions. The log space occupied by those transactions will not be released until the CDC capture process has read them. Let me demonstrate this.

Database Using the Simple Recovery Model without CDC

As a base case, let us set up a database that uses the simple recovery model.

USE [master];
GO

IF EXISTS(SELECT 1 FROM sys.databases WHERE [name] = N'CdcTest')
BEGIN
DROP DATABASE CdcTest;
END
GO

CREATE DATABASE CdcTest
ON (
    Name = 'CdcTest',
    FileName = 'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\CdcTest.mdf',
    Size = 10MB,
    FileGrowth = 5MB)
LOG ON (
    Name = 'CdcTest_Log',
    FileName = 'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\CdcTest_Log.ldf',
    Size = 50MB,
    FileGrowth = 5MB);

ALTER DATABASE CdcTest
SET RECOVERY SIMPLE;
GO

Now, let us create a table called “TableWithoutCDC”.

USE [CdcTest];
GO

CREATE TABLE dbo.TableWithoutCDC (
Id INT PRIMARY KEY IDENTITY(1,1),
Field1 VARCHAR(4000),
Field2 VARCHAR(4000),
Field3 VARCHAR(4000),
Field4 VARCHAR(4000),
Field5 VARCHAR(4000));

Now, insert some data into the table. The script below inserts one row per transaction, repeated 5,000 times (five batches of 1,000, with a CHECKPOINT after each batch).

USE [CdcTest];
GO

DECLARE @z AS INT;
SET @z = 1;

WHILE (@z <= 5)
BEGIN
    DECLARE @a AS INT;
    SET @a = 1;

    WHILE (@a <= 1000)
    BEGIN
        DECLARE @Filler AS VARCHAR(4000);
        SET @Filler = REPLICATE('a', 4000);

        BEGIN TRANSACTION;
        INSERT INTO dbo.TableWithoutCDC
        VALUES (@Filler, @Filler, @Filler, @Filler, @Filler);
        COMMIT TRANSACTION;

        SET @a = @a + 1;
    END

    CHECKPOINT;

    SET @z = @z + 1;
END

Now, check the database's transaction log space usage:

USE [CdcTest];
GO

SELECT Name,
[FileName],
file_in_MB = (CAST(size AS FLOAT) * 8)/ 1024,
file_used_in_MB = (CAST(FILEPROPERTY(name, 'SpaceUsed') AS FLOAT) * 8)/ 1024
FROM sys.sysfiles

We got the following result:

[Screenshot: file space usage showing the log file at about 0.8 MB]

The database’s transaction log file is using 0.8 MB of disk space.

Database Using the Simple Recovery Model with CDC

Now, let's see the impact of CDC on the database's transaction log when the CDC capture job is not running. First, let us reset the CdcTest database.

USE [master];
GO

IF EXISTS(SELECT 1 FROM sys.databases WHERE [name] = N'CdcTest')
BEGIN
DROP DATABASE CdcTest;
END
GO

CREATE DATABASE CdcTest
ON (
    Name = 'CdcTest',
    FileName = 'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\CdcTest.mdf',
    Size = 10MB,
    FileGrowth = 5MB)
LOG ON (
    Name = 'CdcTest_Log',
    FileName = 'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\CdcTest_Log.ldf',
    Size = 50MB,
    FileGrowth = 5MB);

ALTER DATABASE CdcTest
SET RECOVERY SIMPLE;
GO

Then, create a table called "TableWithCDC".

USE [CdcTest];
GO

CREATE TABLE dbo.TableWithCDC (
Id INT PRIMARY KEY IDENTITY(1,1),
Field1 VARCHAR(4000),
Field2 VARCHAR(4000),
Field3 VARCHAR(4000),
Field4 VARCHAR(4000),
Field5 VARCHAR(4000));

Now, let us enable CDC on the CdcTest database and configure it to capture changes on the TableWithCDC table.

USE [CdcTest];
GO

EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = 'dbo',
    @source_name = 'TableWithCDC',
    @capture_instance = 'dbo_TableWithCDC',
    @supports_net_changes = 1,
    @role_name = NULL;

By default, when CDC is enabled, it creates two SQL Agent jobs: one to capture the changes and one to clean up CDC data older than three days (the default retention, which can be modified). For this test, if you have SQL Agent running, you should stop and disable the CDC capture job. You can do that using the following script:

USE [msdb];
GO

EXEC dbo.sp_stop_job @job_name = 'cdc.CdcTest_capture';
EXEC dbo.sp_update_job @job_name = 'cdc.CdcTest_capture', @enabled = 0;

Now, similar to the first test, generate some transactions in the database.

USE [CdcTest];
GO

DECLARE @z AS INT;
SET @z = 1;

WHILE (@z <= 5)
BEGIN
    DECLARE @a AS INT;
    SET @a = 1;

    WHILE (@a <= 1000)
    BEGIN
        DECLARE @Filler AS VARCHAR(4000);
        SET @Filler = REPLICATE('a', 4000);

        BEGIN TRANSACTION;
        INSERT INTO dbo.TableWithCDC
        VALUES (@Filler, @Filler, @Filler, @Filler, @Filler);
        COMMIT TRANSACTION;

        SET @a = @a + 1;
    END

    CHECKPOINT;

    SET @z = @z + 1;
END

Now, let's see how this impacts the database's transaction log usage.

USE [CdcTest];
GO

SELECT Name,
[FileName],
file_in_MB = (CAST(size AS FLOAT) * 8)/ 1024,
file_used_in_MB = (CAST(FILEPROPERTY(name, 'SpaceUsed') AS FLOAT) * 8)/ 1024
FROM sys.sysfiles

The database's transaction log is now using about 116 MB of disk space, much more than the 0.8 MB in the first test. Let's look at the log_reuse_wait_desc column in the sys.databases catalog view for the CdcTest database:

SELECT  [name], log_reuse_wait_desc
FROM sys.databases
WHERE [name] = 'CdcTest'

The reason the database could not reuse the log space is reported as REPLICATION, which makes sense: under the hood, CDC uses the same log reader technology as transactional replication. The database will not release the log space held by those transactions until they are processed by the CDC capture job/process.
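
If you want to see how far the capture process has read, here is a quick sketch (assuming CDC is enabled on the current database):

USE [CdcTest];
GO

-- High endpoint of the LSN range already processed into the change tables.
SELECT max_processed_lsn = sys.fn_cdc_get_max_lsn();

-- Recent log scan sessions of the CDC capture process.
SELECT session_id, start_time, end_time, tran_count
FROM sys.dm_cdc_log_scan_sessions;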

Now, let's try to process the CDC changes, do a checkpoint and check the database's transaction log space usage:

USE [CdcTest];
GO

EXEC sp_cdc_scan;
GO

CHECKPOINT;
GO

SELECT Name,
[FileName],
file_in_MB = (CAST(size AS FLOAT) * 8)/ 1024,
file_used_in_MB = (CAST(FILEPROPERTY(name, 'SpaceUsed') AS FLOAT) * 8)/ 1024
FROM sys.sysfiles

The database's transaction log disk usage immediately drops to about 4 MB.

One additional thing to note: in the example above, the transactions were run against a table tracked by CDC. The log buildup would not happen for transactions against tables that are not tracked by CDC, even in a CDC-enabled database; the log space is reused after a checkpoint as usual. The following script illustrates this:

USE [CdcTest];
GO

-- Create a table that is not tracked by CDC within the CDC-enabled database.
CREATE TABLE dbo.TableWithoutCDC (
Id INT PRIMARY KEY IDENTITY(1,1),
Field1 VARCHAR(4000),
Field2 VARCHAR(4000),
Field3 VARCHAR(4000),
Field4 VARCHAR(4000),
Field5 VARCHAR(4000));
GO

-- Run transactions against TableWithoutCDC.
DECLARE @z AS INT;
SET @z = 1;

WHILE (@z <= 5)
BEGIN
    DECLARE @a AS INT;
    SET @a = 1;

    WHILE (@a <= 1000)
    BEGIN
        DECLARE @Filler AS VARCHAR(4000);
        SET @Filler = REPLICATE('a', 4000);

        BEGIN TRANSACTION;
        INSERT INTO dbo.TableWithoutCDC
        VALUES (@Filler, @Filler, @Filler, @Filler, @Filler);
        COMMIT TRANSACTION;

        SET @a = @a + 1;
    END

    CHECKPOINT;

    SET @z = @z + 1;
END
GO

-- Now get the database's transaction log disk usage.
SELECT Name,
[FileName],
file_in_MB = (CAST(size AS FLOAT) * 8)/ 1024,
file_used_in_MB = (CAST(FILEPROPERTY(name, 'SpaceUsed') AS FLOAT) * 8)/ 1024
FROM sys.sysfiles

The transaction log will be reported as using around 2 MB of space (the number might vary, but it should be small).

Clean Up

You can use the following script to clean up:

USE [master];
GO

IF EXISTS(SELECT 1 FROM sys.databases WHERE [name] = N'CdcTest')
BEGIN
DROP DATABASE CdcTest;
END
GO

Takeaways

Some takeaway points:

  • CDC can be enabled on a database with any recovery model, including the simple recovery model.
  • Make sure that the CDC capture job is running, as transactions against CDC-tracked table(s) are preserved in the transaction log until they are processed by the CDC capture job/process. This is especially important if you restore a CDC-enabled database with the KEEP_CDC option, as the CDC capture and cleanup jobs are not recreated by default (see the sketch after this list).
  • If you have CDC-tracked tables and the CDC capture job is not running, your database's transaction log can grow even under the simple recovery model.
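
As a hedged sketch of that restore scenario (the backup path is a placeholder), the jobs can be recreated with sys.sp_cdc_add_job after restoring with KEEP_CDC:

USE [master];
GO

-- Restore a CDC-enabled database and keep its CDC metadata.
RESTORE DATABASE CdcTest
FROM DISK = N'C:\Backup\CdcTest.bak'
WITH KEEP_CDC;
GO

-- The capture and cleanup jobs are not recreated automatically,
-- so add them back explicitly.
USE [CdcTest];
GO

EXEC sys.sp_cdc_add_job @job_type = N'capture';
EXEC sys.sp_cdc_add_job @job_type = N'cleanup';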

Tuesday, May 5, 2015

SQL Script: Database Auto Grow and/or Auto Shrink Report

Below is a SQL script that lists recent auto grow and/or auto shrink events on your SQL Server. It queries the SQL Server default trace, which is enabled by default (you can check this by running: sp_configure 'default trace enabled').

DECLARE @TracePath NVARCHAR(1000);

-- Get the file path for the default trace
SELECT @TracePath =
REVERSE(SUBSTRING(REVERSE([path]),
CHARINDEX('\', REVERSE([path])), 1000)) + 'log.trc'
FROM sys.traces
WHERE is_default = 1;

-- Query to get auto grow and auto shrink events from the default trace
SELECT EventName = te.[name],
tr.ServerName,
tr.DatabaseName,
tr.[FileName],
FilePath = sf.[filename],
Duration_in_ms = tr.Duration/1000,
FileGrowth_in_mb = (tr.IntegerData*8)/1024,
tr.StartTime,
tr.EndTime
FROM sys.fn_trace_gettable(@TracePath, DEFAULT) tr
INNER JOIN sys.trace_events te
ON tr.EventClass = te.trace_event_id
INNER JOIN sys.sysaltfiles sf
ON tr.[FileName] = sf.[name]
AND tr.DatabaseID = sf.dbid -- logical file names are not unique across databases
WHERE te.name in (
'Data File Auto Grow',
'Log File Auto Grow',
'Data File Auto Shrink',
'Log File Auto Shrink')
ORDER BY tr.StartTime DESC;
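
If the query returns nothing because the default trace has been turned off, it can be re-enabled with the following (a sketch; this changes a server-level setting and requires the ALTER SETTINGS permission):

-- 'default trace enabled' is an advanced option, so expose it first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

EXEC sp_configure 'default trace enabled', 1;
RECONFIGURE;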

By the way, you can also get the information on recent auto grow and/or auto shrink events in SQL Server Management Studio, by doing the following:

  • Open SQL Server Management Studio (SSMS)
  • On the Object Explorer (you can press F8 to view Object Explorer), right click on the database that you want to get the auto grow and/or auto shrink events on
  • On the context menu, select Reports and then Standard Reports. Then select Disk Usage.
  • It would open the Disk Usage report for that particular database. If there are any recent auto grow and/or auto shrink events on that database, you should be able to see the “Data/Log Files Autogrow/Autoshrink Events” section.

[Screenshot: the Data/Log Files Autogrow/Autoshrink Events section of the Disk Usage report]

Hopefully, you don’t have the auto shrink feature enabled on any of your databases, and thus have no auto shrink events to worry about.

As for auto grow events, if you notice a lot of them, you might want to pre-grow (increase the size of) the data/log files during a period of low activity, or make the auto-growth increment larger.

Frequent auto grow events on a database log file can also lead to a high Virtual Log File (VLF) count.
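
As a quick way to check the VLF count of the current database, here is a sketch using DBCC LOGINFO (undocumented but widely used; the RecoveryUnitId column applies to SQL Server 2012 and later):

-- DBCC LOGINFO returns one row per VLF.
CREATE TABLE #LogInfo (
RecoveryUnitId INT, -- SQL Server 2012+
FileId INT,
FileSize BIGINT,
StartOffset BIGINT,
FSeqNo BIGINT,
[Status] INT,
Parity INT,
CreateLSN NUMERIC(38, 0));

INSERT INTO #LogInfo
EXEC ('DBCC LOGINFO');

SELECT VlfCount = COUNT(*)
FROM #LogInfo;

DROP TABLE #LogInfo;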

Wednesday, April 29, 2015

PowerShell Script: List SQL Server Analysis Service Role Permission By Database and Cube

A while ago, I wrote a couple of blog entries regarding the use of PowerShell scripts to list the roles within an Analysis Services database and their members. The scripts can be found here and here.

Let’s try to expand on those scripts further.

List Role Permission For Each Analysis Service Database

If we look at the GUI, at the database level we can grant a role one or more of the following permissions: full control, process database and read definition.

[Screenshot: SSAS Role Properties]

We can use the following PowerShell script to list the role permissions for each database:

[Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices")

# SSAS server name variable
$SSASServerName = "ServerName"

# Try to connect to the SSAS server
$SSASServer = New-Object Microsoft.AnalysisServices.Server
$SSASServer.Connect($SSASServerName)

# Object to store the result
$Result = @()

# Get the SSAS databases and loop thru each of them
foreach ($DB in $SSASServer.Databases)
{
    # Get the SSAS database
    $SSASDatabase = $SSASServer.Databases.Item($DB.Name)

    foreach ($Role in $SSASDatabase.Roles)
    {
        $DatabasePermission = $DB.DatabasePermissions.GetByRole($Role.ID)

        $ItemResult = New-Object System.Object
        $ItemResult | Add-Member -type NoteProperty -name DatabaseName -value $DB.Name
        $ItemResult | Add-Member -type NoteProperty -name RoleName -value $Role.Name
        $ItemResult | Add-Member -type NoteProperty -name DatabaseAdministrator -value $DatabasePermission.Administer
        $ItemResult | Add-Member -type NoteProperty -name DatabaseProcess -value $DatabasePermission.Process
        $ItemResult | Add-Member -type NoteProperty -name DatabaseReadDefinition -value $DatabasePermission.ReadDefinition

        $Result += $ItemResult
    }
}

$Result | Select DatabaseName, RoleName, DatabaseAdministrator, DatabaseProcess, DatabaseReadDefinition | Format-Table * -Wrap -AutoSize | Out-String

List Role Permission For Each Cube within the Analysis Service Database

We can also define role permissions for each cube within an Analysis Services database. If we look at the GUI, the basic role permissions for a cube look like this:

[Screenshot: cube-level permissions for an SSAS role]

We can use the following PowerShell script to iterate over the permissions of each role for each cube within each database:

[Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices")

# SSAS server name variable
$SSASServerName = "ServerName"

# Try to connect to the SSAS server
$SSASServer = New-Object Microsoft.AnalysisServices.Server
$SSASServer.Connect($SSASServerName)

# Object to store the result
$Result = @()

# Get the SSAS databases and loop thru each of them
foreach ($DB in $SSASServer.Databases)
{
    # Get the SSAS database
    $SSASDatabase = $SSASServer.Databases.Item($DB.Name)

    foreach ($Role in $SSASDatabase.Roles)
    {
        # Get the SSAS cubes within the database
        foreach ($Cube in $DB.Cubes)
        {
            $CubePermission = $Cube.CubePermissions.GetByRole($Role.ID)

            $ItemResult = New-Object System.Object
            $ItemResult | Add-Member -type NoteProperty -name DatabaseName -value $DB.Name
            $ItemResult | Add-Member -type NoteProperty -name RoleName -value $Role.Name
            $ItemResult | Add-Member -type NoteProperty -name CubeName -value $Cube.Name
            $ItemResult | Add-Member -type NoteProperty -name CubeRead -value $CubePermission.Read.value__
            $ItemResult | Add-Member -type NoteProperty -name CubeWrite -value $CubePermission.Write.value__
            $ItemResult | Add-Member -type NoteProperty -name CubeProcess -value $CubePermission.Process
            $ItemResult | Add-Member -type NoteProperty -name CubeReadDefinition_Local -value $CubePermission.ReadDefinition.value__
            $ItemResult | Add-Member -type NoteProperty -name CubeReadSourceData_Drillthrough -value $CubePermission.ReadSourceData.value__

            $Result += $ItemResult
        }
    }
}

$Result | Select DatabaseName, RoleName, CubeName, CubeRead, CubeWrite, CubeProcess, CubeReadDefinition_Local, CubeReadSourceData_Drillthrough | Format-Table * -Wrap -AutoSize | Out-String

Combining the Scripts

As a bonus, below is a PowerShell script that lists the role permissions for both the databases and the cubes within a SQL Server Analysis Services instance:

[Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices")

# Show all items in enumerated properties (such as RoleMembers) instead of truncating
$FormatEnumerationLimit = -1

# SSAS server name variable
$SSASServerName = "ServerName"

# Try to connect to the SSAS server
$SSASServer = New-Object Microsoft.AnalysisServices.Server
$SSASServer.Connect($SSASServerName)

# Object to store the result
$Result = @()

# Get the SSAS databases and loop thru each of them
foreach ($DB in $SSASServer.Databases)
{
    # Get the SSAS database
    $SSASDatabase = $SSASServer.Databases.Item($DB.Name)

    foreach ($Role in $SSASDatabase.Roles)
    {
        $DatabasePermission = $DB.DatabasePermissions.GetByRole($Role.ID)

        # Get the SSAS cubes within the database
        foreach ($Cube in $DB.Cubes)
        {
            $CubePermission = $Cube.CubePermissions.GetByRole($Role.ID)

            $ItemResult = New-Object System.Object
            $ItemResult | Add-Member -type NoteProperty -name DatabaseName -value $DB.Name
            $ItemResult | Add-Member -type NoteProperty -name RoleName -value $Role.Name
            $ItemResult | Add-Member -type NoteProperty -name DatabaseAdministrator -value $DatabasePermission.Administer
            $ItemResult | Add-Member -type NoteProperty -name DatabaseProcess -value $DatabasePermission.Process
            $ItemResult | Add-Member -type NoteProperty -name DatabaseReadDefinition -value $DatabasePermission.ReadDefinition
            $ItemResult | Add-Member -type NoteProperty -name CubeName -value $Cube.Name
            $ItemResult | Add-Member -type NoteProperty -name CubeRead -value $CubePermission.Read.value__
            $ItemResult | Add-Member -type NoteProperty -name CubeWrite -value $CubePermission.Write.value__
            $ItemResult | Add-Member -type NoteProperty -name CubeProcess -value $CubePermission.Process
            $ItemResult | Add-Member -type NoteProperty -name CubeReadDefinition_Local -value $CubePermission.ReadDefinition.value__
            $ItemResult | Add-Member -type NoteProperty -name CubeReadSourceData_Drillthrough -value $CubePermission.ReadSourceData.value__
            $ItemResult | Add-Member -type NoteProperty -name RoleMembers -value ($Role.Members | Select -ExpandProperty Name)

            $Result += $ItemResult
        }
    }
}

$Result | Select DatabaseName, RoleName, DatabaseAdministrator, DatabaseProcess, DatabaseReadDefinition, `
    CubeName, CubeRead, CubeWrite, CubeProcess, CubeReadDefinition_Local, CubeReadSourceData_Drillthrough, RoleMembers | Format-Table * -Wrap -AutoSize | Out-String

Wednesday, April 22, 2015

PowerShell Script: Adding Database into Always On Group

Every once in a while, I have a new database that I need to put into an existing Always On group, so I decided to create a PowerShell script that does just that. The script can be found here.

The script accepts the following parameters:

  • DBName – The name of the database that we need to put into the Always On group.
  • DAGName – The name of the Always On group.
  • ReplicaNode – Either the primary replica or one of the secondary replicas of the Always On group.
  • BackupLocation – A network share location to which each replica node can read and write the database backups.

Assumptions

  • The database is located on the primary replica node of the Always On group.
  • The Always On group has already been created, but the database has not yet been joined to it.

Script Logic

In summary, the script would do the following:

  • Information gathering
    • Find Always On primary replica
    • Find Always On secondary replica(s)
  • Run some validations
    • Check that the database exists
    • Check that the database is not one of the system databases (master, model, tempdb or msdb)
    • Check that the database is online
    • Check that the database uses the full recovery model
    • Check that the database is not currently in any Always On group
  • Back up the database (full and transaction log) from the primary replica to the backup location
  • Restore the full and transaction log backups onto each of the secondary replicas
  • Add the database to the Always On group on the primary replica and join it on each of the secondary replicas
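
For reference, here is a minimal T-SQL sketch of the sequence the script automates (the database, availability group and share names are placeholders, not part of the actual script):

-- On the primary replica: back up the database and its log to the network share.
BACKUP DATABASE [MyDb] TO DISK = N'\\BackupShare\MyDb.bak' WITH FORMAT;
BACKUP LOG [MyDb] TO DISK = N'\\BackupShare\MyDb.trn' WITH FORMAT;

-- On the primary replica: add the database to the availability group.
ALTER AVAILABILITY GROUP [MyAG] ADD DATABASE [MyDb];

-- On each secondary replica: restore with NORECOVERY, then join the group.
RESTORE DATABASE [MyDb] FROM DISK = N'\\BackupShare\MyDb.bak' WITH NORECOVERY;
RESTORE LOG [MyDb] FROM DISK = N'\\BackupShare\MyDb.trn' WITH NORECOVERY;
ALTER DATABASE [MyDb] SET HADR AVAILABILITY GROUP = [MyAG];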

Tuesday, April 14, 2015

Stale View

If you have a non-schema-bound view in SQL Server and the underlying table is altered, you can end up with a stale view. A quick way to demonstrate this:

Let's say that you have the following table:

CREATE TABLE dbo.Contact (
Id INT IDENTITY(1,1) PRIMARY KEY,
FirstName VARCHAR(10),
LastName VARCHAR(10));
GO

INSERT INTO dbo.Contact VALUES ('John', 'Doe');
INSERT INTO dbo.Contact VALUES ('Jane', 'Doe');

Then, create a simple view against the newly created table:

CREATE VIEW dbo.vwContact
AS
SELECT *
FROM dbo.Contact
GO

If we run the following query:

SELECT Id, FirstName, LastName
FROM dbo.vwContact

We would get the expected result:

[Screenshot: query results showing the two contacts]

Now, what happens if we add a new column to the Contact table?

ALTER TABLE dbo.Contact
ADD Country VARCHAR(10) NULL;
GO

UPDATE dbo.Contact
SET Country = 'USA'
WHERE Id = 1;

If we run the following query against the vwContact view:

SELECT Id, FirstName, LastName, Country
FROM dbo.vwContact;

We would get the following error message:

Msg 207, Level 16, State 1, Line 28
Invalid column name 'Country'.

That’s odd: the view does not recognize the newly added Country column. This happens because SQL Server captures the view's metadata when the view is created; a SELECT * is expanded into the column list that existed at creation time, so later table changes are not picked up.
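
To see the stale metadata directly, here is a quick sketch that compares the columns SQL Server recorded for the view against the table's current columns:

-- The table will list Country, while the view's recorded columns will not.
SELECT o.name AS object_name, c.name AS column_name
FROM sys.columns AS c
INNER JOIN sys.objects AS o ON o.object_id = c.object_id
WHERE o.name IN ('Contact', 'vwContact')
ORDER BY o.name, c.column_id;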

Ok, let's say that your company gets a new contact; his name is Mr. Rhino Hippopotamus. Obviously, you will need a bigger LastName column, so you decide to change it from varchar(10) to varchar(20).

ALTER TABLE dbo.Contact
ALTER COLUMN LastName VARCHAR(20);
GO

INSERT INTO dbo.Contact VALUES ('Rhino', 'Hippopotamus', 'USA');
GO

If you try to query the view as follows (let’s forget about the Country column for now):

SELECT Id, FirstName, LastName
FROM dbo.vwContact;

That seems to work just fine. You get the expected result:

[Screenshot: query results including the new row]

But what happens if you try to run the same query from another server, over a linked server?

SELECT Id, FirstName, LastName
FROM ServerA.TestDB.dbo.vwContact;

Instead of the expected result, we get the following error:

Msg 7347, Level 16, State 1, Line 1
OLE DB provider 'SQLNCLI11' for linked server 'ServerA' returned data that does not match expected data length for column '[ServerA].[TestDB].[dbo].[vwContact].LastName'. The (maximum) expected data length is 10, while the returned data length is 12.


If we check the properties of the vwContact view's LastName column from the server that has the linked server (using the following query), we notice that the LastName column is still reported as 10 characters long:

USE [master];
GO

EXEC sp_columns_ex 'ServerA', 'vwContact', 'dbo', 'TestDB', 'LastName', NULL;

Solution

So how do we address a stale view? There are several ways.

We could drop and recreate the view. For the example above, we can run the following script:

IF (OBJECT_ID('dbo.vwContact') IS NOT NULL)
DROP VIEW dbo.vwContact
GO

CREATE VIEW dbo.vwContact
AS
SELECT *
FROM dbo.Contact
GO

We could also run an ALTER VIEW statement:

ALTER VIEW dbo.vwContact
AS
SELECT *
FROM dbo.Contact
GO

Alternatively, we could refresh the view using the following script:

EXEC sp_refreshview 'dbo.vwContact';
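
To avoid the problem in the first place, one option (a sketch, not part of the original demonstration) is to skip SELECT * and create the view WITH SCHEMABINDING, which also blocks table changes that would silently break the view:

-- Schema-bound views require two-part table names and an explicit column list
-- (drop the existing view first before recreating it this way).
CREATE VIEW dbo.vwContact
WITH SCHEMABINDING
AS
SELECT Id, FirstName, LastName, Country
FROM dbo.Contact;
GO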