Tuesday, December 12, 2017

T-SQL Tuesday #97 - 2018 Learning Goals

This month's T-SQL Tuesday is hosted by Malathi Mahadevan (blog | twitter), and it is about 2018 learning goals. Given that we are at the end of 2017, this is a timely topic. As data professionals and technology enthusiasts, we are living in an interesting time. There are so many opportunities to learn that we might feel overwhelmed by them. Goals can certainly help focus our learning efforts.

Learning Goals

In 2018, I am planning to learn more about architecting and managing cloud solutions. As I gain experience working with databases in the cloud, I feel that it would also be beneficial to learn about the different cloud components/services and how they can complement each other.

Ways to Learn

To help me focus, I am planning to get an Amazon AWS and/or Microsoft Azure certification. Normally, Amazon and Microsoft publish a guideline (or blueprint) that tells you the skills you need in order to pass the certification exam. That guideline can serve as an outline of topics to learn. To cover the different topics, I am planning to read published whitepapers, FAQ articles, and blogs. Given the evolving nature of cloud technologies, blogs can help keep us updated with the latest information and features.

I am also planning to learn by taking online courses. I enjoy online courses; they allow me to study at my own time and pace. There are some Amazon AWS and Microsoft Azure online courses available on Udemy and edX that I am planning to take.

Finally, I am also a believer in learning by doing. When learning a new concept or technology, I like to experiment and get hands-on experience. The good thing is that both Amazon and Microsoft offer opportunities to get hands-on experience with most of their cloud offerings for free for 12 months. For Amazon, we can sign up for the AWS Free Tier. For Microsoft, we can sign up for the Azure free account.

Friday, December 8, 2017

Backup Analysis Services Database Using PowerShell

We can use PowerShell to back up an Analysis Services (AS) multidimensional or tabular database. This is great because it gives us more flexibility in automating the AS database backup process. Typically, we might have multiple Analysis Services databases within a single Analysis Services instance. As a good Database Administrator responsible for Analysis Services, we want to back up all of those databases.

The PowerShell script below can be used to back up all databases within an Analysis Services instance. To accomplish this, we use Analysis Management Objects (AMO) to get the list of databases available within the Analysis Services instance. Then we use the Backup-ASDatabase cmdlet to back up each database to an Analysis Services backup (.abf) file. Alternatively, we could use the AMO Backup method to back up each Analysis Services database, but for this script, I chose to use the Backup-ASDatabase cmdlet.

# Load the Analysis Management Objects (AMO) assembly.
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices")

$ServerName = "ServerName"
$BackupPath = "C:\Temp"

# Timestamp (24-hour clock) to make each backup file name unique.
$Dt = Get-Date -Format "yyyyMMddHHmmss"

# Make sure that the backup folder exists.
$BackupLocation = "$BackupPath\$ServerName"
If (!(Test-Path $BackupLocation))
{
    New-Item -ItemType Directory -Force -Path $BackupLocation
}

# Connect to the Analysis Services instance.
$Server = New-Object Microsoft.AnalysisServices.Server
$Server.Connect($ServerName)

# Loop through each database within the SSAS instance.
foreach ($Database in $Server.Databases)
{
    $DatabaseName = $Database.Name
    $BackupFile = $BackupLocation + "\" + $DatabaseName + "_" + $Dt + ".abf"

    Write-Host "Starting to backup $DatabaseName ..." -ForegroundColor Green
    Backup-ASDatabase -Server $ServerName -BackupFile $BackupFile -Name $DatabaseName -ApplyCompression -ErrorAction Stop
    Write-Host "$DatabaseName has been backed up successfully." -ForegroundColor Green
}
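
For reference, below is a minimal sketch of the AMO alternative mentioned above: instead of the cmdlet, we call the Backup method on each AMO Database object. The two-argument overload (file name plus an overwrite flag) is my assumption here; the plain single-argument Backup also exists.

# A minimal sketch of the AMO alternative, reusing $Server, $BackupLocation,
# and $Dt from the script above.
foreach ($Database in $Server.Databases)
{
    $BackupFile = $BackupLocation + "\" + $Database.Name + "_" + $Dt + ".abf"
    # Backup(file, allowOverwrite) - assumed overload; overwrites an existing file.
    $Database.Backup($BackupFile, $true)
}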

Reference

  • Backup-ASDatabase cmdlet - The site contains information on how to use the Backup-ASDatabase cmdlet, along with the different parameters that you can use. You can also get this information by running "Get-Help Backup-ASDatabase -Detailed" in a PowerShell window.

Wednesday, July 22, 2015

Remove Log Shipping When Secondary Server Not Available

The other day, I came across a scenario where I needed to remove log shipping from a database on the primary server. However, the secondary server had already been shut down and decommissioned. If we try to disable log shipping from the database properties window, we would most likely get the following:

(Screenshot: Database Properties window showing the error "Error Deleting Log Shipping Configuration")

Instead, we need to remove it manually using the following T-SQL script:

USE [master]
GO

EXEC sp_delete_log_shipping_primary_secondary
    @primary_database = '[Database_Name]',
    @secondary_server = '[Secondary_Server_Name]',
    @secondary_database = '[Database_Name]';
GO

EXEC sp_delete_log_shipping_primary_database @database = '[Database_Name]'
GO
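
To confirm that the configuration is gone, we can check the log shipping metadata in msdb on the primary server; once the removal succeeds, this query should return no rows:

-- Verify that the primary-side log shipping metadata has been removed.
SELECT primary_database, backup_directory, backup_share
FROM msdb.dbo.log_shipping_primary_databases
WHERE primary_database = '[Database_Name]';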

Monday, July 20, 2015

Reflection on Steve Stedman's Database Corruption Challenge

Over the past few months, Steve Stedman (blog | twitter) has been conducting a database corruption challenge. In this challenge, Steve would post corrupt SQL Server databases, and the participants would try to fix the corruption to the best of their ability while avoiding any data loss.

It has been an interesting challenge and a good way for us to practice dealing with different database corruption scenarios. Each week, Steve came up with a different set of corruption problems to solve. For example, in one week, we were able to recover corrupt data simply by querying it through a non-clustered index. In another week, we needed to retrieve the missing data from a SQL Server data page.
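
As a rough illustration of those two techniques (the database, table, index, and page numbers below are hypothetical), the first relies on a non-clustered index keeping its own copy of the indexed columns, and the second uses DBCC PAGE to read a page's raw contents:

-- If the clustered index is damaged but a non-clustered index is intact,
-- an index hint can still read the copy of the data stored in that index.
SELECT CustomerName
FROM dbo.Customers WITH (INDEX (IX_Customers_CustomerName));

-- Dump the contents of page 256 in file 1; trace flag 3604 sends
-- DBCC output to the client.
DBCC TRACEON(3604);
DBCC PAGE('CorruptDb', 1, 256, 3);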

After each challenge, Steve would post a couple of the correct solutions sent in by participants. Oftentimes, by reading those solutions, I learned methods of resolving database corruption (or retrieving corrupt data) different from my own. The old saying "There's more than one way to skin a cat" definitely holds true in many cases.

In a perfect world, we would not have corruption in any of our databases. However, we don't live in a perfect world. By working through the database corruption challenge, I have become better prepared to troubleshoot database corruption. I would like to thank Steve for his contribution to the SQL community in conducting the Database Corruption Challenge. Hopefully he will do more of these challenges in the future.

Additional Resources

While working on the database corruption challenge, some of the resources that I found useful and/or interesting were:

  • Paul Randal's "Corruption Demo Databases and Scripts". Paul Randal is one of the subject matter experts when it comes to database corruption. He has several database corruption demos and scripts that we can download. They can be found here.
  • Mark Rasmussen's article on "Deciphering a SQL Server Data Page". It explains in some detail how to read the output of DBCC PAGE. The article can be found here.
  • Steve Stedman's presentation "When Database Corruption Strikes" to the PASS DBA Virtual Chapter. It can be found here.
  • Kendra Little's "Corrupting Databases for Dummies - Hex Editor Edition". It provides some information on using a hex editor to corrupt a database. The article can be found here.

Friday, May 15, 2015

Change Data Capture (CDC) and Simple Recovery Mode

One of the misconceptions that some people have about change data capture (CDC) in SQL Server is that it requires the database to use the full recovery model. If we look at the following TechNet article, it mentions:

It should be noted that change data capture works with all recovery models. But when change data capture is enabled, operations that would normally be minimally logged in simple or bulk-logged recovery models are fully logged to enable change data capture to capture all changes.

So change data capture can be enabled on a database using the simple recovery model. This can be helpful since, in some development environments, databases are typically set to the simple recovery model. One thing to be careful about: if a database uses the simple recovery model and has CDC enabled, but the CDC capture job is not running, the database's transaction log will keep consuming more space as the database processes transactions. This is because the log records for those transactions do not get released until the CDC capture process has read them. Let me try to demonstrate this.

Database in Simple Recovery Mode without CDC

As a base case, let us set up a database using the simple recovery model.

USE [master];
GO

IF EXISTS(SELECT 1 FROM sys.databases WHERE [name] = N'CdcTest')
BEGIN
    DROP DATABASE CdcTest;
END
GO

CREATE DATABASE CdcTest
ON (
    NAME = 'CdcTest',
    FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\CdcTest.mdf',
    SIZE = 10MB,
    FILEGROWTH = 5MB)
LOG ON (
    NAME = 'CdcTest_Log',
    FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\CdcTest_Log.ldf',
    SIZE = 50MB,
    FILEGROWTH = 5MB);

ALTER DATABASE CdcTest
SET RECOVERY SIMPLE;
GO
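
To double-check that the database is indeed using the simple recovery model, we can query sys.databases:

-- This should return SIMPLE.
SELECT [name], recovery_model_desc
FROM sys.databases
WHERE [name] = 'CdcTest';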

Now, let us create a table called "TableWithoutCDC".

USE [CdcTest];
GO

CREATE TABLE dbo.TableWithoutCDC (
    Id INT PRIMARY KEY IDENTITY(1,1),
    Field1 VARCHAR(4000),
    Field2 VARCHAR(4000),
    Field3 VARCHAR(4000),
    Field4 VARCHAR(4000),
    Field5 VARCHAR(4000));

Now, let us insert some data into the table. The script below does a one-row insert in a transaction and repeats it 5,000 times (five batches of 1,000 inserts, with a CHECKPOINT after each batch).

USE [CdcTest];
GO

DECLARE @z AS INT;
SET @z = 1;

-- Five batches of 1,000 single-row transactions.
WHILE (@z <= 5)
BEGIN
    DECLARE @a AS INT;
    SET @a = 1;

    WHILE (@a <= 1000)
    BEGIN
        DECLARE @Filler AS VARCHAR(4000);
        SET @Filler = REPLICATE('a', 4000);

        BEGIN TRANSACTION;
        INSERT INTO dbo.TableWithoutCDC
        VALUES (@Filler, @Filler, @Filler, @Filler, @Filler);
        COMMIT TRANSACTION;

        SET @a = @a + 1;
    END

    -- In the simple recovery model, a checkpoint lets the committed
    -- log records be reused.
    CHECKPOINT;

    SET @z = @z + 1;
END

Now, let us check the database's transaction log space usage:

USE [CdcTest];
GO

SELECT [name],
       [filename],
       file_in_MB = (CAST(size AS FLOAT) * 8) / 1024,
       file_used_in_MB = (CAST(FILEPROPERTY([name], 'SpaceUsed') AS FLOAT) * 8) / 1024
FROM sys.sysfiles;

The result shows that the database's transaction log file is using only about 0.8 MB of disk space.
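
As a side note, DBCC SQLPERF(LOGSPACE) is another quick way to see the log size and the percentage of it in use, for every database on the instance:

-- Report transaction log size and percent used for all databases.
DBCC SQLPERF(LOGSPACE);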

Database in Simple Recovery Mode with CDC

Now, let's see the impact of CDC on the database's transaction log when the CDC capture job is not running. First, let us reset the CdcTest database.

USE [master];
GO

IF EXISTS(SELECT 1 FROM sys.databases WHERE [name] = N'CdcTest')
BEGIN
    DROP DATABASE CdcTest;
END
GO

CREATE DATABASE CdcTest
ON (
    NAME = 'CdcTest',
    FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\CdcTest.mdf',
    SIZE = 10MB,
    FILEGROWTH = 5MB)
LOG ON (
    NAME = 'CdcTest_Log',
    FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\CdcTest_Log.ldf',
    SIZE = 50MB,
    FILEGROWTH = 5MB);

ALTER DATABASE CdcTest
SET RECOVERY SIMPLE;
GO

Then, create a table called "TableWithCDC".

USE [CdcTest];
GO

CREATE TABLE dbo.TableWithCDC (
    Id INT PRIMARY KEY IDENTITY(1,1),
    Field1 VARCHAR(4000),
    Field2 VARCHAR(4000),
    Field3 VARCHAR(4000),
    Field4 VARCHAR(4000),
    Field5 VARCHAR(4000));

Now, we can enable CDC on the CdcTest database and configure it to capture changes on the TableWithCDC table.

USE [CdcTest];
GO

EXEC sp_cdc_enable_db;

EXEC sp_cdc_enable_table
    @source_schema = 'dbo',
    @source_name = 'TableWithCDC',
    @capture_instance = 'dbo_TableWithCDC',
    @supports_net_changes = 1,
    @role_name = NULL;
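
We can verify that CDC is now enabled at both the database level and the table level:

-- is_cdc_enabled = 1 once sp_cdc_enable_db has run.
SELECT [name], is_cdc_enabled
FROM sys.databases
WHERE [name] = 'CdcTest';

-- is_tracked_by_cdc = 1 for tables with a capture instance.
SELECT [name], is_tracked_by_cdc
FROM sys.tables
WHERE [name] = 'TableWithCDC';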

By default, when CDC is enabled, it creates two SQL Agent jobs: one to capture the changes and another to clean up CDC data older than three days (the default retention, which can be modified). For this test, if you have SQL Agent running, you will want to stop and disable the CDC capture job. You can do that using the following script:

USE [msdb];
GO

EXEC dbo.sp_stop_job @job_name = 'cdc.CdcTest_capture';
EXEC dbo.sp_update_job @job_name = 'cdc.CdcTest_capture', @enabled = 0;
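
Incidentally, the configuration of both CDC jobs, including the cleanup job's retention period (stored in minutes; the three-day default is 4320), can be inspected in msdb:

-- job_type is 'capture' or 'cleanup'; retention applies to the cleanup job.
SELECT job_type, [continuous], retention
FROM msdb.dbo.cdc_jobs;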

Now, similar to the first test, let us run some transactions in the database.

USE [CdcTest];
GO

DECLARE @z AS INT;
SET @z = 1;

WHILE (@z <= 5)
BEGIN
    DECLARE @a AS INT;
    SET @a = 1;

    WHILE (@a <= 1000)
    BEGIN
        DECLARE @Filler AS VARCHAR(4000);
        SET @Filler = REPLICATE('a', 4000);

        BEGIN TRANSACTION;
        INSERT INTO dbo.TableWithCDC
        VALUES (@Filler, @Filler, @Filler, @Filler, @Filler);
        COMMIT TRANSACTION;

        SET @a = @a + 1;
    END

    CHECKPOINT;

    SET @z = @z + 1;
END

Now, let's see how this impacts the database's transaction log usage.

USE [CdcTest];
GO

SELECT [name],
       [filename],
       file_in_MB = (CAST(size AS FLOAT) * 8) / 1024,
       file_used_in_MB = (CAST(FILEPROPERTY([name], 'SpaceUsed') AS FLOAT) * 8) / 1024
FROM sys.sysfiles;

The database's transaction log is now using about 116 MB of disk space, much higher than the 0.8 MB in the first test. Let us look at the log_reuse_wait_desc column in sys.databases for the CdcTest database:

SELECT [name], log_reuse_wait_desc
FROM sys.databases
WHERE [name] = 'CdcTest';

The reason the database cannot reuse the log space is REPLICATION, which actually makes sense since, under the hood, CDC uses the same log reader technology as transactional replication. Basically, the database will not release the log records of those transactions until they have been processed by the CDC capture job/process.

Now, let's process the CDC changes, issue a checkpoint, and check the database's transaction log space usage:

USE [CdcTest];
GO

-- Run the capture process manually since the capture job is disabled.
EXEC sp_cdc_scan;
GO

CHECKPOINT;
GO

SELECT [name],
       [filename],
       file_in_MB = (CAST(size AS FLOAT) * 8) / 1024,
       file_used_in_MB = (CAST(FILEPROPERTY([name], 'SpaceUsed') AS FLOAT) * 8) / 1024
FROM sys.sysfiles;

The database's transaction log disk usage immediately drops to about 4 MB.
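
We can also confirm that the captured changes made it into the change table that sp_cdc_enable_table created for the capture instance (cdc.dbo_TableWithCDC_CT); it should now hold one row for each of the 5,000 inserts:

-- Each insert against the tracked table should appear in the change table.
SELECT COUNT(*) AS captured_rows
FROM cdc.dbo_TableWithCDC_CT;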

One additional thing to note: in the example above, we ran those transactions against a table that is tracked by CDC. This log growth does not happen when the transactions run against a table that is not tracked by CDC, even in a CDC-enabled database; the transaction log reuses the log space after a checkpoint as usual. The following script helps illustrate this:

USE [CdcTest];
GO

--- Create a table that is not being tracked by CDC within the CDC-enabled database.
CREATE TABLE dbo.TableWithoutCDC (
    Id INT PRIMARY KEY IDENTITY(1,1),
    Field1 VARCHAR(4000),
    Field2 VARCHAR(4000),
    Field3 VARCHAR(4000),
    Field4 VARCHAR(4000),
    Field5 VARCHAR(4000));
GO

--- Run transactions against TableWithoutCDC.
DECLARE @z AS INT;
SET @z = 1;

WHILE (@z <= 5)
BEGIN
    DECLARE @a AS INT;
    SET @a = 1;

    WHILE (@a <= 1000)
    BEGIN
        DECLARE @Filler AS VARCHAR(4000);
        SET @Filler = REPLICATE('a', 4000);

        BEGIN TRANSACTION;
        INSERT INTO dbo.TableWithoutCDC
        VALUES (@Filler, @Filler, @Filler, @Filler, @Filler);
        COMMIT TRANSACTION;

        SET @a = @a + 1;
    END

    CHECKPOINT;

    SET @z = @z + 1;
END
GO

--- Now get the database transaction log disk usage.
SELECT [name],
       [filename],
       file_in_MB = (CAST(size AS FLOAT) * 8) / 1024,
       file_used_in_MB = (CAST(FILEPROPERTY([name], 'SpaceUsed') AS FLOAT) * 8) / 1024
FROM sys.sysfiles;

The database will report that the transaction log is using around 2 MB of space (the number might vary, but it should be small).

Clean Up

You can use the following script to clean up:

USE [master];
GO

IF EXISTS(SELECT 1 FROM sys.databases WHERE [name] = N'CdcTest')
BEGIN
    DROP DATABASE CdcTest;
END
GO

Take Away

Some takeaway points:

  • CDC can be enabled on a database using any recovery model, including the simple recovery model.
  • Make sure that the CDC capture job is running, as transactions against CDC-tracked table(s) are preserved in the transaction log until they are processed by the CDC capture job/process. This is especially important if you restore a CDC-enabled database with the KEEP_CDC option, as the CDC capture and cleanup jobs do not get created by default; see the sketch after this list for recreating them.
  • If you have CDC-tracked tables and the CDC capture job is not running, you can see your database's transaction log grow even when the database uses the simple recovery model.
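
For that KEEP_CDC scenario, a minimal sketch for recreating the missing jobs in the restored database uses sys.sp_cdc_add_job:

USE [CdcTest];
GO

-- Recreate the CDC jobs after a restore with KEEP_CDC; they are not
-- recreated automatically.
EXEC sys.sp_cdc_add_job @job_type = N'capture';
EXEC sys.sp_cdc_add_job @job_type = N'cleanup';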