Alternative to restoring to a point in TIME

“What are you smiling about?” my friend asked.
“I’m just giddy…” I replied.
“Why?”
“You won’t understand.”
“Try me.”
“Well, last night, I found out about fn_dump_dblog.”
“Uh-huh. So what are those? Shoes?”

Hooooo-kay! ;) Moving on…

With a transaction log backup, you gain the ability to restore your database to a point-in-time contained in the backup. For example, if a table is accidentally emptied via a DELETE operation that’s not within an explicit transaction, you can:

1) If possible, stop all database activity as soon as you learn about the accident, then take a transaction log backup.

2) Restore the most recent full database backup and most recent differential backup (if any) to a new database (WITH NORECOVERY).

3) Apply your transaction log backups up to the point just before the delete.

4) When you have your deleted records back (in the new database), re-insert them into the table in the original database. Once you have inserted the data back, you would have to do some work to ensure that your database is how it really should be (i.e. if the DELETE that you’re trying to undo didn’t happen, what would the database state be?).
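The four steps above can be sketched in T-SQL. This is a minimal sketch; the backup file names, logical file names, and STOPAT time below are hypothetical placeholders for your own:

```sql
-- 1) Capture the tail of the log as soon as the accident is discovered
BACKUP LOG AdventureWorks TO DISK = N'C:\Backup\AW_tail.trn';
GO

-- 2) Restore the most recent full backup (and differential, if any)
--    to a NEW database, leaving it unrecovered
RESTORE DATABASE NewDB FROM DISK = N'C:\Backup\AW_full.bak'
   WITH NORECOVERY,
        MOVE 'AdventureWorks_Data' TO N'C:\NewDB.mdf',
        MOVE 'AdventureWorks_Log' TO N'C:\NewDB_log.ldf';
GO

-- 3) Apply log backups in sequence, stopping just before the DELETE
RESTORE LOG NewDB FROM DISK = N'C:\Backup\AW_tail.trn'
   WITH STOPAT = 'Aug 12, 2009 10:00:00 AM', RECOVERY;
GO

-- 4) NewDB now contains the deleted rows, ready to be copied back
```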

The tricky part really is figuring out up to what TIME to restore.

1) If you know the exact time, just apply each transaction log backup specifying STOPAT = &lt;time&gt;, WITH RECOVERY. As long as the specified time has not been reached, the database will remain unrecovered. For example,

RESTORE LOG NewDB FROM DISK = N'C:\Backup\AdventureWorks.trn' WITH RECOVERY,
STOPAT = 'Aug 12, 2009 10:00:00 AM'
GO

2) If you know just the time range (for example, 3:15PM-3:30PM), it’s a bit trickier. You need to restore your log backups to different times (3:15PM, 3:16PM, 3:17PM, and so on) using STOPAT, and you need to be able to look at the data after each restore before applying more backups with a later STOPAT time. SQL Server lets you do this using an option called STANDBY. It allows you to read the database (you cannot modify it) and still restore more log backups. Check Tibor Karaszi’s Minimizing Data Loss for more information on how to do this.
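A sketch of this iterative STANDBY approach, assuming hypothetical backup and undo file names; the same log backup can be re-applied with a later STOPAT while the database is in standby:

```sql
-- Restore to the start of the suspected range; STANDBY keeps the
-- database readable while still able to accept more log restores
RESTORE LOG NewDB FROM DISK = N'C:\Backup\AW_log.trn'
   WITH STOPAT = 'Aug 12, 2009 3:15:00 PM',
        STANDBY = N'C:\Backup\NewDB_undo.dat';

-- Inspect the table here (read-only). If the rows are still present,
-- roll forward a minute at a time by re-applying with a later STOPAT:
RESTORE LOG NewDB FROM DISK = N'C:\Backup\AW_log.trn'
   WITH STOPAT = 'Aug 12, 2009 3:16:00 PM',
        STANDBY = N'C:\Backup\NewDB_undo.dat';

-- Repeat until the rows disappear; the last STOPAT at which they were
-- still present is the point to restore to WITH RECOVERY.
```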

The less you know about the time the accident happened, the more tedious the recovery gets. Imagine doing this if all you know is that the delete occurred yesterday and no time is specified…12 midnight…1am…2am…it’s not a good situation to be in.

Is there an alternative?
Yes. You can restore to a log sequence number (LSN) instead, or to a point just before it. You can look at restoring to an LSN as an alternative or a complement to the point-in-TIME restore. There’s some guesswork involved, though. What you need are:

1) The name of the table
2) The number of records deleted (or at least, an estimate)
3) The time of the disaster/accident. You will see later that this is still a nice-to-have.

To illustrate, let’s delete all records in the AdventureWorks database’s Production.TransactionHistory table. I’m using SQL Server 2005.

--delete all records
delete from Production.TransactionHistory

This deletes 113443 records.

Finding the LSN of the Point of Disaster
fn_dblog is a system UDF that’s undocumented by Microsoft. Information about it can be found online (yeah, it’s very well documented for something that’s supposedly undocumented). fn_dblog exposes information about the database’s transaction log.

The plan is to use fn_dblog first to find the LSN of the point of disaster and, if we don’t find it, to use fn_dump_dblog instead. fn_dump_dblog shows similar information to fn_dblog and more; notably, fn_dump_dblog can read transaction log backups.

If the transaction that we’re looking for is not in the database’s transaction log, most likely it’s been overwritten and is already in a transaction log backup.

Reminder: Use fn_dblog, fn_dump_dblog, and other undocumented functions with caution, especially in production DBs. When I ran the scripts I used in this post on my development DB, they were extremely slow.

Also note that future versions or service packs may change how these functions behave. Make sure to test the scripts where you use them every time you make a change to your SQL Server installation.

1) Find the transaction’s Transaction ID. All the delete operations within the transaction will belong to one Transaction ID. We can find this using fn_dblog and the estimated/actual number of deletions.

USE AdventureWorks
GO
SELECT [Transaction ID], COUNT(*) AS REC_COUNT
FROM fn_dblog(DEFAULT, DEFAULT)
where AllocUnitName LIKE '%Production.TransactionHistory%'--table name
GROUP BY [Transaction ID]
HAVING COUNT(*) >= 113443 --(estimated/actual) number of deleted records

The result is this:

Transaction ID REC_COUNT
0000:0000130c 340329

(1 row(s) affected)

Deletions done on heaps will have a count that’s close to the actual number of records deleted. The presence of indexes will result in a higher count; just keep in mind that the number of records deleted is the minimum count.

If the above query returns more than one Transaction ID, knowing the time (or even just a time range) that the delete happened really helps. We will discuss this later.

2) Next, get all the operations that have the Transaction ID we found in the previous step to verify if it is the transaction we are looking for.

SELECT
[Current LSN],
Operation,
Context,
[Transaction ID],
[AllocUnitId],
[AllocUnitName],
[UID],
[SPID],
[Begin Time],
[Transaction Name]
FROM fn_dblog(DEFAULT, DEFAULT)
WHERE [Transaction ID] = '0000:0000130c'
ORDER BY [Current LSN]

The results are as shown below:

Verifying the Results
The delete we performed will show as Operation = LOP_DELETE_ROWS in the transaction log. That’s what we see in (1). The table we deleted from has three indexes (note that 113443 records × 3 indexes = 340329 log records), and this coincides with (2).

The date in (3) is what we would have to use if we have several Transaction ID candidates and we’re not exactly sure which transaction to select. If we have the time of the accident (or at least, an estimate) we can select the transaction that has a [BEGIN TIME] closest to the time we have. This is the reason I said earlier that the date is nice to have.

So, based on the results, it looks like we have the correct Transaction ID.

3) Look for the LSN of the point of disaster.

This would be the first LSN of the transaction and the value of Operation should be LOP_BEGIN_XACT. In this example, it’s 00000026:00000458:0002. BINGO! This is our point of disaster. We want to restore not to this LSN but right before it.

Performing the Restore
Restore the most recent database backup and differential backup (if present) and apply the transaction log backups specifying the LSN where the restore should stop (right before).

--execute this after restoring the full database backup, the differential backup (if any), and the earlier transaction log backups
RESTORE LOG [NewDB] FROM DISK = N'C:\Backup\AdventureWorks.trn' --NewDB is the new database we're restoring to
   WITH NORECOVERY, STOPBEFOREMARK = 'lsn:00000026:00000458:0002' --the LSN of the beginning of the transaction, which is our point of disaster
GO
RESTORE DATABASE NewDB
   WITH RECOVERY;

Once the restore is complete, extract the deleted data from the new database and re-insert it into the original table.
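Since the table was emptied, the re-insert can be a straight cross-database copy. Here’s a minimal sketch, assuming NewDB was restored on the same instance as the original database (TransactionID is an identity column, hence the IDENTITY_INSERT):

```sql
-- Copy the recovered rows back, preserving the original identity values
SET IDENTITY_INSERT AdventureWorks.Production.TransactionHistory ON;

INSERT INTO AdventureWorks.Production.TransactionHistory
       (TransactionID, ProductID, ReferenceOrderID, ReferenceOrderLineID,
        TransactionDate, TransactionType, Quantity, ActualCost, ModifiedDate)
SELECT TransactionID, ProductID, ReferenceOrderID, ReferenceOrderLineID,
       TransactionDate, TransactionType, Quantity, ActualCost, ModifiedDate
FROM NewDB.Production.TransactionHistory;

SET IDENTITY_INSERT AdventureWorks.Production.TransactionHistory OFF;
```

If the delete had removed only some of the rows, you would add a WHERE clause (or an anti-join against the live table) to copy back only the missing ones.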

What if we can’t find the delete transaction using fn_dblog?
Turn on trace flag 2537 and try again. If you still can’t find it, a transaction log backup may already have been performed and the delete transaction may already have been overwritten in the log. If this is the case, we need to follow the same steps as above, but this time using fn_dump_dblog. I discovered fn_dump_dblog via…what else…GOOGLE! Yeah, finding it made me GIDDY and no, fn_dump_dblog ain’t shoes.

As I said earlier, it shows pretty much the same info as fn_dblog BUT it has the added ability to read transaction log backups. Because the transactions are no longer in the database’s transaction log, we will use fn_dump_dblog to read the transaction log backup file that was created right after the accident happened.

SELECT [Transaction ID], count(*)
FROM fn_dump_dblog(DEFAULT, DEFAULT, DEFAULT, DEFAULT, 'C:\Backup\AdventureWorks.trn', DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT)
GROUP BY [Transaction ID]
HAVING COUNT(*) >= 113443 --(estimated/actual) number of deleted records

In this test case, we only get one Transaction ID that meets the conditions we specified (it’s the same Transaction ID that we got earlier using fn_dblog). If there are more, we need to look at the operations of each Transaction ID and make an (educated) guess as to which one corresponds to the accidental delete. The problem with fn_dump_dblog is that, for some undocumented reason, AllocUnitName is empty. This makes it difficult to know which object/entity the DELETE operation was performed on. Regardless, there are still ways to extract the information we need.

To check the operations of the Transaction ID returned by the above code, we run the following:

SELECT
[Current LSN],
Operation,
Context,
[Transaction ID],
[AllocUnitId],
[AllocUnitName],
[UID],
[SPID],
[Begin Time],
[Transaction Name]
FROM fn_dump_dblog(DEFAULT, DEFAULT, DEFAULT, DEFAULT, 'C:\Backup\AdventureWorks.trn', DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT)
WHERE [Transaction ID] = '0000:0000130c' --change this to WHERE [Transaction ID] IN ('0000:0000130c', ...) if there's more than 1 Transaction ID candidate
ORDER BY [Current LSN]

Here’s the result:

I have highlighted the fields you can look at to verify the LSN of the point of disaster. Pick the transaction that has a BEGIN TIME that’s closest to the time of the accident. The time information becomes more important here since we don’t have the AllocUnitName. For this reason, I recommend trying to find the LSN while the transaction is still in the database’s log.

Tips

a) If you’re in the full recovery model and have done a full database backup but have never taken transaction log backups, you should start. :) If a minor accident like the example discussed in this post happens to you, you can still use the steps above. I recommend running the fn_dblog query first to get the LSN of the point of disaster and then performing a transaction log backup (you would need it to do a point-in-time restore). And of course, going forward, continue doing transaction log backups.

b) Third-party transaction log reading tools can help you eliminate all the guesswork so also consider these options when trying to find the LSN of the point of disaster.

c) Going forward, secure your database to make sure users are not allowed to make these disastrous changes to the production database. Also consider using transaction log shipping and configure it such that the logs are restored to your destination only after some delay. This gives you quick access to an older state of your database should a disaster like the one discussed in this post happen. I’ll discuss this in more detail in a future post. Tibor Karaszi’s Minimizing Data Loss also has some tips on avoiding this disaster in the future.

More Information
Recovering to a Log Sequence Number

Are we there yet?

Inspirational stories…don’t we just love them? Stories that give us hope and direction…like this one from @BrentO. It’s no Chicken Soup for the Soul but it gives you a glimpse of how it is and how it feels to get there.

“Get where?”, you ask. There. That place where dreams are.

It gets frustrating sometimes to hear ourselves saying…

I don’t know that.
I haven’t tried it.
I have no idea what you’re talking about.
I have never heard of it.
That sounds familiar but I’m not sure.
I’ve never really set one up.
I kinda know the concept but…I’d have to review how to do that.

Wouldn’t it be just nice to say…I know.

Reading inspirational stories gives you that feeling of “Yes, I can do it too.” But then you go back to your daily grind, you look at your goals, and you realize, man, you still have a long way to go. And it’s frustrating.

I remember a conversation I had once with some family friends. I asked, “How do I make sure I don’t get caught for speeding?” One advised never to drive more than 5 mph over. Another said slow down if I see a suspiciously-cop-looking car. My brother-in-law said just always make sure the driver of the car in front of me is driving just as fast or faster.

Then finally, my 10-year-old niece said, “Aunt Janice, if you drive the speed limit, I don’t think you’ll get caught.” And we all laughed.

But it’s not funny, is it? Because the truth is, there shouldn’t be a way “around” things. Getting “there” takes time. And just like driving, sometimes we all wish we had left earlier.

We all have moments like this…of self-doubt, of impatience, of thinking, “man, I wish I started this a long time ago!”.

When I told my best friend this, she said that my problem is that I keep focusing on how far I still have to go to get to my destination. She said that sometimes I need to look back instead and see how far I’ve come. She said to pause and give myself some credit.

This is not an inspirational story for I don’t seek to inspire. If you’re reading this though and you get it, I just wanted you to know, you’re not alone.

Do I need to (re)do a full db backup to start a backup chain?

“There are no stupid questions”. I heard that when I was in first grade and I still believe that it’s true. Maybe rude, rhetorical, or unclear…but stupid? Nah.

I’ve been blessed to have people around me who have never made me feel that my questions were stupid. They have told me I’m dense though…so go figure. ;) Anyway, I know it doesn’t seem like I just wrote a disclaimer…but I did.

Last week, I was performing tests on transaction log backups. Every time I restored a backup of a database that was using the full recovery model to a new database, I would immediately do a full database backup to start the backup chain. (I needed to create transaction log backups for testing.) After doing this five times, I thought: is this really necessary? That is, if I create a full database backup of a database that’s using the full recovery model, then restore it to a different db, do I really still need to take a full database backup *just* to start the backup chain? Can I just perform a transaction log backup without doing a full database backup? After all, I already have a full db backup. Will the database be in autotruncate mode right after the restore?

Note: Not familiar with recovery models? Read Recovery Model Overview and When is FULL Recovery not Really FULL Recovery.

What I kinda know
When a database is using the simple recovery model, it is in autotruncate mode. When you switch the recovery model from SIMPLE to FULL or BULK_LOGGED, it is important to perform a full database backup first to complete the change. If you don’t, the database will remain in autotruncate mode. Autotruncate simply means SQL Server automatically truncates the database’s log when it gets nearly full.

To check if a database is in autotruncate mode, I use below code:

--last_log_backup_lsn is NULL if in autotruncate mode
select B.name, A.last_log_backup_lsn
from sys.database_recovery_status A inner join sys.databases B
on (A.database_id = B.database_id)

So to answer my questions…
I did some tests. First, I created a test database (TestDB), changed its recovery model to FULL, and performed a full database backup.

USE master
go
--Create the test database
CREATE DATABASE TestDB;
GO
--Set the recovery model to full
ALTER DATABASE TestDB SET RECOVERY FULL;
GO
--Full database backup
BACKUP DATABASE TestDB TO DISK = 'c:\Backup\TestDB_Full.bak';
GO

Right after I did a full database backup, I checked and, as expected, the database was no longer in autotruncate mode. (The last_log_backup_lsn was not NULL.)

Next, I restored the full database backup to a different database (TestDB2).

RESTORE DATABASE [TestDB2] FROM DISK = N'c:\Backup\TestDB_Full.bak' WITH RECOVERY,
   MOVE 'TestDB' TO 'C:\TestDB2.mdf',
   MOVE 'TestDB_log' TO 'C:\TestDB2_log.ldf'
GO

Right after the restore completed, I checked if the TestDB2 database was in autotruncate mode:

--last_log_backup_lsn is NULL if in autotruncate mode
select A.last_log_backup_lsn from sys.database_recovery_status A inner join sys.databases B on (A.database_id = B.database_id)
where B.name = 'TestDB2'

It wasn’t. The result was not NULL even though I hadn’t performed any backup of TestDB2…

I checked if I could do a transaction log backup without doing a full database backup.

--Perform a transaction log backup
BACKUP LOG TestDB2 TO DISK = N'C:\Backup\TestDB2.trn';
GO

Success! SQL Server allowed me to!

So, in summary,

1) Will the database be in autotruncate mode right after restore? No.
2) If we create a full database backup of a database in full recovery model, then restore that to a different db, can we just perform a transaction log backup without doing a full database backup? Yes.

Why?
I think it’s because backup chains primarily depend on continuous LSNs. According to this article, two backups are consecutive if the LastLSN of the first one is greater than or equal to the FirstLSN of the next one. In our test…

RESTORE HEADERONLY
   FROM DISK = N'C:\Backup\TestDB_Full.bak'

FirstLSN: 49000000006100037
LastLSN:  49000000007900001

RESTORE HEADERONLY
   FROM DISK = N'C:\Backup\TestDB2.trn'

FirstLSN: 49000000007900001
LastLSN: 49000000009300001

they’re equal. And yup, I tried restoring and it worked. :) I repeated the test. This time, though, right after I restored to a new database, I performed a full database backup before doing a transaction log backup.

USE master
go
--Create the test database
CREATE DATABASE AnotherTestDB;
GO

--Set the recovery model to full
ALTER DATABASE AnotherTestDB SET RECOVERY FULL;
GO

--Full database backup
BACKUP DATABASE AnotherTestDB TO DISK = 'c:\Backup\AnotherTestDB_Full.bak';
GO

RESTORE DATABASE [AnotherTestDB2] FROM DISK = N'c:\Backup\AnotherTestDB_Full.bak' WITH RECOVERY,
   MOVE 'AnotherTestDB' TO 'C:\AnotherTestDB2.mdf',
   MOVE 'AnotherTestDB_log' TO 'C:\AnotherTestDB2_log.ldf'
GO

--Do a Full database backup of the newly restored database
BACKUP DATABASE AnotherTestDB2 TO DISK = 'c:\Backup\AnotherTestDB2_Full.bak';
GO

--Perform a transaction log backup
BACKUP LOG AnotherTestDB2 TO DISK = N'C:\Backup\AnotherTestDB2.trn';
GO

The generated LSNs actually meet our criterion for consecutive backups; see below:

--full database backup of original db
RESTORE HEADERONLY
   FROM DISK = N'C:\Backup\AnotherTestDB_Full.bak'
go

FirstLSN: 49000000006100037
LastLSN: 49000000007900001

--full database backup of restored db
RESTORE HEADERONLY
   FROM DISK = N'C:\Backup\AnotherTestDB2_Full.bak'
go

FirstLSN: 49000000009600043
LastLSN: 49000000011400001

--tlog backup of restored db
RESTORE HEADERONLY
   FROM DISK = N'C:\Backup\AnotherTestDB2.trn'

FirstLSN: 49000000007900001
LastLSN: 49000000012000001

And yup, I was able to restore the transaction log backup using each of the full db backups as base.
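Instead of running RESTORE HEADERONLY against each file, the same FirstLSN/LastLSN values can also be read from msdb’s backup history, assuming the backups were taken on this instance:

```sql
-- List the backups for a database with their LSN ranges;
-- type: D = full, I = differential, L = log
SELECT backup_start_date, type, first_lsn, last_lsn
FROM msdb.dbo.backupset
WHERE database_name = 'AnotherTestDB2'
ORDER BY backup_start_date;

-- A log backup chains off a base backup if its first_lsn is less than
-- or equal to the base's last_lsn (and its last_lsn is greater).
```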

Imagine That!
If this happens in real life, say you lose the full db backup that starts a backup chain (maybe someone deleted it or it was moved to a different planet), there may actually be a slim chance that you have another copy lurking somewhere. The keyword, I think, is slim. Thoughts?

And, no, I still think there are no stupid questions.