Sunday, October 10, 2010

Operations Database is full!


Let’s say you find yourself in a pickle. 
Perhaps you ignored your Operations Database size, perhaps grooming was failing and you didn’t notice, perhaps you wrote a BAD rule, and FLOODED the database with events, or performance data?
Now, your database is full, and there is no more free space on the disk?

What if you want to get rid of the data RIGHT NOW?

We can run grooming manually.  I discuss a bit about the inner-workings of the grooming process HERE.  We can execute grooming by opening SQL Management Studio, and opening a query window against the OpsDB – and running the grooming procedure “EXEC p_PartitioningAndGrooming”.

You will either get a success – or a failure.  If this fails, it is typically because the transaction log is full, before the job can complete.  If you need more transaction log space, this means you need to groom a LARGE amount of non-partitioned objects.

Data types:  The most common data types we insert (and have to groom) in the OpsDB are:
  • Alerts
  • Events
  • Performance
  • Performance Signature
  • Discovery data
Let’s talk about partitioned, and non-partitioned data types.  Events and Performance data in the OpsDB are partitioned.  All the others aren not partitioned.  There are 61 tables to store Events and 61 tables to store Performance data in the operations DB.  Each table represents 1 days worth of storage.  This is done to assist in grooming.  Since there can be a HUGE amount of event and performance data, we groom these by truncating a daily table, which is FAR more efficient than using a “delete from tablename where date > xx/yy/zz”.  Truncating a table uses almost no transaction log space or time, while “delete from” uses a bunch.
When we groom partitioned data, the first thing we do is truncate the next table in the list, then change the “IsCurrent” marker to the newly empty table.  You can look at this “map” in the PartitionTables table in the database.
To see which tables we are currently writing to – check out:
select * from PartitionTables
where IsCurrent = 1
So – IF our opsDB is flooded with data – and we just need to clear up some space to work with…. a way to cheat, is to run the standard grooming stored procedure 62 times.  This will force a truncate of all partitioned data in the database.
So we would run:  EXEC p_PartitioningAndGrooming in the SQL query window, 62 times.  You can track the progress by running the “IsCurrent” query check above.  This will wipe out all the partitioned data, and free up a ton of space in your DB really quickly.
For non-partitioned data – there are no shortcuts… you have to groom this the old fashioned way, and wait for it to complete.  Once your DB is healthy again – this will go back to being a quick and painless process.