Friday, February 24, 2012

ACS Internals

Database:

The ACS DB is primarily made up of daily partition tables. We create a new one every day during the nightly maintenance, which runs at 1 AM by default. Maintenance creates the new partition, closes the previous one, and then kicks off a reindex of the previous day's table to improve reporting performance.
To see all this, first let's have a look at the dtConfig table:
Select * from dtconfig
The only thing we would change in this table is the "number of partitions".  This value is essentially the number of partitions to keep, which is the number of days' worth of data we will retain in the ACS DB.  The default is 14, and you need to adjust this based on your retention requirements and DB sizing capability.
Next, let's check out the dtPartition table:
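As a rough way to reason about how the "number of partitions" setting interacts with DB sizing, here is a back-of-the-envelope sketch in Python. The function name, the event rate, and the per-event size are all hypothetical inputs, not values read from ACS. Plug in numbers measured in your own environment or taken from the sizing guidance you follow.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_acs_db_size_gb(events_per_second, bytes_per_event, partitions):
    """Rough size of the ACS DB, assuming one partition holds one day of events.

    All three inputs are assumptions supplied by the caller; nothing here is
    read from ACS itself, and index overhead is ignored.
    """
    if min(events_per_second, bytes_per_event, partitions) < 0:
        raise ValueError("inputs must be non-negative")
    bytes_per_day = events_per_second * SECONDS_PER_DAY * bytes_per_event
    return bytes_per_day * partitions / (1024 ** 3)


# Hypothetical example: 100 events/sec, 512 bytes per event, default 14 partitions.
print(round(estimate_acs_db_size_gb(100, 512, 14), 1))
```

The point is simply that the estimate grows linearly with the partition count, so doubling retention doubles the space you need to plan for.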
select * from dtpartition order by partitionstarttime


Essentially, these are your daily partitions.  At any given time, you should have one partition with a status of "0" (the current, open partition) and the rest should have a status of "2".  "2" means they are ready to be groomed.  Pre-SP1, if the online indexing of a closed partition failed during the nightly maintenance, we would leave it in a status of "1".  This is bad, because those partitions would never groom, and they would eventually fill the database if this kept up.  If you have any that are left in a status = 1, then just run this in SQL:
Use OperationsManagerAC
UPDATE dtPartition
SET Status = 2
WHERE Status = 1
This will fix the grooming issue, and the tables will groom at the next maintenance interval.  This is a very common issue prior to SP1.
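The status rules above can also be expressed as a small check. This is a minimal sketch in Python, assuming you have already pulled the Status column from dtPartition into a list of integers by whatever means you prefer. The function name and the warning wording are hypothetical.

```python
from collections import Counter


def check_partition_statuses(statuses):
    """Return a list of warnings for a list of dtPartition Status values.

    Encodes the rules described above: exactly one partition in status 0,
    and none left in status 1.
    """
    counts = Counter(statuses)
    warnings = []
    if counts[0] != 1:
        warnings.append(
            f"expected exactly 1 partition in status 0, found {counts[0]}"
        )
    if counts[1] > 0:
        warnings.append(
            f"{counts[1]} partition(s) in status 1 may never groom"
        )
    return warnings


# Hypothetical sample: one open partition, two stuck, the rest closed.
for line in check_partition_statuses([2, 2, 1, 2, 1, 0]):
    print(line)
```

Treat a warning as a prompt to go look at the table rather than proof of a problem, and avoid running the check while the nightly maintenance is still working on the previous day's partition.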


UPDATE 6/3/08
Even in SP1, people have occasionally reported issues where some partitions are left in a status of "1" and never groom.  If left unchecked, this can eventually fill the database or the database volume.  Microsoft has released an internal hotfix that you can request from PSS if you feel you are impacted by this.  I have seen this happen in many large environments.  Request hotfix/KB 949969.


Ok - enough on grooming.  On to bigger and better things.

Audit Collector

The audit collector really does all the work in ACS.  It keeps track of the forwarders (agents), maintains the queue, filters the data, and then writes to the ACS DB.
Let's first talk about the basics.  The audit collector runs a service, "Adtserver", which runs Adtserver.exe from the %systemroot%\system32\security\Adtserver directory.
Speaking of that directory, there is a lot of cool stuff in there!  Also present are the .SQL files that are called during maintenance.  The primary ones to look at are DbCreatepartition.sql, DbClosepartition.sql, and DbDeletePartition.sql.  They are pretty self-explanatory: they create new partition tables, close and reindex the previous day's table, and delete old tables that are ready to be groomed out.  The audit collector runs these against the ACS database, and they should not be run manually.
Also present in this directory is a little gem of a file by the name of AcsConfig.XML.  This file has a list of ALL the audit forwarders ever known to the collector, along with their last contact time and the sequence number of the last event they sent to the collector.  You can copy this out, open it with Excel, and see all the data in a very readable format.  This data is kept in memory on the collector and is written out to the file every 5 minutes.
Probably the biggest problem I see in ACS environments is a lack of proper sizing.  The Perf and Scale guide has really good guidance here for ACS, and should be followed:  http://download.microsoft.com/download/d/3/6/d3633fa3-ce15-4071-be51-5e036a36f965/OM2007_PerfScal.doc
One of the best things you can do is to apply a proper filter on the collector.  By default, ACS will collect and store every single event in the security event logs from forwarders.  This is good and bad.  Good, because you are getting everything.  Bad, because "everything" doesn't help you.  A large number of the events logged in the security logs are not very useful, depending on how draconian your audit policy is.  You really want to collect just the security events that are needed to meet your audit and security compliance requirements.  A couple of good resources:
http://www.microsoft.com/technet/security/guidance/auditingandmonitoring/securitymonitoring/default.mspx
http://www.securevantage.com/Products/2007%20Solutions/Docs/ACS%20Guides/Secure%20Vantage%20ACS%20Noise%20Filter%20Guide.pdf
Here is a good, basic filter, to remove a lot of what most consider "not good info":
SELECT * FROM AdtsEvent WHERE NOT (((EventId=528 AND String01='5') OR (EventId=576 AND (String01='SeChangeNotifyPrivilege' OR HeaderDomain='NT Authority')) OR (EventId=538 OR EventId=566 OR EventId=672 OR EventId=680)))
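To make it easier to see which events this filter throws away, here is the same logic written as a plain Python predicate. It only illustrates the boolean structure, because the collector evaluates the query itself. The dictionary keys simply mirror the field names used in the query, and the exact, case-sensitive string comparison is an assumption about matching behavior.

```python
DROPPED_EVENT_IDS = {538, 566, 672, 680}


def is_collected(event):
    """Return True if the filter above would keep the event, False if it drops it."""
    event_id = event.get("EventId")
    string01 = event.get("String01")
    header_domain = event.get("HeaderDomain")

    dropped = (
        (event_id == 528 and string01 == "5")
        or (event_id == 576
            and (string01 == "SeChangeNotifyPrivilege"
                 or header_domain == "NT Authority"))
        or event_id in DROPPED_EVENT_IDS
    )
    return not dropped


# Hypothetical sample events, not real log data.
samples = [
    {"EventId": 528, "String01": "5"},   # dropped by the first clause
    {"EventId": 528, "String01": "2"},   # kept: same ID, different String01
    {"EventId": 538},                    # dropped by the ID list
    {"EventId": 529, "String01": "3"},   # kept: ID not mentioned in the filter
]
print([is_collected(e) for e in samples])  # [False, True, False, True]
```

Note that an event 528 is only dropped when String01 is '5'. Every other 528 still reaches the database.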
How do you apply a filter?  Well, I am glad you asked!  We will run adtadmin.  Here is a link to all the parameters:
http://technet.microsoft.com/en-us/library/bb309436.aspx
To examine the current filter, open a command prompt on the collector and run this command from the %systemroot%\system32\security\Adtserver directory:    adtadmin -getquery
That will show you what you are currently filtering.  The default is "select * from AdtsEvent", which means no filtering.  To use the filter posted above, run the following:
adtadmin /setquery /collector:"collectorname" /query:"SELECT * FROM AdtsEvent WHERE NOT (((EventId=528 AND String01='5') OR (EventId=576 AND (String01='SeChangeNotifyPrivilege' OR HeaderDomain='NT Authority')) OR (EventId=538 OR EventId=566 OR EventId=672 OR EventId=680)))"
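Because the query contains single quotes and is itself wrapped in double quotes, it is easy to break the command when editing the filter by hand. The sketch below assembles the same command line in Python, so you can keep the filter readable across several lines and still paste a single-line command. The helper name is hypothetical, and it only uses the /setquery, /collector, and /query switches shown above.

```python
def build_setquery_command(collector, query):
    """Build the adtadmin /setquery command line as a single string.

    Rejects double quotes in either argument, since both values are wrapped
    in double quotes on the command line.
    """
    for name, value in (("collector", collector), ("query", query)):
        if '"' in value:
            raise ValueError(f"{name} must not contain double quotes")
    # Collapse line breaks and runs of whitespace so the query fits on one line.
    # Caveat: this also collapses repeated spaces inside quoted literals.
    one_line_query = " ".join(query.split())
    return f'adtadmin /setquery /collector:"{collector}" /query:"{one_line_query}"'


FILTER = """
SELECT * FROM AdtsEvent WHERE NOT (((EventId=528 AND String01='5')
  OR (EventId=576 AND (String01='SeChangeNotifyPrivilege' OR HeaderDomain='NT Authority'))
  OR (EventId=538 OR EventId=566 OR EventId=672 OR EventId=680)))
"""

print(build_setquery_command("collectorname", FILTER))
```

After applying a new filter, it is worth running the getquery command again to confirm the collector stored what you intended.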