Sunday, October 10, 2010

Moving the Data Warehouse Database and Reporting server to new hardware

The time has come to move my Warehouse Database and OpsMgr Reporting Server role to a new server in my lab.  Today – both roles are installed on a single server (named OMDW).  This server is running Windows Server 2008 SP2 x86, and SQL 2008 SP1 DB engine and SQL Reporting (32bit to match the OS).  This machine is OLD, and only has 2GB of memory, so it is time to move it to a 64bit capable machine with 8GB of RAM.  The old server was really limited by the available memory, even for testing in a small lab.  As I do a lot of demo’s in this lab – I need reports to be a bit snappier.
The server it will be moving to is running Server 2008 R2 (64bit only) and SQL 2008 SP1 (x64).  Since Operations Manager 2007 R2 does not yet support SQL 2008R2 at the time of this writing – we will stick with the same SQL version.


We will be using the OpsMgr doco – from the Administrators Guide:
http://technet.microsoft.com/en-us/library/cc540402.aspx

So – I map out my plan. 
  1. I will move the warehouse database.
  2. I will test everything to ensure it is functional and working as hoped.
  3. I will move the OpsMgr Reporting role.
  4. I will test everything to ensure it is functional and working as hoped.

Move the Data Warehouse DB:
Using the TechNet documentation, I look at the high level plan:
  1. Stop Microsoft System Center Operations Manager 2007 services to prevent updates to the OperationsManagerDW database during the move.
  2. Back up the OperationsManagerDW database to preserve the data that Operations Manager has already collected from the management group.
  3. Uninstall the current Data Warehouse component, and delete the OperationsManagerDW database.
  4. Install the Reporting Data Warehouse component on the new Data Warehouse server.
  5. Restore the original OperationsManagerDW database.
  6. Configure Operations Manager to use the OperationsManagerDW database on the new Data Warehouse server.
  7. Restart Operations Manager services.

Sounds easy enough.  (gulp)

  • I start with step 1 – stopping all RMS and MS core services.
  • I then take a fresh backup of the DW DB and master.  This is probably one of the most painful steps – as on a large warehouse – this can be a LONG time to wait while my whole management group is down.
  • I then uninstall the DW component from the old server (OMDW) per the guide.
  • I then (gasp) delete the existing OperationsManagerDW database.
  • I install the DW component on the new server (SQLDW1).
  • I delete the newly created and empty OperationsManagerDW database from SQLDW1.
  • I then need to restore the backup I just recently took of the warehouse DB to my new server.  The guide doesn’t give any guidance on these procedures – this is a SQL operations and you would use standard SQL backup/restore procedures here – nothing OpsMgr specific.  I am not a SQL guy – but I figure this out fairly easily.
  • Next up is step 8 in the online guide – “On the new Data Warehouse server, use SQL Management Studio to create a login for the System Center Data Access Service account, the Data Warehouse Action Account, and the Data Reader Account.”  Now – that’s a little bogus documentation.  The first one is simple enough – that is the “SDK” account that we used when we installed OpsMgr.  The second one though – that isnt a real account.  When we installed Reporting – we were asked for two accounts – the "reader” and “write” accounts.  The above referenced Data Warehouse Action Account is really your “write” account.  If you aren't sure – then there is a Run-As profile for this that you can see what credentials you used.
  • I then map my logins I created to the appropriate rights they should have per the guide.  Actually – since I created the logins with the same names – mine were already mapped!
  • I start the Data Access (SDK) service ONLY on the RMS
  • I modify the reporting server data warehouse main datasource in reporting.
  • I edit the registry on the current Reporting server (OMDW) and have to create a new registry value for DWDBInstance per the guide – since it did not exist on my server yet.  I fill it in with “SQLDW1\I01” since that is my servername\instancename
  • I edit my table in the OpsDB to point to the new Warehouse DB servername\instance
  • I edit my table in the DWDB to point to the new Warehouse DB servername\instance
  • I start up all my services.
Now – I follow the guidance in the guide to check to make sure the move is a success.  Lots of issues can break this – missing a step, misconfiguring SQL rights, firewalls, etc.  When I checked mine – it was actually failing.  Reports would run – but lots of failed events on the RMS and management servers.  Turns out I accidentally missed a step – editing the DW DB table for the new name.  Once I put that in and bounced all the services again – all was well and working fine.

Now – on to moving the OpsMgr Reporting role!

Using the TechNet documentation, I look at the high level plan:
  1. Back up the OperationsManagerDW database.
  2. Note the accounts that are being used for the Data Warehouse Action Account and for the Data Warehouse Report Deployment Account. You will need to use the same accounts later, when you reinstall the Operations Manager reporting server.
  3. Uninstall the current Operations Manager reporting server component.
  4. Restore the original OperationsManagerDW database.
  5. If you are reinstalling the Operations Manager reporting server component on the original server, run the ResetSRS.exe tool to clean up and prepare the reporting server for the reinstallation.
  6. Reinstall the Operations Manager reporting server component.

Hey – even fewer steps than moving the database! 
***A special note – if you have authored/uploaded CUSTOM REPORTS that are not deployed/included within a management pack – these will be LOST when you follow these steps.  Make sure you export any custom reports to RDL file format FIRST, so you can bring those back into your new reporting server.

  • I back up my DataWarehouse database.  This step isn't just precautionary – it is REQUIRED.  When we uninstall the reporting server from the old server – it modifies the Warehouse DB in such a way that we cannot use – and must return it to the original state before we modified anything – in preparation for the new installation of OpsMgr Reporting on the new server.
  • Once I confirm a successful backup, I uninstall OpsMgr R2 Reporting from my old reporting server.
  • Now I restore my backup of the OperationsManagerDW database I just took prior to the uninstall of OpsMgr reporting.  My initial attempts at a restore failed – because the database was in use.  I needed to kill the connections to this database which were stuck from the RMS and MS servers.
  • I am installing OpsMgr reporting on a new server, so I can skip step 4.
  • In steps 5-10, I confirm that my SQL reporting server is configured and ready to roll.  Ideally – this should have already been done BEFORE we took down reporting in the environment.  This really is a bug in the guide – you should do this FIRST – BEFORE event starting down this road.  If something was broken, we don’t want to be fixing it while reporting is down for all our users.
  • In step 11, I kick of the Reporting server role install.  Another bug in the guide found:  they tell us to configure the DataWarehouse component to “this component will not be available”  That is incorrect.  That would ONLY be the case if we were moving the OpsMgr reporting server to a stand alone SRS?Reporting ONLY server.  In my case – I am moving reporting to a server that contains the DataWarehouse component – so this should be left alone.  I then chose my SQL server name\instance, and type in the DataWarehouse write and reader accounts.  SUCCESS!!!!
Now – I follow the guide and verify that reporting is working as designed.
Mine (of course) was failing – I got the following error when trying to run a report:

Date: 8/24/2010 5:49:27 PM
Application: System Center Operations Manager 2007 R2
Application Version: 6.1.7221.0
Severity: Error
Message: Loading reporting hierarchy failed.
System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 10.10.10.12:80
   at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
   at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Int32 timeout, Exception& exception)
   --- End of inner exception stack trace ---
   at System.Net.HttpWebRequest.GetRequestStream(TransportContext& context)
   at System.Net.HttpWebRequest.GetRequestStream()
   at System.Web.Services.Protocols.SoapHttpClientProtocol.Invoke(String methodName, Object[] parameters)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.Reporting.ReportingService.ReportingService2005.ListChildren(String Item, Boolean Recursive)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.Reporting.ManagementGroupReportFolder.GetSubfolders(Boolean includeHidden)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.Reporting.WunderBar.ReportingPage.LoadReportingSubtree(TreeNode node, ManagementGroupReportFolder folder)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.Reporting.WunderBar.ReportingPage.LoadReportingTree(ManagementGroupReportFolder folder)
   at Microsoft.EnterpriseManagement.Mom.Internal.UI.Reporting.WunderBar.ReportingPage.LoadReportingTreeJob(Object sender, ConsoleJobEventArgs args)
System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 10.10.10.12:80
   at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
   at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Int32 timeout, Exception& exception)

The key area of this is highlighted in yellow above.  I forgot to open a rule in my Windows Firewall on the reporting server to allow access to port 80 for web reporting.  DOH!
Now – over the next hour – I should see all my reports from all my MP’s trickle back into the reporting server and console.