Friday, October 15, 2010

MONITORING DMZ AND WORKGROUP COMPUTERS WITH SCOM 2007 R2 USING CERTIFICATES (ERRORS 21007 AND 21016 AFTER APPROVING THE AGENT IN PENDING MANAGEMENT)

A new guide to help you monitor servers in your DMZ or in a workgroup with System Center Operations Manager.
There are a few guides like this around the web, and I have used most of them.
But for the past 3 months I have been battling a scenario where the agent stayed in the "Not monitored" state after being approved in the Pending Management pane, and the agent logged events 21007 and 21016 in the Operations Manager event log on the workgroup / DMZ server I wanted to monitor.

If you have a working gateway, you approved the agent in Pending Management, you ran momcertimport with successful results, and you still receive event IDs like 21007 and 21016 on the workgroup / DMZ agent, this guide is for you.
My solution is described here.
First of all, and very basic (but not for me): I have a 2003 Enterprise CA server, so I used this guide to create my certificate template.
I followed that guide to the letter and still got those event IDs and no communication with my gateway.
Something was missing.
The first thing I noticed was that I had no option to save the certificate to the local computer certificate store. This is because of the Server 2008 enrollment pages, which would need administrator rights that Internet Explorer does not run with.
So in order to export the certificate to a file I had to use Internet Explorer.
Under Tools -> Internet Options -> Content there is a Certificates section.
Click the Certificates button and you can export your certificate from there.
Remember to export the private key. After clicking the Next button, do not tick "Include all certificates in the certification path if possible"; otherwise the momcertimport tool will not be able to import the certificate.

We will deal with the root CA certificate needed on the workgroup / DMZ server in a minute.


Then you can save your certificate to a .pfx file and copy it to the server you want to monitor.
Keep it in a shared folder for the duration of the install process, because you will need it on the gateway server as well as on the workgroup / DMZ server.

One more certificate is needed before we can continue, and again I used this guide:
http://technet.microsoft.com/en-us/library/bb735413.aspx (I used the section called "To download the Trusted Root (CA) certificate").

Note that you might not be able to reach your CA server's web site from the workgroup computer. In that case, do it from your root management server and save the file in the folder where you saved the certificate for the workgroup / DMZ server you want to monitor.

One last note before we begin: while most guides say the certificate subject (a.k.a. the Name field) is the FQDN, don't just append your domain name to the computer name.
Check first: log on to the workgroup / DMZ server, go to Start -> Computer -> Properties, check the full computer name, and copy that exact name to your gateway's hosts file if no DNS resolution is available.
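
If you prefer to script the name check, here is a minimal Python sketch (my own illustration, not part of the original procedure). It prints the name the machine reports through `socket.getfqdn()`, which you can compare with the full computer name shown in Properties, and it builds or looks up a hosts file line. The helper names `hosts_line` and `hosts_has_name`, the IP address and the host name are all hypothetical.

```python
import socket


def hosts_line(ip_address, full_name):
    """Build one hosts-file line: IP address, a tab, then the host name."""
    return f"{ip_address}\t{full_name.strip()}"


def hosts_has_name(hosts_text, full_name):
    """Return True if a non-comment line in hosts_text maps full_name."""
    wanted = full_name.strip().lower()
    for line in hosts_text.splitlines():
        # Drop anything after '#', then split into IP and host names.
        entry = line.split("#", 1)[0].split()
        if len(entry) >= 2 and wanted in (name.lower() for name in entry[1:]):
            return True
    return False


if __name__ == "__main__":
    # Name this machine reports; compare it with "Full computer name".
    print("Reported name:", socket.getfqdn())
    # Hypothetical values, for illustration only.
    print(hosts_line("192.0.2.10", "dmzserver01"))
```

The point is only to avoid retyping the name by hand: whatever name you put in the certificate request and in the hosts file should be the exact string the server reports.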

NOW FOR THE STEP BY STEP GUIDE

1.    PREPARING TO INSTALL THE AGENT ON THE WORKGROUP MACHINE

I recommend you copy these folders from your SCOM CD into one folder you can move around your environment; let's call it "scomdmz". Inside it you will need these folders:
* SupportTools
* agent
I recommend you copy these files to that same folder:
* server_cert.pfx (the certificate you created using a template for your workgroup / DMZ server)
* CA_certificate_chain.p7b (for the Trusted Root (CA) certificate)

Move these files to your workgroup machine (keep a copy of your server_cert.pfx to copy to your gateway server later).

2.     INSTALLING THE AGENT ON THE WORKGROUP MACHINE

Run the MSI installation on your server. If there is no DNS resolution for your gateway server, run ping -a against its IP address to see whether you get the gateway server's name. If not, you will need to add your gateway server's FQDN to your hosts file; it's in c:\windows\system32\drivers\etc


(we use the example in our org…)

I KNOW THIS IS VERY BASIC STUFF RIGHT HERE - I want this guide to apply even to those who don't deal with this on a daily basis.

Now, to prevent any typing mistakes:
write the gateway server's FQDN in the hosts file, then copy & paste it into the management server name field. I also recommend copying & pasting it into a command prompt and using telnet to connect to your gateway on port 5723 to check connectivity.
Click Next; you're almost home free…
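
If telnet is not installed on the server, the same port check can be done with a few lines of Python. This is a minimal sketch under my own assumptions: the function name `can_connect` and the gateway name `gateway.example.local` are hypothetical, and the check only proves that a TCP connection to port 5723 can be opened, not that the certificates are accepted.

```python
import socket


def can_connect(host, port=5723, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical gateway name; paste the exact FQDN from your hosts file.
    gateway = "gateway.example.local"
    print(gateway, "reachable on 5723:", can_connect(gateway))
```

A False result usually points at name resolution or a firewall between the DMZ and the gateway, which is worth fixing before you spend time on certificates.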

3.     IMPORTING THE CERTIFICATES TO YOUR GATEWAY AND SERVER

THIS WILL BE SPLIT INTO TWO PARTS

A.  IMPORTING THE CERTIFICATES ON THE DMZ SERVER YOU WANT TO MONITOR -

Using the momcertimport tool:
On the workgroup / DMZ server go to Start; on 2008 type cmd, on 2003 go to Run and type cmd.
One very important check: if you're on Server 2008, make sure your command prompt runs with administrator rights (if not, right-click the icon before you press Enter and run it as administrator).



The tool is in the SupportTools folder (the one we copied earlier, if you followed step one).
The way to run this tool is simple: get to it in the command prompt and give it the server certificate file, like so:
c:\dmzfolder\SupportTools\i386\momcertimport server_cert.pfx
Type the password for the key, and you should receive a success message.

IF YOU GOT THIS FAR - you stopped and started the health service as the momcertimport tool asks after importing the certificate, and you still receive those 21007 and 21016 events - you will need to follow these few steps.

 What you need now is another certificate to be imported.
    1.     Go to Start, run mmc, then File -> Add/Remove Snap-in…
    2.     Add Certificates, choose Computer account, click Next, choose Local computer, click OK and exit. That's all you need for the console.
    3.     Go to the Trusted Root Certification Authorities folder; on its Certificates folder, right-click and choose All Tasks -> Import…, then import the CA_certificate_chain.p7b we prepared in step 1 of this guide into the Trusted Root Certification Authorities folder. Most of the time the folder will already contain these certificates, but don't skip this stage.



B. IMPORTING THE CERTIFICATES ON TO YOUR GATEWAY SERVER - again, this is for all of you battling with event ID 21037 on your gateway (and of course any kind of lack of communication between the agent and your gateway server)

    1.     Go to Start, run mmc, then File -> Add/Remove Snap-in…
    2.     Add Certificates, choose Computer account, click Next, choose Local computer, click OK and exit. That's all you need for the console.
    3.     Go to the Trusted Root Certification Authorities folder and import the server_cert.pfx we talked about in step one into that folder.
    4.     Go to the Personal folder and import it into that folder as well.



Note: we are importing the certificate of the server we want to monitor into our gateway's Trusted Root Certification Authorities folder and into its Personal folder.



4.     CHECKING THE COMMUNICATION -

After all the certificates have been imported to the gateway server and to our soon-to-be-monitored server, we'll have to do the following steps for these changes to take effect:

Restart the health service (shown as System Center Management) on your gateway
Restart the health service (shown as System Center Management) on your root management server
Restart the health service (shown as System Center Management) on your DMZ server
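
Since you may end up restarting the service several times, here is a minimal Python sketch that wraps the standard Windows `net stop` / `net start` commands. It is my own illustration, not something the original procedure requires. I am assuming the short service name behind the "System Center Management" display name is `HealthService`; check it in services.msc before relying on it. The function names are hypothetical, and the script has to be run from an elevated prompt on each server.

```python
import subprocess

# Assumption: "HealthService" is the short service name behind the
# "System Center Management" display name; verify it in services.msc.
SERVICE_NAME = "HealthService"


def restart_commands(service=SERVICE_NAME):
    """Return the stop and start commands, in the order they must run."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_service(service=SERVICE_NAME):
    """Stop, then start the service; raises CalledProcessError on failure."""
    for command in restart_commands(service):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    restart_service()
```

Running it once on each of the three servers is equivalent to restarting the service by hand in the Services console.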


Check your DMZ server's event viewer to see if the event IDs repeat.
Some changes take time, so you might want to wait 5-10 minutes.
After 10 minutes, restart the health service again on your DMZ server and check your event viewer for the IDs. If you still receive them, restart the health service again on your root management server and your gateway server.