Creating a Performance Monitor Black Box

One thing that we always set up on a new server is a Performance Monitor “black box”. The black box is basically a PerfMon collector that starts when the server does and runs continuously in the background, gathering performance data from a number of vital counters. It has minimal performance overhead and gives you a good idea of what has been going on on your servers during the last several days.

What to collect?

This really depends on what you consider to be vital. Personally, I collect information about the disks (both logical and physical), memory, processor, processes and network. The things that I consider vital are latencies, throughput, IOPS, memory and CPU usage, as well as possible connectivity errors and the number of connections made. You might have noticed that there aren’t any SQL Server counters listed, and there’s a reason for that.

I like to keep the list of counters as compact as possible, because the more data you collect, the shorter the time period you can cover. Also, the counters mentioned above will, most of the time, tell you if there have been performance issues. And while they might not give you the exact cause (you’ll need more specific monitoring for that), they’ll give you a starting point, because you can see, for example, whether the issue is related to CPU or storage, and you can pin it to a process.

Setting it up

Setting it up is rather simple. First, create a folder for the results. If it’s a cluster, use a local disk on a node so it’s available all the time. I usually create one called “PerfMon” or something similar. Then create a text file that lists the counters you’re interested in. For example, if you want to monitor disk latencies, IOPS and queue lengths, you’ll add these:

\LogicalDisk(*)\Avg. Disk sec/Read
\LogicalDisk(*)\Avg. Disk sec/Write
\LogicalDisk(*)\Current Disk Queue Length
\LogicalDisk(*)\Disk Transfers/sec
\PhysicalDisk(*)\Avg. Disk sec/Read
\PhysicalDisk(*)\Avg. Disk sec/Write
\PhysicalDisk(*)\Current Disk Queue Length
\PhysicalDisk(*)\Disk Transfers/sec
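If you set this up on many servers, you may prefer to generate the file instead of typing it. Here is a minimal Python sketch of that idea, assuming Python is available on the machine. The helper names are hypothetical; the file name blackbox_counters.txt is the one passed to LOGMAN later in this post.

```python
from pathlib import Path

# Objects and counters taken from the list above; (*) collects every instance.
OBJECTS = ["LogicalDisk", "PhysicalDisk"]
COUNTERS = [
    "Avg. Disk sec/Read",
    "Avg. Disk sec/Write",
    "Current Disk Queue Length",
    "Disk Transfers/sec",
]


def build_counter_paths(objects, counters):
    """Return one PerfMon counter path per object/counter pair."""
    return [f"\\{obj}(*)\\{ctr}" for obj in objects for ctr in counters]


def write_counter_file(path, lines):
    """Write one counter path per line."""
    Path(path).write_text("\n".join(lines) + "\n", encoding="ascii")


if __name__ == "__main__":
    write_counter_file("blackbox_counters.txt",
                       build_counter_paths(OBJECTS, COUNTERS))
```

Adding another object or counter is then a one-line change to the lists at the top.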

If you’re not sure what counters are available, you can get that information very easily from Windows by running the following command in a command prompt:

TYPEPERF -Q >> all_counters.txt
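The resulting file can be long, so it may help to narrow it down to one performance object at a time before picking counters. The following Python sketch is one way to do that. It assumes each useful line of all_counters.txt is a counter path of the form \Object(instance)\Counter or \Object\Counter; counters_for_object is a hypothetical helper, not part of any Windows tool.

```python
def counters_for_object(lines, object_name):
    """Return the counter paths that belong to the given performance object."""
    wanted = object_name.lower()
    matches = []
    for line in lines:
        path = line.strip()
        if not path.startswith("\\"):
            continue  # skip blank lines and any trailing status text
        # The object name ends at the first "\" (counter) or "(" (instance).
        obj = path[1:].split("\\", 1)[0].split("(", 1)[0]
        if obj.lower() == wanted:
            matches.append(path)
    return matches


if __name__ == "__main__":
    sample = [
        "\\LogicalDisk(*)\\Avg. Disk sec/Read",
        "\\Memory\\Available MBytes",
        "\\PhysicalDisk(*)\\Disk Transfers/sec",
        "",
    ]
    print(counters_for_object(sample, "logicaldisk"))
```

To use it on the real dump, pass it the lines of the file, for example counters_for_object(open("all_counters.txt", errors="replace"), "Memory").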

The list of all the counters on my laptop is about 2300 rows; on servers you’ll find quite a few more than that. After you have them in a text file, it’s just a matter of copy-pasting the ones you want into your Black Box counter file.

Once you have the counter file set up, it’s time to create the collector. This is done by running the following command in an elevated command prompt.

LOGMAN CREATE COUNTER BlackBox -cf blackbox_counters.txt -si 01:00 -f bincirc -o "C:\PerfMon\BlackBox_%computername%" --v -max 500

The command above creates a collector called BlackBox using the counters from the text file. It takes a sample once every minute, and the results are saved to a circular binary file (so when the file reaches 500MB, it starts writing from the beginning again, overwriting the oldest data). Notice the double dash before the “v” parameter: this is needed to remove the versioning info from the file name. Without it, every time the collector starts it’ll create a new file with versioning information (such as a sequence number or datestamp) appended to the name, along the lines of BlackBox_%computername%_datestamp.BLG, thus potentially filling your hard drive with BLG files.
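How many days a 500MB file actually covers depends on how many counter instances you collect and how often you sample them. One rough way to estimate it is to measure how much data the collector writes over a few hours and extrapolate. The Python sketch below does that arithmetic. The function name and the example numbers are made up for illustration, and it assumes data is written at a steady rate.

```python
def coverage_days(max_size_mb, grown_mb, over_hours):
    """Estimate how many days of data fit in a circular log of max_size_mb.

    grown_mb / over_hours is a growth rate measured on your own server.
    """
    if grown_mb <= 0 or over_hours <= 0:
        raise ValueError("need a positive measured growth and duration")
    mb_per_day = grown_mb / over_hours * 24
    return max_size_mb / mb_per_day


if __name__ == "__main__":
    # Hypothetical measurement: 12 MB written in 4 hours.
    print(round(coverage_days(500, 12, 4), 1))
```

With those hypothetical numbers the file would hold roughly a week of data. Doubling the number of counter instances, or halving the sample interval, should roughly halve that.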

After you create the BlackBox it’ll be stopped by default. You can check this with the following command:

LOGMAN QUERY

This will display all the collectors and their status. At this point BlackBox should be listed with “Stopped” status.

To start the BlackBox you can simply run the following command:

LOGMAN START BlackBox

Running the query command again, you should see BlackBox with “Running” status. And that’s it, you’re all set.
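If you repeat this on every new server, the create and start steps can be scripted. The following is only a sketch in Python, not a tested deployment tool. It reuses the exact LOGMAN parameters from above; blackbox_commands is a hypothetical helper, and the script must still be run from an elevated prompt.

```python
import os
import subprocess


def blackbox_commands(counter_file, output_dir, computer_name,
                      interval="01:00", max_mb=500):
    """Return the LOGMAN create and start commands as argument lists."""
    output = f"{output_dir}\\BlackBox_{computer_name}"
    create = [
        "logman", "create", "counter", "BlackBox",
        "-cf", counter_file,
        "-si", interval,
        "-f", "bincirc",
        "-o", output,
        "--v",
        "-max", str(max_mb),
    ]
    start = ["logman", "start", "BlackBox"]
    return [create, start]


if __name__ == "__main__" and os.name == "nt":
    # %computername% is not expanded without a shell, so read it here.
    name = os.environ["COMPUTERNAME"]
    for cmd in blackbox_commands("blackbox_counters.txt", r"C:\PerfMon", name):
        subprocess.run(cmd, check=True)  # raises if LOGMAN reports an error
```

Passing the commands as argument lists avoids quoting problems with the output path, at the cost of having to expand the computer name yourself.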


Making sure that it’s started after reboot

Now that you’ve set up the BlackBox, there’s one final step you should take to make the most of it: making sure that the BlackBox also gets started after you or someone else decides to reboot the server. There’s a simple way to accomplish this with Windows Task Scheduler. Once you open the Task Scheduler, navigate to the folder Task Scheduler Library\Microsoft\Windows\PLA. PLA is short for Performance Logs and Alerts, and it is where PerfMon collectors, among other things, are saved.


Open the properties of the BlackBox task and go to the Triggers tab. There, choose New… and, when the New Trigger dialog opens, select At startup in the Begin the task: dropdown menu. Now every time the server is restarted, BlackBox will also get started.


After pressing OK, go to the Settings tab. I’ve noticed that on some servers there’s a tick in the box titled Stop the task if it runs longer than:, with a default value of 3 days. If it’s ticked, clear it, because you most likely don’t want your collector to stop after 3 days.


Setting this up only takes a couple of minutes, but those are minutes well spent when, for example, someone tells you on Monday that during the weekend something happened and everything was slow…

Author: Mika Sutinen

Hi, my name is Mika Sutinen and I'm a Senior Database Administrator for a company called Tieto. I've been working in the IT industry for two decades and I've spent most of my career working with healthcare information systems. I've worked with SQL Server for most of my career, starting with version 6.5 a long, long time ago. My other interests are high availability, everything related to performance (testing, monitoring, etc.) and Windows operating systems, and I'm currently learning more about Azure.
