Unit 12 Lab - Baselines & Benchmarks
Info
If you are unable to finish the lab in the ProLUG lab environment, we ask that you reboot the machine from the command line so that other students will have the intended environment.
Resources / Important Links
Required Materials
- Rocky 9.4+ - ProLUG Lab
- Or comparable Linux box
- root or sudo command access
Downloads
The lab has been provided for convenience below:
Pre-Lab Warm-Up
- Create a working directory
- Verify that iostat is available. If it's not there, install it.
- Verify that stress is available. If it's not there, install it.
- Verify that iperf3 is available. If it's not there, install it.
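A minimal sketch of these warm-up checks, assuming a Rocky/dnf system where iostat ships in the sysstat package and stress comes from the EPEL repository:

```bash
# Create and enter a working directory for lab output (the name is arbitrary)
mkdir -p ~/lab12 && cd ~/lab12

# Verify each tool; install it with dnf if the check fails
which iostat || sudo dnf install -y sysstat
which iperf3 || sudo dnf install -y iperf3
which stress || { sudo dnf install -y epel-release; sudo dnf install -y stress; }
```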
Lab 🧪
Baseline Information Gathering
The purpose of a baseline is not to find fault, load, or to take corrective action. A baseline simply determines what is. You must know what is so that you can test against that when you make a change to be able to objectively say there was or wasn't an improvement. You must know where you are at to be able to properly plan where you are going. A poor baseline assessment, because of inflated numbers or inaccurate testing, does a disservice to the rest of your project. You must accurately draw the first line and understand your system's performance.
Using SAR (CPU and memory statistics)
Some useful sar tracking commands, reported in 10-minute increments.
For your later labs, you need to collect sar data in real time to compare with the
baseline data.
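For example, the following invocations pull CPU, memory, and load-queue history from the sysstat collector (which records in 10-minute increments by default), plus a real-time sample. These particular flags are standard sysstat options, not prescribed by the lab:

```bash
# Historical data from today's sysstat log (10-minute increments)
sar -u      # CPU utilization
sar -r      # memory utilization
sar -q      # run queue and load averages

# Real-time sample: every 2 seconds, 10 times (useful during later load tests)
sar -u 2 10
```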
Using IOSTAT (CPU and device statistics)
iostat will give you either processor or device statistics for your system.
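A couple of illustrative invocations (the flags are standard iostat options; the intervals are arbitrary choices):

```bash
# Processor statistics only
iostat -c

# Extended device (disk) statistics: 2-second interval, 10 samples
iostat -dx 2 10
```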
Using iperf3 (network speed testing)
In the ProLUG lab, red1 is the iperf3 server, so we can bounce connections off it (192.168.200.101).
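A basic client test against red1 might look like the following; the -R reverse-direction run is optional:

```bash
# Measure throughput from this host to the lab iperf3 server
iperf3 -c 192.168.200.101

# Reverse the direction so the server sends to this host
iperf3 -c 192.168.200.101 -R
```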
Using STRESS to generate load
stress will produce extra load on a system. It can generate load against the processor, RAM, and disk I/O.
Read the usage output and try to figure out what each option does.
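One possible run, purely as an illustration; the worker counts and timeout are assumptions you should size against your VM:

```bash
# Show the usage output and review what each option does
stress --help

# Example load: 2 CPU workers, 1 memory worker, 1 I/O worker, for 60 seconds
stress --cpu 2 --vm 1 --io 1 --timeout 60s
```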
Developing a Test Plan
The company has decided we are going to add a new agent to all machines. Management has given this directive to you because of PCI compliance standards, with no regard for what it may do to the system. You want to validate whether there are any problems and be able to express your concerns as an engineer, if there are actual issues. No one cares what you think; they care what you can show or prove.
Determine the right question to ask
- Do we have a system baseline to compare against?
  - No? Make a baseline.
- Can we say that this system is not under heavy load?
- What does a system under no load look like performing tasks in our environment?
  - Assuming our systems are not running under load, capture SAR and baseline stats.
- Perform some basic tasks and get their completion times (see the timing sketches after this list):
  - Writing/deleting 3000 empty files (modify the count as needed for your system)
  - Testing processor speed
  - Alternate processor speed test: this takes random numbers in blocks, zips them, and then throws them away. Tune it to run for about 10 seconds as needed.
- What is the difference between systems under load with and without the agent?
  - Run a load test (with stress) of what the agent is going to do against the system.
  - While the load test is running, perform the same tasks and see if they perform differently.
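Hedged sketches of the timing tasks above; the file count, bc scale, and dd block count are assumptions you should tune to your system:

```bash
# Write and then delete 3000 empty files, timing each step
time touch testfile_{1..3000}
time rm -f testfile_{1..3000}

# Processor speed test: compute pi to a few thousand digits with bc (CPU-bound)
time echo "scale=3000; 4*a(1)" | bc -l > /dev/null

# Alternate processor test: gzip blocks of random numbers and throw the output away
# Tune count= until the run lasts roughly 10 seconds on your hardware
time dd if=/dev/urandom bs=1M count=512 2>/dev/null | gzip > /dev/null
```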
Execute the plan and gather data
Edit these as you see fit, add columns or rows to increase understanding of system performance. This is your chance to test and record these things.
System Baseline Tests
| Metric | Server 1 |
|---|---|
| SAR average load (past week) | |
| IOSTAT test (10 min) | |
| IOSTAT test (2s x 10 samples) | |
| Disk write - small files | |
| Disk write - small files (retry) | |
| Disk write - large files | |
| Processor benchmark | |
You may baseline more than once; more data is rarely bad.
Make 3 different assumptions for how load may look on your system with the agent and design your stress commands around them (example assumptions below; sample invocations follow the list):
- I assume no load on hdd, light load on processors
- I assume low load on hdd, light load on processors
- I just assume everything is high load and it's a mess
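Possible stress invocations matching those three assumptions; the worker counts, memory size, and timeouts are assumptions to size against your machine:

```bash
# No hdd load, light processor load
stress --cpu 2 --timeout 600s

# Low hdd load, light processor load
stress --cpu 2 --hdd 1 --timeout 600s

# Everything high: CPU, memory, I/O, and disk workers all at once
stress --cpu 4 --vm 2 --vm-bytes 256M --io 2 --hdd 2 --timeout 600s
```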
In one window, start your load tests (YOU MUST REMEMBER TO STOP THESE AFTER YOU GATHER YOUR DATA). In another window, gather your data again exactly as you did for your baseline, with sar and iostat, just for the time of the test.
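For instance, while stress runs in the other window you might capture a bounded sample like this (the intervals and output filenames are arbitrary):

```bash
# Sample CPU and disk stats only for the duration of the load test
sar -u 2 30     | tee sar_under_load.txt
iostat -dx 2 30 | tee iostat_under_load.txt
```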
System Tests while under significant load
Put command you're using for load here:
| Metric | Server 1 |
|---|---|
| SAR average load (during test) | |
| IOSTAT test (10 min) | |
| IOSTAT test (2s x 10 samples) | |
| Disk write - small files | |
| Disk write - small files (retry) | |
| Disk write - large files | |
| Processor benchmark | |
System Tests while under significant load
Put command you're using for load here:
| Metric | Server 1 |
|---|---|
| SAR average load (during test) | |
| IOSTAT test (10 min) | |
| IOSTAT test (2s x 10 samples) | |
| Disk write - small files | |
| Disk write - small files (retry) | |
| Disk write - large files | |
| Processor benchmark | |
Continue copying and pasting tables as needed.
Reflection Questions (optional)
- How did the system perform under load compared to your baseline?
- What would you report to your management team regarding the new agent’s impact?
- How would you adjust your test plan to capture additional performance metrics?
Info
Be sure to reboot the lab machine from the command line when you are done.