16 Linux server monitoring commands you really need to know

If you want to know what’s going on with your Linux server, start with these Linux monitoring commands.

Want to know what’s really going on with your server? Then you need to know these essential commands. Once you’ve mastered them, you’ll be well on your way to being an expert Linux system administrator.

Sure, you can use a GUI program to pull up much of the information that these shell commands can give you, depending on the Linux distribution. SUSE Linux Enterprise Server and openSUSE, for example, have an excellent graphical configuration and management tool, YaST. And there are universal tools, such as Webmin and cPanel, which can be used on any Linux server.

However, it’s a Linux administrator truism that you should run a GUI on a server only when you absolutely must. That’s because Linux GUIs take up system resources that could be better used elsewhere. So, while using a GUI program is fine for basic server health checkups, if you want to know what’s really happening, turn off the GUI and use these tools from the Linux command shell.

This also means you should start a GUI on a server only when it’s required; don’t leave it running. For optimum performance, a Linux server should run at runlevel 3, which fully supports networking and multiple users but doesn’t start the GUI when the machine boots. If you really need a graphical desktop, you can always get one by running startx from a shell prompt.

If your server starts by booting into a graphical desktop, you need to change this. To do so, head to a terminal window, su to the root user, and use your favorite editor on /etc/inittab.

Once there, find the initdefault line and change it from:

id:5:initdefault:

to

id:3:initdefault:

If there is no inittab file, create it and add the id:3 line. Save and exit. The next time you boot into your server, it will boot into runlevel 3. If you don’t want to reboot after this change, you can also set your server’s run level immediately with the command:

init 3
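The inittab approach applies to servers that use SysV-style init. Many current distributions use systemd instead, which ignores /etc/inittab; there, the rough equivalent of runlevel 3 is multi-user.target. The sketch below is one way to print the right commands for either case. The function name text_mode_commands is made up for illustration, and the /run/systemd/system check assumes a standard systemd setup.

```shell
# Hypothetical helper: print the commands that switch a server to text mode,
# depending on whether systemd or a SysV-style init is in charge.
text_mode_commands() {
    if [ -d /run/systemd/system ]; then
        # systemd: multi-user.target is the rough equivalent of runlevel 3
        echo "systemctl set-default multi-user.target   # persistent, like editing inittab"
        echo "systemctl isolate multi-user.target       # immediate, like 'init 3'"
    else
        echo "edit /etc/inittab so it contains id:3:initdefault:"
        echo "init 3"
    fi
}

text_mode_commands
```

Either way, the commands it prints need to be run as root.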

Once your server is running at init 3, you can start using the following shell programs to see what’s happening inside your server.

iostat

The iostat command shows in detail what your storage subsystem is up to. You usually use iostat to monitor how well your storage subsystems are working in general and to spot slow I/O problems before your clients notice that the server is running slowly. Trust me, you want to spot these problems before your users do!
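As a rough sketch of how you might use iostat in practice, the following wrapper asks for extended per-device statistics at a fixed interval. The name disk_report is hypothetical, and the sketch assumes iostat from the sysstat package is installed.

```shell
# Hypothetical wrapper: extended (-x) per-device (-d) statistics, sampled
# INTERVAL seconds apart, COUNT times. Defaults: 5 seconds, 3 reports.
disk_report() {
    interval=${1:-5}
    count=${2:-3}
    iostat -dx "$interval" "$count"
}
```

Running disk_report 5 3 prints three reports five seconds apart. The first report shows averages since boot; the later ones show current activity. Consistently high values in the %util column and the await columns are the usual signs of a struggling device.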

meminfo and free

Meminfo is not a command but a file, /proc/meminfo, through which the kernel gives you a detailed list of what’s going on in memory. You read meminfo’s data with another program such as cat or grep. For example,

cat /proc/meminfo

gives you the details of what’s going on in your server’s memory at any given moment.

For a quick “just the facts” look at memory, you can use the free command. In short, free gives you the overview; meminfo gives you the details.
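To make the overview-versus-details split concrete, here is a minimal sketch. The helper names mem_available_pct and mem_overview are hypothetical, and the sketch assumes a kernel recent enough to report the MemAvailable field and a version of free that supports -h.

```shell
# Hypothetical helper: percentage of RAM the kernel estimates is available
# to new workloads, read from a meminfo-format file (default /proc/meminfo).
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}

# Hypothetical helper: the headline fields from meminfo, then free's summary.
mem_overview() {
    grep -E '^(MemTotal|MemAvailable|SwapTotal|SwapFree):' /proc/meminfo
    free -h    # -h prints human-readable units
}
```

Calling mem_overview shows both views side by side; mem_available_pct boils the details down to a single number you can check from a script.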

mpstat

The mpstat command reports on the activities of each of the available CPUs on a multi-processor server. These days, thanks to multi-core processors, that’s almost all servers. Mpstat also reports on the average activities of all your server’s CPUs. It enables you to display overall CPU statistics per system or per processor. This overview can alert you to possible application problems before they get to the point of annoying users.
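A minimal sketch of a per-processor report follows. The wrapper name cpu_report is hypothetical, and mpstat is assumed to come from the sysstat package.

```shell
# Hypothetical wrapper: utilization for every CPU (-P ALL), sampled INTERVAL
# seconds apart, COUNT times. mpstat ends with an average for each CPU.
cpu_report() {
    mpstat -P ALL "${1:-5}" "${2:-3}"
}
```

With cpu_report 5 3, one CPU pinned near 100 percent while the others sit idle often points to a single-threaded application hogging a core.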

netstat

Netstat, like ps, is a Linux tool that administrators use every day. It displays a wide range of network-related information, including socket usage, routing tables, and interface and protocol statistics. Some of the most commonly used options are:

==================

-a Show all socket information
-r Show routing information
-i Show network interface statistics
-s Show network protocol statistics
==================
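The options listed above can be combined with -n, which prints numeric addresses instead of resolving names. The sketch below is only illustrative: net_report and count_listening are hypothetical names, and netstat is assumed to be installed (on many current distributions it comes from the net-tools package, which is not always present by default).

```shell
# Hypothetical wrapper around the options listed above.
net_report() {
    netstat -rn      # routing table, numeric addresses
    netstat -i       # per-interface packet and error counters
    netstat -s       # per-protocol statistics
    netstat -an      # every socket, listening and connected, numeric
}

# Hypothetical helper: count TCP sockets in the LISTEN state from
# `netstat -an`-style input on stdin (the state is the last column).
count_listening() {
    awk '$NF == "LISTEN" { n++ } END { print n + 0 }'
}
```

For example, netstat -an | count_listening gives a quick count of listening TCP sockets, which is handy for spotting a service that has unexpectedly appeared or gone away.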

nmon

Nmon, short for Nigel’s Monitor, is a popular open-source tool to monitor Linux system performance. Nmon watches the performance information for several subsystems, such as processor utilization, memory utilization, run queue information, disk I/O statistics, network I/O statistics, paging activity, and process metrics. You can then view nmon’s real-time system measurements via its curses “graphical” interface.

To run nmon, you start the tool from the shell. Once it’s up, you select the subsystems to monitor by typing their one-key commands. For example, to get CPU, memory, and disk statistics, you type c, m, and d. You can also use nmon with the -f flag to save performance statistics to a CSV file for later analysis.
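Here is a minimal sketch of the recording mode. The wrapper name nmon_record is hypothetical; -s sets the seconds between snapshots and -c sets how many snapshots to take.

```shell
# Hypothetical wrapper: record snapshots to a .nmon CSV file in the current
# directory. Defaults: one snapshot every 30 seconds, 120 snapshots.
nmon_record() {
    nmon -f -s "${1:-30}" -c "${2:-120}"
}
```

With the defaults, nmon_record covers one hour (30 seconds times 120 snapshots) and hands the prompt back while it collects data in the background.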

For day-to-day server monitoring, I find nmon to be the single most useful program in my Linux system management toolkit.

pmap

The pmap command (https://linux.die.net/man/1/pmap) reports the amount of memory that your server’s processes are using. You can use this tool to determine which processes on the server are being allocated memory and whether any of these processes are being piggy with RAM.
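As a small illustration, pmap takes one or more process IDs, and its -x option adds extended columns. The helper name proc_mem_total below is hypothetical; it keeps only the summary line that pmap -x prints last.

```shell
# Hypothetical helper: print the totals line of `pmap -x` for a PID
# (default: the current shell).
proc_mem_total() {
    pmap -x "${1:-$$}" | tail -n 1
}
```

Run plain pmap -x with a PID when you want the full map, one line per memory region, to see where a piggy process is keeping its memory.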

ps and pstree

The ps and pstree commands are two of the Linux administrator’s best friends. They both provide a list of all currently running processes. The ps command tells you how much memory and processor time the server’s programs are using; pstree shows less information but highlights which processes are the children of other processes. Armed with this information, you can spot out-of-control processes and kill them off with Linux’s “take no prisoners” kill command.
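The workflow described here, find the hog and then kill it, can be sketched roughly as follows. Both function names are hypothetical, and top_cpu assumes the procps version of ps, which supports --sort.

```shell
# Hypothetical helper: the N processes using the most CPU (default 5),
# plus the header line.
top_cpu() {
    ps -eo pid,ppid,%cpu,%mem,comm --sort=-%cpu | head -n "$(( ${1:-5} + 1 ))"
}

# Hypothetical helper: ask a process to exit, wait, then force it if needed.
stop_process() {
    kill -TERM "$1" 2>/dev/null || return 0   # already gone
    sleep "${2:-5}"
    # kill -0 sends no signal; it only tests whether the process still exists
    if kill -0 "$1" 2>/dev/null; then
        kill -KILL "$1"
    fi
}
```

Sending TERM first gives a program the chance to clean up after itself; KILL is the “take no prisoners” option. To see where a suspect process sits in the family tree before you stop it, pstree -p shows the tree with process IDs.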

sar

The sar program is the Swiss Army knife of system monitoring tools. The sar command is actually made up of three separate programs: sar, which displays the data, and sa1 and sa2, which collect and store it. Once installed, sar creates a detailed overview of CPU utilization, memory paging, network I/O and transfer statistics, process creation activity, and storage device activity. The big difference between sar and nmon is that the former is better at long-term system monitoring; I find nmon to be better at giving me a quick read on my server’s status.
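A brief sketch of both uses, live sampling and looking back at stored data, might look like this. The function names are hypothetical, and the location of the daily data files is an assumption that varies by distribution (/var/log/sa and /var/log/sysstat are common).

```shell
# Hypothetical wrapper: CPU utilization sampled right now.
sar_live() {
    sar -u "${1:-5}" "${2:-3}"
}

# Hypothetical wrapper: reports from a daily data file collected by sa1.
# $1 is the path to that file; its directory depends on your distribution.
sar_history() {
    sar -r -f "$1"         # memory utilization over that day
    sar -n DEV -f "$1"     # per-interface network traffic over that day
}
```

Being able to ask what the server was doing at 3 a.m. yesterday is what makes sar the better choice for long-term monitoring.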

strace

Strace is often thought of as a programmer’s debugging tool, but it’s more than that. It intercepts and records the system calls that are called by a process. This makes it a useful diagnostic, instructional, and debugging tool. For example, you can use strace to find out which configuration file a program is actually using when it starts up.

Strace does have one flaw, though. When it’s checking out a specific process, that process’ performance falls through the floor. Thus, I use strace only when I already have a darned good reason to think that that program is causing trouble.
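The configuration-file trick can be sketched as follows. The function names are hypothetical, and the sketch assumes that strace is installed, that you are allowed to trace the program, and that the program opens files through the openat system call, as programs built against current glibc generally do.

```shell
# Hypothetical helper: run a command under strace and list the files it
# opened successfully.
files_opened() {
    log=$(mktemp) || return 1
    # -f follows child processes; -o keeps the trace out of the program's output
    strace -f -e trace=openat -o "$log" "$@" >/dev/null 2>&1
    # Failed calls return -1 (for example ENOENT); drop those lines,
    # then keep only the quoted path from each remaining line
    grep -v ' = -1 ' "$log" | grep -o '"[^"]*"' | tr -d '"' | sort -u
    rm -f "$log"
}

# Hypothetical helper: count system calls made by a running PID.
# Press Ctrl-C to detach and print the summary table.
syscall_summary() {
    strace -c -p "$1"
}
```

For example, files_opened followed by a program name, piped through grep conf, narrows the list down to likely configuration files. Because of the performance hit, attach syscall_summary to a production process only briefly.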

tcpdump

Tcpdump is a simple, robust network monitoring utility. Its basic protocol analyzing capability enables you to get a rough view of what is happening on your network. To really dig into what’s going on with your network, however, you want to use Wireshark (see below).
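Here is a minimal sketch of two common tcpdump invocations. The function names, the eth0 interface, the port number, and the capture.pcap file name are all placeholders; capturing normally requires root.

```shell
# Hypothetical wrapper: print a one-line summary of each matching packet.
# -nn skips name and port lookups; -c stops after COUNT packets.
capture_summary() {
    tcpdump -i "${1:-eth0}" -nn -c "${2:-100}" "tcp port ${3:-80}"
}

# Hypothetical wrapper: save raw packets to a pcap file for later analysis.
capture_to_file() {
    tcpdump -i "${1:-eth0}" -c "${2:-1000}" -w "${3:-capture.pcap}"
}
```

The file written by capture_to_file can be opened later in a full protocol analyzer when the rough view isn’t enough.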

top

The top command shows what’s going on with your active processes. By default, it displays the most CPU-intensive tasks running on the server and refreshes the list every few seconds (three, on current versions). You can sort the processes by PID (process ID), by age (newest first), by cumulative CPU time since startup, or by resident memory usage. I find this a fast and easy way to see if any process is starting to lurch out of control and into trouble.
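Top is usually run interactively, but it also has a batch mode that is useful for logs and scripts. The sketch below assumes the procps-ng version of top; the name top_snapshot is hypothetical.

```shell
# Hypothetical helper: one non-interactive snapshot (-b batch mode, -n 1
# iteration), trimmed to the first LINES lines (default 15).
top_snapshot() {
    top -b -n 1 | head -n "${1:-15}"
}
```

In the interactive display, pressing P sorts by CPU usage, M by memory, T by CPU time, and N by PID, and starting top with -d followed by a number of seconds changes the refresh interval.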

uptime

Use uptime to see how long the server has been running and how many users are logged on. It also gives you the average server load over the last 1, 5, and 15 minutes. On a single-CPU system, the optimal value of the load is 1 or less, which means that each process has immediate access to the CPU and there are no CPU cycles lost; on a multi-core server, the equivalent threshold is the number of cores.
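Because the meaning of a load figure depends on how many cores the server has, it can help to divide one by the other. This is a minimal sketch: load_per_core is a hypothetical helper, and the live check assumes /proc/loadavg and the nproc command from GNU coreutils are available.

```shell
# Hypothetical helper: divide a load average by the number of CPU cores.
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# The first field of /proc/loadavg is the 1-minute load average.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r load1 rest < /proc/loadavg
    echo "1-minute load per core: $(load_per_core "$load1" "$(nproc)")"
fi
```

A result that stays above 1.00 suggests that processes are waiting their turn for a CPU.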

vmstat

For the most part, you use vmstat to monitor what’s going on with virtual memory. Linux constantly uses virtual memory to get the best possible storage performance.

If your applications are taking up too much memory, you get excessive page-outs: memory pages moving from RAM to your system’s swap space, which is on the hard drive. Your server can reach a point where it’s spending more time managing memory paging than running your applications, a condition called thrashing. When your computer is thrashing, its performance falls through the floor. Vmstat, which can display either averages since boot or samples taken at an interval, can help you spot memory pig programs and processes before they bring your server to a crawl.
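One rough way to watch for the start of thrashing is to look at the si (swap-in) and so (swap-out) columns of vmstat’s output. The sketch below assumes the default procps column layout, where those are the seventh and eighth columns; the name swap_activity is hypothetical.

```shell
# Hypothetical helper: read vmstat output on stdin and report how many
# samples show swap traffic. Header lines are skipped because their first
# field is not a number.
swap_activity() {
    awk '$1 ~ /^[0-9]+$/ { samples++; if ($7 + $8 > 0) busy++ }
         END { printf "%d of %d samples show swapping\n", busy, samples }'
}
```

A typical use is vmstat 5 12 | swap_activity, which samples every five seconds for a minute. Keep in mind that the first line vmstat prints is an average since boot, not a live sample.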

Wireshark

Wireshark, formerly known as Ethereal (and still often referred to that way), is tcpdump’s big brother: a more sophisticated tool with far more advanced protocol analysis and reporting. Wireshark has both a GUI and a shell interface. If you do any serious network administration, you must use Wireshark. And, if you’re using Wireshark, I highly recommend Chris Sanders’ Practical Packet Analysis, a great book on how to get the most out of this useful program.
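On a server without a GUI, Wireshark’s shell interface is the tshark program. As a minimal sketch, the following reads a saved capture and shows only the packets that match a display filter. The function name filter_capture and the default file name and filter are placeholders.

```shell
# Hypothetical wrapper: -r reads a capture file, -Y applies a Wireshark
# display filter to it.
filter_capture() {
    tshark -r "${1:-capture.pcap}" -Y "${2:-http.request}"
}
```

This pairs well with a capture file saved earlier on the server: capture with a lightweight tool, then analyze with Wireshark’s protocol dissectors.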

This is only a 10,000-foot overview of some of Linux’s most valuable system monitoring programs. Still, by mastering these programs you’ll have taken a first step toward Linux system administrator proficiency.