{"id":10,"date":"2007-12-24T04:02:29","date_gmt":"2007-12-24T08:02:29","guid":{"rendered":"http:\/\/www.microworkshop.com\/WordPress\/?p=10"},"modified":"2012-12-01T19:44:10","modified_gmt":"2012-12-02T00:44:10","slug":"linux-unix-monitoring-the-operating-system-memory-cpu-hard-drive-performance-and-other-resources","status":"publish","type":"post","link":"https:\/\/microdevsys.com\/wp\/linux-unix-monitoring-the-operating-system-memory-cpu-hard-drive-performance-and-other-resources\/","title":{"rendered":"Linux \/ UNIX: Monitoring the operating system memory, cpu, hard drive, performance and other resources."},"content":{"rendered":"<p>\n\tOne of the more powerful features of any Linux\/Unix system is the monitoring capability of the OS. <strong>UNIX<\/strong> systems have been around for decades, and in that time the tools available to users and admins alike on both systems have grown tremendously. In fact, the log files, tools and applications under <strong>UNIX \/ Linux<\/strong> allow you to drill down into the heart of the operating system and diagnose a problem, and with the right know-how, or if the problem isn&#39;t really complicated, even fix the issue yourself.&nbsp; Further, the open source community has grown tremendously in recent years, so finding support online for errors you run into is no longer as difficult as it once was.&nbsp; And support is no longer esoteric to developers either. 
&nbsp;\ud83d\ude42\n<\/p>\n<p>\n\tIn this case we will look at all the associated Linux commands that can help you identify, debug and possibly even fix potential problems:\n<\/p>\n<p>\n\t<!--more--><br \/>\n\tHere is a list of some of the more useful tools and a brief introduction to using them:\n<\/p>\n<ol>\n<li>\n\t\tLog files under <strong>&lsquo;\/var\/log&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tNetwork monitoring <strong>&lsquo;netstat&rsquo;<\/strong> command.\n\t<\/li>\n<li>\n\t\tProcess table lookup using <strong>&lsquo;ps&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tRealtime resource monitoring using <strong>&lsquo;top&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tSimple system health checker using <strong>&lsquo;uptime&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tDisk information and settings using <strong>&lsquo;hdparm&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tDisk temperature checking using <strong>&lsquo;hddtemp&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tDisk SMART monitoring using <strong>&rsquo;smartctl&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tMemory checking using <strong>&lsquo;free&rsquo;<\/strong> &amp; <strong>&lsquo;cat \/proc\/meminfo&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tView process tree using <strong>&lsquo;pstree&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tUNIX \/ Linux concept of <em>Load<\/em>.\n\t<\/li>\n<li>\n\t\tSystem \/ boot messages using <strong>&lsquo;dmesg&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tSystem Activity Reporting using <strong>&rsquo;sar&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tLinux \/ UNIX NIC (Network Card) statistics using <strong>&lsquo;ifconfig&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tProcessor statistics using <strong>&lsquo;mpstat&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tDisk statistics using <strong>&lsquo;iostat&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tMemory statistics using <strong>&lsquo;vmstat&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tList Open Files using <strong>&lsquo;lsof&rsquo;<\/strong>\n\t<\/li>\n<li>\n\t\tLinux \/ UNIX Program <strong>Core<\/strong> (Dump) 
files.\n\t<\/li>\n<li>\n\t\t<strong>EXAMPLE!<\/strong>\n\t<\/li>\n<\/ol>\n<p>\n\tGenerally <strong>&lsquo;\/var\/log\/&rsquo;<\/strong> is where Linux\/UNIX systems store log files. This is especially true for older applications, particularly those written by Linux\/UNIX core developers. In recent years, however, many developers, especially those coming from the Windows development world, have been placing log files and folders under the application&rsquo;s own path. (<span style=\"color: rgb(51, 153, 102);\">NOTE<\/span>: Unix systems compartmentalize their files by type rather than by application as Windows does. In Windows, applications are typically installed in this manner: <strong>&lsquo;C:\\Program Files\\Application\\All files for application go here&rsquo;<\/strong>. Under UNIX you have <strong>&lsquo;\/etc\/&rsquo;<\/strong> for the config files of the application, <strong>&lsquo;\/var\/log&rsquo;<\/strong> for the log files the application generates when running, and <strong>&lsquo;\/bin\/&rsquo;<\/strong> and <strong>&lsquo;\/usr\/bin&rsquo;<\/strong> where the executable files of the application reside, for all applications.) So on your Linux box, most log files will be under <strong>&lsquo;\/var\/log&rsquo;<\/strong>, including the most important system log files:\n<\/p>\n<ol>\n<li>\n\t\t<strong>\/var\/log\/messages<\/strong> &#8211; Main Linux \/ Unix system log file. Holds the bulk of the system log information.\n\t<\/li>\n<li>\n\t\t<strong>\/var\/log\/secure<\/strong> &#8211; Main security log file. Holds authentication and authorization messages, such as login attempts and &lsquo;sshd&rsquo; or &lsquo;su&rsquo; activity.\n\t<\/li>\n<li>\n\t\t<strong>\/var\/log\/dmesg<\/strong> &#8211; BOOT log file. Essentially what happened during boot up time. 
Great for first diagnosis when something doesn&rsquo;t work during boot up.\n\t<\/li>\n<li>\n\t\t<strong>\/var\/log\/maillog<\/strong> &#8211; If you use sendmail or any other mail server\/service on UNIX this log file will hold much of the information you need.\n\t<\/li>\n<li>\n\t\t<strong>\/var\/log\/acpid<\/strong> &#8211; Power state and power management log file.\n\t<\/li>\n<li>\n\t\t<strong>\/var\/log\/cron<\/strong> &#8211; Log of jobs that have been scheduled to run through the Unix \/ Linux &lsquo;cron&rsquo; job scheduler.\n\t<\/li>\n<\/ol>\n<p>\n\tOf course, the log files are only as good as the application writing to them. If the application doesn&rsquo;t write log files detailed enough to diagnose problems, the log files won&rsquo;t help you much when you run into issues.&nbsp; The above locations will tell you plenty about your system health. One command line terminal program that I find useful for checking into my system workings is <strong>&lsquo;konsole&lsquo;<\/strong>. This is just one of many <a href=\"http:\/\/microdevsys.com\/wp\/konsole-dynamically-changing-the-tab-title-text\/\">dozens of GUI based Linux consoles<\/a>, however I find this one most useful because it behaves like a browser, allowing you to open multiple windows to the system in tabs, and because <strong>&#39;konsole&#39;<\/strong> is scriptable.\n<\/p>\n<p>\n\t<!--nextpage-->\n<\/p>\n<p>\n\t<u><strong>NETSTAT<\/strong><\/u><br \/>\n\tNetstat is an extensive network diagnostic utility that prints the local\/foreign IP addresses, ports and state of connections within, into and out of your workstation. 
The variant I use most often is <strong>&lsquo;netstat -apntee&lsquo;<\/strong>, which prints information as below:\n<\/p>\n<p style=\"margin-left: 40px;\">\n\t$ <strong>netstat -apntee<\/strong><br \/>\n\tActive Internet connections (servers and established)<br \/>\n\tProto Recv-Q Send-Q Local Address Foreign Address State User Inode PID\/Program name<br \/>\n\ttcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN 0 6535 2076\/portmap<br \/>\n\ttcp 0 0 127.0.0.1:50000 0.0.0.0:* LISTEN 0 6965 2228\/hpiod<br \/>\n\ttcp 0 0 127.0.0.1:50002 0.0.0.0:* LISTEN 0 6997 2233\/python<br \/>\n\ttcp 0 0 0.0.0.0:21 0.0.0.0:* LISTEN 0 7112 2277\/vsftpd<br \/>\n\ttcp 0 0 192.168.1.1:53 0.0.0.0:* LISTEN 25 6500 2058\/named<br \/>\n\ttcp 0 0 192.168.0.4:53 0.0.0.0:* LISTEN 25 6498 2058\/named<br \/>\n\ttcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN 25 6496 2058\/named<br \/>\n\ttcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN 0 7019 2244\/cupsd<br \/>\n\ttcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 0 7204 2306\/sendmail: acce<br \/>\n\ttcp 0 0 127.0.0.1:953 0.0.0.0:* LISTEN 25 6503 2058\/named<br \/>\n\ttcp 68 0 192.168.0.4:55890 74.208.30.232:21 CLOSE_WAIT 0 10234 2883\/gftp-gtk<br \/>\n\ttcp 0 0 :::80 :::* LISTEN 0 7286 2328\/httpd<br \/>\n\ttcp 0 0 :::22 :::* LISTEN 0 7054 2256\/sshd<br \/>\n\t$\n<\/p>\n<p>\n\tHere is the breakdown of what the options mean (you can get more detail from <strong>&lsquo;netstat --help&lsquo;<\/strong> or <strong>&lsquo;man netstat&lsquo;<\/strong>):\n<\/p>\n<p style=\"margin-left: 40px;\">\n\t<strong>&lsquo;-n&lsquo;<\/strong> Print entries in numeric format (IP addresses and port numbers instead of host and service names).<br \/>\n\t<strong>&lsquo;-a&lsquo;<\/strong> Show listening and non listening sockets.<br \/>\n\t<strong>&lsquo;-p&lsquo;<\/strong> Show program name and PID of process owning the connection.<br \/>\n\t<strong>&lsquo;-t&lsquo;<\/strong> Show TCP connections only.<br \/>\n\t<strong>&lsquo;-e&lsquo;<\/strong> Print additional detailed information. 
Use &lsquo;-ee&lsquo; for maximum detail.\n<\/p>\n<p>\n\tThe <strong>&lsquo;-c&lsquo;<\/strong> option with <strong>&lsquo;netstat&lsquo;<\/strong> is also useful if you want to display the information on the screen and have it refreshed periodically. An alternative is to use <strong>&lsquo;watch -n 5 &quot;netstat -apntee&quot;&lsquo;<\/strong> instead, which will run <strong>&lsquo;netstat -apntee&lsquo;<\/strong> and refresh the output every 5 seconds per the <strong>&lsquo;-n&lsquo;<\/strong> option. This utility is very useful if you want to scan your system for incoming and outgoing connections, and it sees heavy use in the hosting industry.<\/p>\n<p>\t<u><strong>PS<\/strong><\/u><br \/>\n\tThis is probably one of the more useful and powerful UNIX commands available, and it combines many of the most common and useful features of other Linux commands.&nbsp; However, its power is often overlooked, and only simple variants of this command are used. This command reads the process table on a Unix host and prints information from it. It can show you everything running on a host and often what is going on with a process: how it started, with what command line and from where, to list just a few of the details it can produce on a process running on a UNIX \/ Linux OS. 
The best way to see what it can do is to actually view a few examples of the command:\n<\/p>\n<p>\n\t<strong>&lsquo;ps -axfwweo pid,user,cmd=&lsquo;<\/strong> Print ALL processes (<strong>-a<\/strong>), processes without controlling tty&rsquo;s (<strong>-x<\/strong>), show parent-child relationship (<strong>-f<\/strong>), print extended information (wide format) (<strong>-w<\/strong>), show environment (<strong>-e<\/strong>) and define\/customize output as follows:<br \/>\n\t<strong>&lsquo;pid&lsquo;<\/strong> Process PID.<br \/>\n\t<strong>&lsquo;user&lsquo;<\/strong> User that ran the process.<br \/>\n\t<strong>&lsquo;cmd=&lsquo;<\/strong> Command used to run the process (the trailing &lsquo;=&rsquo; suppresses the column header).\n<\/p>\n<p>\n\tHere is another variation you may find useful.\n<\/p>\n<p style=\"margin-left: 40px;\">\n\t<strong>&lsquo;ps -axfeo pid,user,pcpu,pmem,ppid,psr,etime,cputime,cp,nice,rtprio,state,vsz,size,cmd=&lsquo;<\/strong>\n<\/p>\n<p>\n\tThe new options to <strong>&lsquo;-o&lsquo;<\/strong> for the above command include:\n<\/p>\n<p style=\"margin-left: 40px;\">\n\t<strong>&lsquo;pmem&lsquo;<\/strong> Percent of memory used by a process.<br \/>\n\t<strong>&lsquo;pcpu&lsquo;<\/strong> Percent of CPU used by a process.<br \/>\n\t<strong>&lsquo;ppid&lsquo;<\/strong> Parent process ID that started\/spawned this process.<br \/>\n\t<strong>&lsquo;psr&lsquo;<\/strong> Processor the process is running on.<br \/>\n\t<strong>&lsquo;etime&lsquo;<\/strong> Elapsed time since the process was started.<br \/>\n\t<strong>&lsquo;cputime&lsquo;<\/strong> Cumulative CPU time.<br \/>\n\t<strong>&lsquo;cp&lsquo;<\/strong> Per-mill (tenths of a percent) CPU usage.<br \/>\n\t<strong>&lsquo;nice&lsquo;<\/strong> Nice value (scheduling priority) with which the process is running.<br \/>\n\t<strong>&lsquo;rtprio&lsquo;<\/strong> Real time priority.<br \/>\n\t<strong>&lsquo;state&lsquo;<\/strong> The state the process is in (From &lsquo;man&lsquo; pages):\n<\/p>\n<p style=\"margin-left: 80px;\">\n\t<strong>D<\/strong> Uninterruptible sleep (usually IO)<br 
\/>\n\t<strong>R<\/strong> Running or runnable (on run queue)<br \/>\n\t<strong>S<\/strong> Interruptible sleep (waiting for an event to complete)<br \/>\n\t<strong>T<\/strong> Stopped, either by a job control signal or because it is being traced.<br \/>\n\t<strong>W<\/strong> paging (not valid since the 2.6.xx kernel)<br \/>\n\t<strong>X<\/strong> dead (should never be seen)<br \/>\n\t<strong>Z<\/strong> Defunct (&ldquo;zombie&rdquo;) process, terminated but not reaped by its parent.\n<\/p>\n<p style=\"margin-left: 40px;\">\n\t<strong>&lsquo;vsz&lsquo;<\/strong> Virtual memory size of the process in KiB (1024-byte units).<br \/>\n\t<strong>&lsquo;size&lsquo;<\/strong> Memory size in KiB (1024-byte units). NOTE: This number is a very rough estimate and therefore not exact.\n<\/p>\n<p>\n\t<strong>&lsquo;ps&lsquo;<\/strong> has an extensive man page (type &lsquo;man ps&rsquo; for a full list of possible options). An example from the man pages is:\n<\/p>\n<p style=\"margin-left: 40px;\">\n\t<strong>&lsquo;ps -eo euser,ruser,suser,fuser,f,comm,label&lsquo;<\/strong>\n<\/p>\n<p>\n\twhich returns security information on running processes. The <strong>&lsquo;ps&lsquo;<\/strong> utility is one of the utilities you can use to drill down and get very detailed information on a system problem, short of doing a memory dump on a process that may be causing you issues.\n<\/p>\n<p>\n\t<u><strong>TOP<\/strong><\/u><br \/>\n\tIn its simplest form, <strong>&lsquo;top&lsquo;<\/strong> run without parameters will give you process information and the top system resource users for your system. A parameterized version is <strong>&lsquo;top -cbn1&lsquo;<\/strong>, which prints a single <em>snapshot<\/em> of the top output and can be used if you do not need to see constantly updated information. Once in <strong>&lsquo;top&rsquo;<\/strong> press <strong>&lsquo;?&rsquo;<\/strong> to get a list of commands you can use with <strong>&lsquo;top&rsquo;<\/strong> while it is running. 
Use <strong>&lsquo;man top&lsquo;<\/strong> for the command line option list or <strong>&lsquo;top -h&lsquo;<\/strong> for a brief usage summary.\n<\/p>\n<p>\n\t<u><strong>UPTIME<\/strong><\/u><br \/>\n\tThis command, unlike the above, has only one option, <strong>-V<\/strong> to get the version, and is usually run without options on a command line to get a health report of a Unix system in the form of a <strong>&lsquo;load average&rsquo;<\/strong> number. The command is useful for early and quick reporting of system health on systems that are either too busy or where other utilities fail to run sufficiently quickly.\n<\/p>\n<p>\n\t<strong><u>HDPARM<\/u><\/strong><br \/>\n\tShow\/set drive information\/settings respectively. To benchmark a drive&rsquo;s read speed, which also hints at how busy the drive is, use <strong>&lsquo;hdparm -tT &lt;DEVICE&gt;&lsquo;<\/strong>. To get drive information, including the <strong>&lsquo;udma&lsquo;<\/strong> settings, use <strong>&lsquo;hdparm -i &lt;DEVICE&gt;&lsquo;<\/strong>. To set or change a <strong>&lsquo;udma&lsquo;<\/strong> setting you can use something like <strong>&lsquo;hdparm -Xudma5 -d1 &lt;DEVICE&gt;&lsquo;<\/strong>. <strong>&lt;DEVICE&gt;<\/strong> stands in for <strong>&lsquo;\/dev\/hda&lsquo;<\/strong>, <strong>&lsquo;\/dev\/hdb&lsquo;<\/strong> etc. or the actual hard drive you wish to check on your system.\n<\/p>\n<p>\n\t<strong><u>HDDTEMP<\/u><\/strong><br \/>\n\tShow\/report drive temperature (if the drive has a temperature sensor to begin with). Example run without options:\n<\/p>\n<p style=\"margin-left: 40px;\">\n\t<strong>$ hddtemp \/dev\/hdb<\/strong><br \/>\n\t\/dev\/hdb: WDC WD1200JB-00GVA0: 42&deg;C<br \/>\n\t$\n<\/p>\n<p>\n\tUse <strong>&lsquo;hddtemp --help&lsquo;<\/strong> for more available options.\n<\/p>\n<p>\n\t<!--nextpage--><strong><u>SMARTCTL<\/u><\/strong><br \/>\n\tReport on \/ show S.M.A.R.T. health information\/settings for a drive. 
Use <strong>&lsquo;smartctl -a &lt;DEVICE&gt;&lsquo;<\/strong> where <strong>&lt;DEVICE&gt;<\/strong> is the actual name of the device you wish to check on. This also prints all errors the device has logged, so it is a very useful tool in determining if you should be replacing a hard drive soon. In my case it reported 32 errors on a drive, telling me I will need to change it soon. \ud83d\ude42\n<\/p>\n<p>\n\t<u><strong>FREE Memory using &lsquo;free&rsquo; &amp; &lsquo;cat \/proc\/meminfo&rsquo;<\/strong><\/u><br \/>\n\tThe command <strong>&lsquo;free&lsquo;<\/strong> prints system memory usage including swap space utilization:\n<\/p>\n<p style=\"margin-left: 40px;\">\n\t<strong>$ free<\/strong><br \/>\n\ttotal used free shared buffers cached<br \/>\n\tMem: 515716 445576 70140 0 38368 176252<br \/>\n\t-\/+ buffers\/cache: 230956 284760<br \/>\n\tSwap: 2530196 0 2530196<br \/>\n\t$\n<\/p>\n<p>\n\tAlternately, typing &lsquo;cat \/proc\/meminfo&lsquo; displays detailed memory information. This holds a more detailed memory usage breakdown than &lsquo;free&lsquo; does, including categorization as is done on typical UNIX \/ Linux hosts.\n<\/p>\n<p>\n\t<strong><u>PSTREE<\/u><\/strong><br \/>\n\t<strong>&lsquo;pstree&lsquo;<\/strong> in its simplest form prints commands and their parent-child relationship to each other. This command is useful if there is a runaway process spawning too many instances of another process. The command has a limited set of options, listed by running <strong>&lsquo;pstree --help&lsquo;<\/strong>.&nbsp; It is a simple command primarily designed to show process associations.\n<\/p>\n<p>\n\t<u><strong>LOAD<\/strong><\/u><br \/>\n\tThis is not a command but a concept. Load is an index starting at <strong>0.00<\/strong> that indicates system activity. On a healthy single-CPU system it should ideally not go over <strong>1.00<\/strong>, however in some applications a significantly higher value may be tolerated. 
Values above 1 per CPU generally mean users of a system will see noticeable delays in their tasks. One command that prints load is <strong>&lsquo;uptime&lsquo;<\/strong>, typed without parameters, which was discussed earlier.<\/p>\n<p>\t<u><strong>DMESG<\/strong><\/u><br \/>\n\tTyping <strong>&lsquo;dmesg&lsquo;<\/strong> on a command line prints the kernel message buffer, which includes the errors and messages logged while the system was starting up.\n<\/p>\n<p>\n\t<u><strong>SAR<\/strong><\/u><br \/>\n\t<strong>&lsquo;sar&lsquo;<\/strong> is a System Activity Reporting tool\/command. Typed without parameters, it shows the CPU utilization history recorded for the current day. This is especially useful when determining the root causes of a slowdown on a Unix\/Linux host. As with <strong>&lsquo;ps&lsquo;<\/strong>, this utility has extensive documentation and comes with many options. A variation is:\n<\/p>\n<p style=\"margin-left: 40px;\">\n\t$ <strong>sar -n DEV<\/strong>\n<\/p>\n<p>\n\twhich shows network activity on a host for a range of time. &lsquo;sar&lsquo; is used extensively in the industry and has a detailed man page of available options.\n<\/p>\n<p>\n\t<strong><u>IFCONFIG<\/u><\/strong><br \/>\n\tShows network card (NIC \/ Network Interface Card) statistics and settings including errors on the card. Example:\n<\/p>\n<p style=\"margin-left: 40px;\">\n\t<strong>$ ifconfig<\/strong><br \/>\n\tlo Link encap:Local Loopback<br \/>\n\tinet addr:127.0.0.1 Mask:255.0.0.0<br \/>\n\tinet6 addr: ::1\/128 Scope:Host<br \/>\n\tUP LOOPBACK RUNNING MTU:16436 Metric:1<br \/>\n\tRX packets:1931 errors:0 dropped:0 overruns:0 frame:0<br \/>\n\tTX packets:1931 errors:0 dropped:0 overruns:0 carrier:0<br \/>\n\tcollisions:0 txqueuelen:0<br \/>\n\tRX bytes:3039216 (2.8 MiB) TX bytes:3039216 (2.8 MiB)<br \/>\n\t$\n<\/p>\n<p>\n\tThis is especially useful when you are experiencing network issues. 
If there are problems with your network card, the errors will be visible in the output of this command.\n<\/p>\n<p>\n\t<u><strong>MPSTAT<\/strong><\/u><br \/>\n\tShows processor\/CPU activity on a host for a specified time. <strong>&lsquo;mpstat&lsquo;<\/strong> without parameters reports CPU usage averaged since system startup. A variation of the command is <strong>&lsquo;mpstat -P ALL 1 1&lsquo;<\/strong>, which reports on each processor over a single <strong>1<\/strong> second interval. Basic usage information is:\n<\/p>\n<p>\n\tUsage: mpstat [ options&#8230; ] [ &lt;interval&gt; [ &lt;count&gt; ] ]\n<\/p>\n<p>\n\t<u><strong>IOSTAT<\/strong><\/u><br \/>\n\t<strong>&lsquo;iostat&lsquo;<\/strong> works similarly to <strong>&lsquo;mpstat&lsquo;<\/strong> but reports usage statistics on a drive. <strong>&lsquo;iostat \/dev\/hdb ALL 1 1&lsquo;<\/strong> shows more detailed information taken at an interval of <strong>1<\/strong> second. Basic usage and options are:\n<\/p>\n<p style=\"margin-left: 40px;\">\n\t[ -c | -d ] [ -k | -m ] [ -t ] [ -V ] [ -x ]<br \/>\n\t[ &lt;device&gt; [ &#8230; ] | ALL ] [ -p [ &lt;device&gt; | ALL ] ]\n<\/p>\n<p>\n\tMore detailed information is available from &lsquo;man iostat&lsquo;, &lsquo;info iostat&lsquo; or &lsquo;iostat --help&rsquo;.\n<\/p>\n<p>\n\t<u><strong>VMSTAT<\/strong><\/u><br \/>\n\t<strong>&lsquo;vmstat&lsquo;<\/strong> works similarly to <strong>&lsquo;iostat&lsquo;<\/strong>, however it reports information on memory utilization. Without any options <strong>&lsquo;vmstat&lsquo; <\/strong>prints memory utilization and may be considered more accurate in its reporting than <strong>&lsquo;free&lsquo;<\/strong>. A number of options are available and can be seen by running <strong>&lsquo;man vmstat&lsquo;<\/strong>.\n<\/p>\n<p>\n\t<u><strong>LSOF<\/strong><\/u><br \/>\n\t<strong>&lsquo;lsof&lsquo;<\/strong> is short for <strong>&lsquo;ls&rsquo;<\/strong> (list) Open Files. This command shows all currently open files on a host. This command is ideal if you want to know which process has what files open. 
This is especially useful in identifying commands or files being used outside their allowed locations, or commands using old, outdated files that could be causing problems. This command is very useful since it links PIDs or processes with the files they have open, information which is not readily available from other commands. <strong>&lsquo;lsof&lsquo;<\/strong> has a huge number of parameters that rivals <strong>&lsquo;ps&lsquo;<\/strong>.\n<\/p>\n<p>\n\t<strong><u>CORE FILES<\/u><\/strong><br \/>\n\tThis feature of UNIX \/ Linux systems is essentially a memory dump of a process, typically written when the process crashes. Core files are used extensively in pinpointing problems in crashing or hung applications and are essential to diagnosis by developers. The topic of <strong>core files<\/strong> is far too extensive to cover here and will be covered in a separate topic in the future.\n<\/p>\n<p>\n\t<!--nextpage--><strong><u>EXAMPLE<\/u><\/strong><br \/>\n\tAs an example of a case that you may run into, which coincidentally I have run into on Linux distros as far back as I can remember, consider the Flash plugin technology and the available browsers in Linux. Once you install the Flash plugin following the directions in the readme file that comes with it, you&rsquo;ll, unfortunately, notice the plugin is significantly less powerful on Linux than on Windows. For one, the version of Flash available for Linux will be older than the latest one available for Windows. The second limitation is that the other player from Adobe, Shockwave, is not available for Firefox at the time of this writing. So often enough, you&rsquo;ll run into situations online where you can&rsquo;t view the latest Flash content of a site despite the fact that you have Flash installed. And here is where the issue may pop up for you on your Linux distribution. At the point where Flash is starting up, Firefox would repeatedly freeze on me to the point where it&rsquo;s unusable. 
Coincidentally, so does my entire system. Since this happened too often for me, I followed this procedure:\n<\/p>\n<ul>\n<li>\n\t\tI started <strong>&lsquo;konsole&lsquo;<\/strong>. This took long enough on its own since my system was now crawling due to the lock up (which I thought Firefox was causing).\n\t<\/li>\n<li>\n\t\tI typed <strong>&lsquo;uptime&lsquo;<\/strong> as this is simpler and requires fewer resources than other commands, so it typically returns results on how busy my system is quicker than most other commands, which is what I wanted at this point.\n\t<\/li>\n<\/ul>\n<p style=\"margin-left: 80px;\">\n\t<strong>$ uptime<\/strong><br \/>\n\t18:29:04 up 1:03, 3 users, load average: 0.48, 0.26, 0.14<br \/>\n\t$\n<\/p>\n<ul>\n<li>\n\t\tThis told me the summary average value of how busy my system was. In my case, at the time of the lock up, it was about <strong>4<\/strong>, which, if you don&rsquo;t have much familiarity with load averages at this point, is quite high. On a single-CPU system a value of 1 typically means 100% busy, and higher means you&rsquo;ll start to see significant delays in how fast your applications will respond to you. Typically, your system will be usable under values higher than <strong>1<\/strong> but you&rsquo;ll start to see delays you might not like or be comfortable with. For this reason it&rsquo;s good to keep your load average under <strong>1<\/strong> at all times.\n\t<\/li>\n<\/ul>\n<ul>\n<li>\n\t\tI needed to find out what was causing the most CPU usage at this point and therefore causing the high load. To find this out I typed <strong>&lsquo;top&lsquo;<\/strong>.\n\t<\/li>\n<li>\n\t\tOnce in <strong>&lsquo;top&lsquo;<\/strong>, I typed <strong>&lsquo;?&lsquo;<\/strong> to get to <strong>&lsquo;top&rsquo;<\/strong> help since I forgot the command that sorts the fields by <strong>% CPU<\/strong>. 
Found the command is <strong>&lsquo;O&lsquo;<\/strong> (capital letter &lsquo;O&rsquo;) and pressed the <strong>&lsquo;Esc&lsquo;<\/strong> key to exit the help menu.\n\t<\/li>\n<li>\n\t\tNow that I was back to the main &lsquo;top&lsquo; screen, I typed <strong>&lsquo;O&lsquo;<\/strong> which opened up a menu with the sorting options I could use.\n\t<\/li>\n<li>\n\t\tI typed <strong>&lsquo;K&lsquo;<\/strong> from the above list, which sorted the output of <strong>&lsquo;top&lsquo;<\/strong> by <strong>CPU %<\/strong> usage.\n\t<\/li>\n<li>\n\t\tHaving sorted the <strong>&lsquo;top&rsquo;<\/strong> output by CPU usage, it was now apparent that the Firefox executable, <strong>&lsquo;firefox-bin&rsquo;<\/strong>, was causing the highest CPU usage.\n\t<\/li>\n<li>\n\t\tAt this point I allowed <strong>&lsquo;top&lsquo;<\/strong> to refresh a few times to ensure ONLY <strong>&lsquo;firefox-bin&rsquo;<\/strong> was causing the high load and <strong>CPU %<\/strong> usage, which it turned out it was. The reason for this is that <strong>ALL<\/strong> applications use 100% of CPU at some point or another, however only those that use <strong>100% of CPU<\/strong> constantly for any long period of time will cause any appreciable system load.\n\t<\/li>\n<li>\n\t\tAt this point, I wanted to see if the drive itself wasn&rsquo;t also contributing to the high load. For this I used <strong>&lsquo;hdparm -tT \/dev\/hdb&lsquo;<\/strong> :\n\t<\/li>\n<\/ul>\n<p style=\"margin-left: 120px;\">\n\t<strong>$ hdparm -tT \/dev\/hdb<\/strong><br \/>\n\t\/dev\/hdb:<br \/>\n\tTiming cached reads: 2148 MB in 2.00 seconds = 1073.85 MB\/sec<br \/>\n\tTiming buffered disk reads: 150 MB in 3.23 seconds = <strong><span style=\"color: rgb(51, 153, 102);\">46.49 MB\/sec<\/span><\/strong><br \/>\n\t$\n<\/p>\n<ul>\n<li>\n\t\tThis command tells me how fast the system can read from the disk. Typical speeds I found were around the 50MB\/s mark for IDE and about 60MB\/s for SATA. 
These speeds are very good. If you see less than 5MB\/s then something else is most likely reading from or writing to your disk. Other reasons for low IDE speeds are bad drive settings, IDE cable problems, or the wrong cables being used on drives. For more on this, and on <strong>&lsquo;hdparm&lsquo;<\/strong> and what it can do for you, you can read the &lsquo;What Drives your Linux installation? (And <strong>&lsquo;hdparm&lsquo;<\/strong>, <strong>&lsquo;hddtemp&lsquo;<\/strong>, <strong>&lsquo;smartctl -a &lt;DEVICE&gt;&lsquo;<\/strong> and <strong>&lsquo;syslog&lsquo;<\/strong> in <strong>&lsquo;\/var\/log\/messages&lsquo;<\/strong>)&rsquo; article here.<br \/>\n\t\t&nbsp;\n\t<\/li>\n<li>\n\t\t(<span style=\"color: rgb(51, 102, 255);\"><strong>NOTE<\/strong><\/span>: For this step I could have also used <strong>&lsquo;smartctl -a \/dev\/hdb&lsquo;<\/strong> to get other drive information. This utility also prints detailed drive information and statistics which can be used in diagnosis.)\n\t<\/li>\n<li>\n\t\tI was relatively certain then that the load was caused almost purely by excessive <strong>%CPU<\/strong> usage. I wasn&rsquo;t sure how it was causing this high CPU usage, other than the fact that I loaded a Flash site, but I wasn&rsquo;t interested in the exact details since most likely, even if I did find out the reason, neither <strong>Adobe<\/strong> nor <strong>Firefox<\/strong> developers would resolve the issue within the couple of days or even weeks needed to do me any good.\n\t<\/li>\n<\/ul>\n<ul>\n<li>\n\t\tAt this point I knew the problem was with <strong>&lsquo;firefox-bin&lsquo;<\/strong> using excessive CPU resources. To verify that memory usage wasn&rsquo;t the culprit as well, I went back into <strong>&lsquo;top&lsquo;<\/strong> to check on this. 
I did this by typing <strong>&lsquo;top&lsquo;<\/strong> and this time used <strong>&lsquo;O&lsquo;<\/strong> followed by <strong>&lsquo;n&lsquo;<\/strong> to sort by <em>memory % usage<\/em>. This wasn&rsquo;t an issue, and the memory usage reported by running <strong>&lsquo;free&lsquo;<\/strong> on the command line confirmed this. As another way to find out if memory usage is a problem, you can sort by either <strong>&lsquo;VIRT&rsquo;<\/strong>, <strong>&lsquo;RES&rsquo;<\/strong> or <strong>&lsquo;SHR&rsquo;<\/strong> in <strong>&lsquo;top&lsquo;<\/strong>, which can tell you more details on usage. In general, looking into memory usage beyond a <strong>&lsquo;general indication&rsquo;<\/strong> is a whole new topic and too vast for this modest topic here. \ud83d\ude42\n\t<\/li>\n<\/ul>\n<p>\n\tAll in all, any information viewed or retrieved by unix\/linux commands is interpretive, since memory is often shared with other modules in memory, meaning that what you see in utilities is not <strong>&lsquo;really&rsquo;<\/strong> the amount of memory an application is using but most likely significantly less. Another thing I could have done is to look into <strong>&lsquo;\/proc\/meminfo&lsquo;<\/strong> by running <strong>&lsquo;cat \/proc\/meminfo&lsquo;<\/strong>, which would give me more detail and a breakdown of memory usage on my system.<br \/>\n\t&nbsp;\n<\/p>\n<p>\n\t<strong>SOLUTION<\/strong>\n<\/p>\n<p>\n\tBecause I did not want to drift off from what I was doing and start chasing this problem in detail, I just needed to resolve the issue, get my browser to respond again and go about my business. Because I could not exit from <strong>Firefox<\/strong>, being frozen and all, I decided to kill it instead. You can also do this from <strong>&lsquo;top&lsquo;<\/strong> by following the below steps:<br \/>\n\t&nbsp;\n<\/p>\n<ol>\n<li>\n\t\tRun <strong>&lsquo;top&lsquo;<\/strong> from the command line and type <strong>&lsquo;k&lsquo;<\/strong>. 
This will ask you which PID you want to kill.\n\t<\/li>\n<li>\n\t\tType the <strong>&lsquo;PID&rsquo;<\/strong> (Process ID) of the item you want to kill. The PIDs are available from the first column of <strong>&lsquo;top&rsquo;<\/strong>.\n\t<\/li>\n<li>\n\t\tOnce you type the PID, hit enter. At this point, you should no longer see the particular process (in this case <strong>&lsquo;firefox-bin&rsquo;<\/strong>) running and you should see a gradual drop in load and a more responsive system.\n\t<\/li>\n<li>\n\t\tRestart Firefox and avoid browsing to the problematic site. \ud83d\ude42\n\t<\/li>\n<li>\n\t\tTo learn how to avoid this situation and having your system &lsquo;over loaded&rsquo; by a rogue process, look for my articles <span style=\"color: rgb(51, 204, 204);\">&lsquo;The effects and consequences of prioritizing Linux applications&rsquo;<\/span> and <span style=\"color: rgb(51, 204, 204);\">&#39;Server optimization&#39;<\/span> here in the near future.\n\t<\/li>\n<\/ol>\n<p>\n\t&nbsp;\n<\/p>\n<p>\n\tThis method of investigation may be helpful to you in most load related issues. Of course, most of the other utilities listed here can also assist if you find the above brief investigation example not sufficient to identify a problem on your system.&nbsp;\n<\/p>\n<p>\n\tCheers!<br \/>\n\tTom K.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the more powerful features of any Linux\/Unix system is the monitoring capability of the OS. UNIX systems have been around for decades, and in that time the tools available to users and admins alike on both systems have grown tremendously. 
In fact, the log files, tools and applications under [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_uf_show_specific_survey":0,"_uf_disable_surveys":false,"footnotes":""},"categories":[3],"tags":[218,209,233,227,171,230,220,217,215,214,110,222,224,226,228,223,210,231,157,156,172,211,219,221,216,234,229,212,232,155,213,225],"class_list":["post-10","post","type-post","status-publish","format-standard","hentry","category-unix-linux-admin-stuff","tag-procmeminfo","tag-varlog","tag-analysis","tag-core","tag-cpu","tag-disk-io","tag-dmesg","tag-free","tag-hddtemp","tag-hdparm","tag-history","tag-ifconfig","tag-iostat","tag-lsof","tag-monitoring","tag-mpstat","tag-netstat","tag-network-io","tag-optimize","tag-performance","tag-processor","tag-ps","tag-pstree","tag-sar","tag-smartctl","tag-temperature","tag-time-slice","tag-top","tag-trend","tag-tweak","tag-uptime","tag-vmstat"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/microdevsys.com\/wp\/wp-json\/wp\/v2\/posts\/10","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/microdevsys.com\/wp\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/microdevsys.com\/wp\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/microdevsys.com\/wp\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/microdevsys.com\/wp\/wp-json\/wp\/v2\/comments?post=10"}],"version-history":[{"count":0,"href":"https:\/\/microdevsys.com\/wp\/wp-json\/wp\/v2\/posts\/10\/revisions"}],"wp:attachment":[{"href":"https:\/\/microdevsys.com\/wp\/wp-json\/wp\/v2\/media?parent=10"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/microdevsys.com\/wp\/wp-json\/wp\/v2\/categories?post=10"},{
"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/microdevsys.com\/wp\/wp-json\/wp\/v2\/tags?post=10"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}