MoSheL - Lightweight server monitoring
https://www.wyae.de/software/moshel/
MoSheL - MOnitoring with SHEll Locally
2020- by Volker Tanger <volker.tanger@wyae.de>
MoSheL (MOnitoring with SHEll Locally) is a simple, lightweight (both in size and system requirements) server monitoring package designed for secure and in-depth monitoring of single or multiple typical/critical internet systems.
As most of the servers/services I want to monitor are remote systems, traditional NMS (relying on close-looped and/or unencrypted sessions) do not fit well: they are big, complicated to install for safe remote monitoring, resource-intensive (when doing remote checks), lack a status history, or a combination thereof.
Thus I (re)wrote this small, easily configured system. It was originally intended for monitoring a single or a handful of typical internet systems. 17 years after the first installment and 11 years after the big rewrite, this is the next iteration: even smaller, more decentralized, easier to adapt to local peculiarities (or changes in output formats).
MoSheL supports email alerts (with flapping-prevention) out of the box - and whatever you can script.
The system is written in plain (Bourne) SH and is meant to be compatible with BASH, ZSH and various others, so it can easily be deployed on most unixish systems.
Monitoring is designed to be distributed over multiple systems, usually running locally. As no parameters are accepted from outside, checks cannot be tampered with or misused from outside.
For centralized monitoring usually only aggregate data (i.e. the current, non-okay status) is transferred, easing network load a lot. Agent data is transferred via HTTP(S) in pull-mode - which comes in handy if you are monitoring web servers anyway. So no additional network daemons or weaknesses. As agent queries only download locally generated static files, the possible attack surface is minimal.
Requirements for MoSheL:
* Unix Shell (tested with Bourne-Shell and BASH)
* awk + standard Unix text tools (fgrep, mail, time, date, ...)
for single checks, only if performed: depends on what you want to check - see the "moshel.example" script
for web interface:
* webserver - which can serve static files (= nearly any)
* the "dygraphs" JavaScript library. Included in the archive
within the /data/ directory
Hardware requirements:
Should be minimal. 39 checks on my 1vCore mailserver (including
40% checks of remote systems) take 3.5s (.75 CPU-seconds).
RaspberryPi with SD card needs 34s (1.3 CPU-seconds) for the
same checks.
A central monitoring system loads a single static file snippet
from each monitored server - usually a few hundred bytes every
five minutes.
The system is a shell script, so no big size components here,
either. For a webserver (nearly) any HTTPD is fine. No
database needed - everything is plain text.
Updates will be available at http://www.wyae.de/software/moshel/ and at the GIT repository at https://git.wyae.de/WYAE/moshel
Please check there for updates prior to submitting patches!
For bug reports and suggestions or if you just want to talk to me please contact me at volker.tanger@wyae.de
Get and untar the archive - usually into /usr/local/lib/moshel
copy the whole data/ directory to WWWDIR (see below)
Edit the MOSHEL file and set the environment
MYNAME HOSTname of this server
ALERTTO mail address where alerts should be sent to
WEBURL the URL the snippet, overview and data files are
available under
WWWDIR where the HTML reports and status file are saved to
CMPDIR location the tamper-detection files are copied to
MAINTENANCE maintenance message
disables +all+ checks if not empty
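As a rough illustration, the settings listed above might look like this near the top of the moshel script. All values are made-up placeholders, and plain shell assignments are an assumption about how the variables are set - check moshel.example for the actual form.

```shell
# Hypothetical example values - adapt every one of them to your system.
MYNAME="mail.example.org"                 # host name of this server
ALERTTO="admin@example.org"               # where alert mails are sent
WEBURL="https://mail.example.org/moshel"  # URL the snippet, overview and data files are served under
WWWDIR="/var/www/html/moshel"             # where the HTML reports and status file are written
CMPDIR="/usr/local/lib/moshel/compare"    # where the tamper-detection files are kept (assumed path)
MAINTENANCE=""                            # leave empty; any text here disables all checks
```

Setting MAINTENANCE to something like "Kernel upgrade until 14:00" would then switch off all checks until it is emptied again.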
Copy the moshel.example shell script to the moshel file and configure the checks to be run - usually you can set alert trigger levels
Adapt the "moshel" script.
Run it locally after each configuration change to make sure your commands and match patterns do not contain any errors.
Place the cron.d_moshel file at /etc/cron.d/moshel and adapt it as needed so moshel is called periodically.
Via the web interface you can view the overall status - full and
abbreviated status. But you cannot modify anything - which makes it
quite safe for even non-admin multiuser use...
;-)
You can group checks for output with the command
Category "HEADLINE"
All other checks follow the pattern
CheckSOMETHING
Name without spaces
'command -opt' command with all parameters - without pipes!
'AWK pattern' AWK pattern, see "man awk", usually
'/regex/{ print $POS }' where POS is the nth value
in the result return line matching the regex
value value the result is compared to
'alert message' The message printed whenever the check fails
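To show how the command, the AWK pattern and the comparison value fit together, here is a minimal sketch outside of MoSheL. The functions sample_df, extract_value and value_under are hypothetical names for illustration only, and the "df" output is made up; MoSheL's own functions may work differently.

```shell
# Hypothetical stand-in for a real command such as 'df -P /'; the output is made up.
sample_df() {
    printf '%s\n' \
        'Filesystem 1024-blocks Used Available Capacity Mounted on' \
        '/dev/sda1 10000000 4200000 5800000 42% /'
}

# AWK pattern in the '/regex/{ print $POS }' style:
# select the line that ends in " /" and print its 5th field.
# The "%" is stripped so the result is a plain number.
extract_value() {
    sample_df | awk '/ \/$/{ sub("%", "", $5); print $5 }'
}

# "Should be under": succeeds if the measured value is below the limit.
value_under() {
    # $1 = measured value, $2 = limit
    [ "$1" -lt "$2" ]
}

if value_under "$(extract_value)" 90; then
    echo "root filesystem usage is fine"
else
    echo "root filesystem is filling up"
fi
```

Testing the command and the AWK pattern by hand like this before putting them into a check makes it easier to spot a pattern that matches the wrong line or column.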
Checks are named / to be read as "expected" or "should-be". If the check fails / exceeds the value, its status is changed. Whenever a status changes to "alert" for 2 consecutive checks, an email alert is sent.
CheckValueOver Number value should be over the one given
CheckValueUnder Number value should be under the one given
CheckCountLessThan count the number of lines the check yields,
they should be below the number given
CheckCountMoreThan count the number of lines the check yields,
they should be above the number given
CheckFileChanges The file should match the comparison file.
Generate the comparison files with
./gen_compares with the moshel directory
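The idea behind the count and file checks listed above can be sketched with standard tools. This is not MoSheL's code: count_lines, count_less_than and file_unchanged are hypothetical helpers, and using wc and cmp is an assumption about one reasonable way to do it.

```shell
# Count the lines arriving on stdin.
# tr removes the padding some wc implementations add.
count_lines() {
    wc -l | tr -d ' '
}

# "Should be less than": succeeds if the number of lines is below the limit.
count_less_than() {
    # $1 = limit; the command output arrives on stdin
    [ "$(count_lines)" -lt "$1" ]
}

# "Should match the comparison file": cmp -s prints nothing
# and returns 0 only if both files are identical.
file_unchanged() {
    # $1 = live file, $2 = stored comparison copy
    cmp -s "$1" "$2"
}

# Usage with made-up input: three lines are fewer than five.
if printf 'a\nb\nc\n' | count_less_than 5; then
    echo "count is below the limit"
fi
```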
If something breaks (e.g. FCGI hiccups between httpd and PHP process), MoSheL can run actions when the problem occurs.
Use this command AFTER "CleanUp" at the end of the script - the CleanUp is needed to ensure the check is not just flapping but "really" failing two consecutive runs.
The command follows the pattern:
ActionOnAlert CHECKNAME 'COMMAND TO BE RUN'
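The "two consecutive runs" rule can be pictured as a small counter per check. The following is only a sketch of that idea, not how MoSheL stores its state: record_result, STATE_DIR and the status words "pending" and "alert" are hypothetical.

```shell
# Hypothetical state directory for this sketch only.
STATE_DIR="${STATE_DIR:-/tmp/moshel-sketch}"

# Record one check result and print the resulting status:
#   ok      - the check passed, the failure counter is reset
#   pending - first failure, might just be flapping
#   alert   - failed at least two runs in a row
record_result() {
    # $1 = check name, $2 = "ok" or "fail"
    mkdir -p "$STATE_DIR"
    state_file="$STATE_DIR/$1.fails"
    if [ "$2" = "ok" ]; then
        rm -f "$state_file"
        echo "ok"
        return 0
    fi
    fails=0
    if [ -f "$state_file" ]; then
        fails=$(cat "$state_file")
    fi
    fails=$((fails + 1))
    echo "$fails" > "$state_file"
    if [ "$fails" -ge 2 ]; then
        echo "alert"
    else
        echo "pending"
    fi
}
```

Only the "alert" status would justify running a repair command, which is why a single failed run does not trigger anything in this sketch.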
Well, a biiig name for what it does. Add a line to your /usr/local/lib/moshel/moshel file on the "central monitoring server":
Centralize URL1/moshel URL2/moshel ...
where you list the URLs of the MoSheL directories of all "centrally monitored" servers.
You can find the summaries at URL/moshel/summary.html
For the "graphs" summary you will need to set the CORS headers accordingly (if you want to risk it).
Nearly all monitoring systems use shell scripts to extract system data, add wrappers around them, and then more and more layers. So changes in output format or differences between systems or versions break checks, so you have to constantly adapt and create multi-case-aware wrappers.
MoSheL was designed to do away with the overhead and the wrappers. You can (and should) work with your own commands and outputs - and correct the pattern matching whenever stuff changes.
If you have a nice (free) check that could be of use to other people, please send it to me so I can include it into the distribution.
This is free software - see the attached file LICENSE_EUPL-1.2_EN.txt. The license is available in other languages at https://joinup.ec.europa.eu/collection/eupl/eupl-text-eupl-12
Copyright (C) 2020- Volker Tanger