ERM is a general-purpose network management engine written in Java. I created it because I found other open-source tools like Nagios or OpenNMS awkward at best. Plus, I find it easier to write monitoring extensions for Java tools (WebLogic, JBoss, Tomcat, standalone apps ...) in Java.
ERM runs on anything that can run a Java virtual machine, version 1.4 or later.
Unzip the downloaded file. What you get should look like this:
    lib/
    dist/
    conf/
    erm.sh
    erm.cmd
    lcp.cmd

These are the required JAR files, the XML config files and some helper scripts to run it on Unix and Windows. Basically, the installation is now done. Under Unix, you also have to run

    chmod +x erm.sh

before you can run it.
Configuration is quite simple. In the conf directory there are 3 XML files containing all the required configuration data.
Edit them according to your needs, then run ERM and forget about it.
monitoring.xml: a list of probes you want to launch on your network and their execution interval.
    <probe interval="120">net.sf.erm.task.weblogic.WLMemoryMonitoringProbe</probe>

The element content is the class of the probe, and the interval attribute is the number of seconds between two executions of this probe. Probes are executed on all devices in the network that have support for them.
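To illustrate how such probe entries can be read, here is a minimal sketch using the XML parser that ships with the JDK. It is not ERM's actual loader: the class name ProbeConfigSketch, the method parseProbes and the <monitoring> root element are assumptions made for the example; only the <probe> element and its interval attribute come from the text above. It also uses generics, so it needs a newer Java than 1.4.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of ERM.
class ProbeConfigSketch {

    /** Returns probe class name -> interval in seconds, in document order. */
    static Map<String, Integer> parseProbes(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList probes = doc.getElementsByTagName("probe");
            Map<String, Integer> result = new LinkedHashMap<>();
            for (int i = 0; i < probes.getLength(); i++) {
                Element probe = (Element) probes.item(i);
                int interval = Integer.parseInt(probe.getAttribute("interval"));
                result.put(probe.getTextContent().trim(), interval);
            }
            return result;
        } catch (Exception e) {
            // Malformed XML or a non-numeric interval ends up here.
            throw new IllegalArgumentException("Cannot read probe configuration", e);
        }
    }

    public static void main(String[] args) {
        // The <monitoring> root element is an assumption; only <probe> is documented.
        String xml = "<monitoring>"
                + "<probe interval=\"120\">net.sf.erm.task.weblogic.WLMemoryMonitoringProbe</probe>"
                + "</monitoring>";
        for (Map.Entry<String, Integer> entry : parseProbes(xml).entrySet()) {
            System.out.println(entry.getKey() + " every " + entry.getValue() + " s");
        }
    }
}
```

Run as is, it prints the probe class followed by "every 120 s".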
    <device name="192.168.200.253">
        <module port="161">
            <supported>net.sf.erm.task.Mib2Task</supported>
            <supported>net.sf.erm.task.OldCiscoConfigCopyTask</supported>
            <supported>net.sf.erm.task.OldCiscoImageCopyTask</supported>
        </module>
    </device>

config.xml: everything that could not fit in the 2 previous files. This includes SNMP communities, TFTP server root directory, data collector in use and more.
The work of ERM is split into 3 layers: tasks, probes and the data collector.
A task represents a single logical operation that can be performed on a device: either collecting data
or acting on it. Examples of tasks are the OldCiscoConfigCopyTask, which orders a Cisco router via SNMP to upload
its running config to the embedded TFTP server, and the WLMemoryMonitoringTask, which fetches from a WebLogic server
the maximum VM heap size and the current free heap memory. Tasks just know HOW to perform their action.
The data collector is where the collected information is sent. It is responsible for storing data in some repository.
The repository can be an RDBMS, a filesystem or, more simply, the stdout stream of the process or even an email recipient.
The data collector knows WHAT to do with the collected data. Detected outages are also "collected" by the data
collector.
Probes are scheduled actions. At the specified interval, they run on every single device that is able to support
them and send any collected information to the data collector. Probes depend on one or more
tasks to be able to run. A device that supports all the tasks required by a probe will have that probe automatically
run on it. Probes know WHEN to perform operations.
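The split between tasks (HOW), the data collector (WHAT) and probes (WHEN) can be sketched as follows. This is only an illustration of the idea, not ERM's real API: every type and method name here (LayersSketch, Task, DataCollector, Device, Probe, canRunOn, runOnce) is hypothetical, and the task returns a dummy value instead of talking to a device. It uses lambdas, so it needs Java 8 or later.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical types for illustration; ERM's own classes may look different.
class LayersSketch {

    /** HOW: one logical operation against a device, returning a collected value. */
    interface Task {
        int run(String deviceName);
    }

    /** WHAT: receives whatever the tasks collected. */
    interface DataCollector {
        void collect(String deviceName, String statName, int value);
    }

    /** A device together with the names of the tasks it supports. */
    static class Device {
        final String name;
        final Set<String> supportedTasks;

        Device(String name, String... supportedTasks) {
            this.name = name;
            this.supportedTasks = new HashSet<>(Arrays.asList(supportedTasks));
        }
    }

    /** WHEN: a scheduled action that depends on one or more tasks. */
    static class Probe {
        final int intervalSeconds;
        final Map<String, Task> requiredTasks; // task name -> implementation

        Probe(int intervalSeconds, Map<String, Task> requiredTasks) {
            this.intervalSeconds = intervalSeconds;
            this.requiredTasks = requiredTasks;
        }

        /** A probe is eligible only if the device supports all required tasks. */
        boolean canRunOn(Device device) {
            return device.supportedTasks.containsAll(requiredTasks.keySet());
        }

        /** One execution: run every required task on each eligible device. */
        void runOnce(List<Device> devices, DataCollector collector) {
            for (Device device : devices) {
                if (!canRunOn(device)) {
                    continue;
                }
                for (Map.Entry<String, Task> entry : requiredTasks.entrySet()) {
                    int value = entry.getValue().run(device.name);
                    collector.collect(device.name, entry.getKey(), value);
                }
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Task> tasks = new HashMap<>();
        tasks.put("FreeHeapTask", deviceName -> 42); // dummy value, no network access
        Probe probe = new Probe(120, tasks);
        List<Device> devices = Arrays.asList(
                new Device("app-server", "FreeHeapTask"),
                new Device("router", "Mib2Task"));
        // A stdout collector: only app-server supports the required task.
        probe.runOnce(devices, (device, stat, value) ->
                System.out.println(device + " " + stat + "=" + value));
    }
}
```

In this sketch the scheduler that would call runOnce every intervalSeconds is left out on purpose; the point is only that a probe is matched to devices by the tasks they support.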
ERM is just a collection of services (like an SNMP engine or a TFTP server) that are able to run tasks and send their
results wherever needed. Right now, included tasks are:
    CREATE TABLE "OUTAGES" (
        "DEVICE"       VARCHAR(80)   NOT NULL,
        "CATEGORY"     VARCHAR(80)   NOT NULL,
        "STAT_NAME"    VARCHAR(80)   NOT NULL,
        "SUBJECT"      VARCHAR(250)  NOT NULL,
        "REASON"       VARCHAR(4000),
        "COLLECT_DATE" TIMESTAMP     NOT NULL
    );

    CREATE TABLE "STATS" (
        "DEVICE"       VARCHAR(80)   NOT NULL,
        "CATEGORY"     VARCHAR(80)   NOT NULL,
        "STAT_NAME"    VARCHAR(80)   NOT NULL,
        "SUBJECT"      VARCHAR(250)  NOT NULL,
        "VALUE"        INTEGER,
        "COLLECT_DATE" TIMESTAMP     NOT NULL
    );
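As a rough idea of how a row could be written to the STATS table through plain JDBC, here is a small sketch. It is an assumption about usage, not ERM's own data collector code: the class StatsWriterSketch, the method insertStat and the way the connection is obtained are all hypothetical; only the table and column names come from the schema above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.sql.Types;

// Hypothetical helper, not part of ERM.
class StatsWriterSketch {

    // Column order follows the CREATE TABLE "STATS" statement.
    static final String INSERT_STAT =
            "INSERT INTO \"STATS\" (\"DEVICE\", \"CATEGORY\", \"STAT_NAME\", "
            + "\"SUBJECT\", \"VALUE\", \"COLLECT_DATE\") VALUES (?, ?, ?, ?, ?, ?)";

    /** Inserts one collected value; a null value is allowed because "VALUE" is nullable. */
    static void insertStat(Connection connection, String device, String category,
                           String statName, String subject, Integer value,
                           Timestamp collectDate) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(INSERT_STAT)) {
            statement.setString(1, device);
            statement.setString(2, category);
            statement.setString(3, statName);
            statement.setString(4, subject);
            if (value == null) {
                statement.setNull(5, Types.INTEGER);
            } else {
                statement.setInt(5, value);
            }
            statement.setTimestamp(6, collectDate);
            statement.executeUpdate();
        }
    }
}
```

The caller is expected to supply an open java.sql.Connection for whatever database holds these tables; the JDBC driver and URL depend on your setup and are not shown.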
You are welcome to write your own extensions! Simply start by having a look at the classes already written. Extensions should be as easy to write as possible, so if you encounter problems, please let me know.