I was interested in generating graphs within the MRTG/Routers2 monitoring system that display the number of hops for an IP connection through the Internet. In my opinion, it's interesting to see the different routing run times/hop counts, e.g., for remote offices that are connected via dynamic ISP connections such as DSL. Therefore, I wrote a small script that executes a traceroute command and can be called from MRTG.
Traceroute Script
Here comes my script “traceroute2mrtg”, which needs the destination host as a parameter: traceroute2mrtg www.webernetz.net . It calls the “traceroute” command with a few options and stores the hop count in a variable. If the destination was not reachable for traceroute, the maximum hop count of “30” is changed to “0” so that the MRTG graphs show *nothing* instead of *30* in this case.
I have moved this script to “/usr/local/bin” and, of course, made it executable: sudo chmod u+x traceroute2mrtg .
#!/bin/bash
#######################################################################
#Author: Johannes Weber (johannes@webernetz.net)                      #
#Homepage: https://weberblog.net                                      #
#Last Modified: 2014-01-07                                            #
#######################################################################

#Only one parameter for this tool: The destination
dest=$1

#Basic traceroute with:
#-I for using ICMP messages instead of UDP (needs root privileges)
#-n since the DNS lookups are not needed here
#-w 2 for only waiting 2 seconds until response for not slowing down MRTG too much
#Error messages into 2>/dev/null
#Piped into "tail -1" to have only the last line
#Piped to awk to return only the number of hops
hops=$(traceroute -I -n -w 2 "$dest" 2>/dev/null | tail -1 | awk '{print $1}')

#If the destination is unreachable, traceroute stops after 30 hops.
#In order to have an unreachable host displayed as "0" and not as "30", the value is changed in that case:
#(The quotes keep the test from failing if traceroute returned nothing.)
if [ "$hops" = "30" ]
then
    hops=0
fi

#Output for MRTG (which basically needs 4 lines)
echo "$hops"
echo 0
#echo no uptime to report here
#echo traceroute $dest
Note that the reported values are NOT the actual hop count to the destination since the last line in a traceroute reveals the final destination. That is, the real hop count would be the reported value decremented by one. However, I am not changing this value by “-1” because I do not want to confuse myself when comparing the MRTG graphs with some other traceroute tests.
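To try out the parsing without root privileges or a live traceroute, the same logic can be wrapped in a function that reads traceroute output from stdin. This is only a sketch: the function name `hops_from_trace` is made up for illustration, and treating empty output as unreachable is an assumption that goes beyond the script above.

```shell
#!/bin/bash
# Hypothetical helper, not part of traceroute2mrtg: reads traceroute output
# on stdin and prints the hop count the way the script reports it.
hops_from_trace() {
    local max_hops="${1:-30}"   # traceroute's default hop limit
    local hops
    # The first field of the last line is the number of the last hop probed
    hops=$(tail -n 1 | awk '{print $1}')
    # Assumption: "no output at all" is treated like "hop limit reached"
    if [ -z "$hops" ] || [ "$hops" = "$max_hops" ]; then
        hops=0
    fi
    echo "$hops"
}
```

It could be fed with something like `traceroute -I -n -w 2 "$dest" 2>/dev/null | hops_from_trace`. For a non-zero result, subtracting one gives the number of routers in between.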
Here is an example of my script:
weberjoh@jw-vm01:~$ sudo traceroute2mrtg www.webernetz.net
11
0
MRTG/Routers2 Config
Following is my MRTG/Routers2 configuration part for the hop count. It mainly calls my script with the destination host “domain.name”.
Target[foobar_traceroute]: `/usr/local/bin/traceroute2mrtg domain.name`
Title[foobar_traceroute]: Traceroute Hop Counts to domain.name
#MaxBytes: Since the standard traceroute command stops after 30 hops, this value fits for the MaxBytes, too.
MaxBytes[foobar_traceroute]: 30
Options[foobar_traceroute]: gauge
Colours[foobar_traceroute]: BROWN#660000, YELLOW#FFD600, BLACK#000000, ORANGE#FC7C01
YLegend[foobar_traceroute]: Number of Hops
Legend1[foobar_traceroute]: Hops
Legend3[foobar_traceroute]: Peak Hops
LegendI[foobar_traceroute]: Hops:
ShortLegend[foobar_traceroute]:
routers.cgi*ShortDesc[foobar_traceroute]: Traceroute Hop Count
routers.cgi*Options[foobar_traceroute]: fixunit integer maximum nomax noo nopercentile nototal
routers.cgi*Icon[foobar_traceroute]: graph-sm.gif
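If several destinations are monitored, the core lines of such a block could be generated instead of copied by hand. The following is a rough sketch and not part of my setup: the function name `mrtg_traceroute_stanza` and the example host are made up, and only a subset of the lines above is printed.

```shell
#!/bin/bash
# Hypothetical generator: prints the basic MRTG lines for one destination.
# $1 = short name used in the target label, $2 = destination host
mrtg_traceroute_stanza() {
    local name="$1" host="$2"
    # Single quotes keep the backticks literal, as MRTG expects them
    printf 'Target[%s_traceroute]: `/usr/local/bin/traceroute2mrtg %s`\n' "$name" "$host"
    printf 'Title[%s_traceroute]: Traceroute Hop Counts to %s\n' "$name" "$host"
    printf 'MaxBytes[%s_traceroute]: 30\n' "$name"
    printf 'Options[%s_traceroute]: gauge\n' "$name"
    printf 'YLegend[%s_traceroute]: Number of Hops\n' "$name"
}
```

For example, `mrtg_traceroute_stanza office1 office1.example.net >> traceroute.cfg` would append one block to a (hypothetical) config file; the Routers2-specific lines would still have to be added.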
Note that I specified the “maximum” option. That is, the weekly, monthly, and yearly graphs will show the maximum hop count for their ranges and NOT the averages with a maximum line above them. This is more useful if certain destinations are down for a longer time and MRTG stores “0”. If the maximum option were not specified, the average values for a week would be much lower than the realistic ones if, for example, the destination was not reachable for a few days.
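A quick calculation illustrates the difference. This is a simplified sketch with made-up sample values (one value per line), not how MRTG consolidates its data internally; the helper name `avg_and_max` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: prints the average and the maximum of the hop counts
# read from stdin (one value per line).
avg_and_max() {
    # LC_ALL=C keeps the decimal point independent of the system locale
    LC_ALL=C awk 'NF { sum += $1; n++; if ($1 > max) max = $1 }
        END { if (n > 0) printf "avg=%.1f max=%d\n", sum / n, max }'
}

# Seven samples, three of them "0" because the destination was down
printf '%s\n' 15 15 0 0 0 15 15 | avg_and_max
```

With these values the average drops to 8.6 although every successful run measured 15 hops, while the maximum stays at 15.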
Also note that the average calculations on the graphs (the middle column with text) are NOT realistic if a destination is not reachable for a certain amount of time. It is only meaningful if the destination was reachable all the time. This behaviour cannot be changed here since there is no option to tell MRTG/Routers2 to “not display the Avg values if the values return a zero”.
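Outside of MRTG/Routers2, a more meaningful average can be calculated by simply skipping the zeros. The sketch below assumes the hop counts are available one per line on stdin; how they are exported from the monitoring system is not covered here, and the function name `avg_reachable` is made up.

```shell
#!/bin/bash
# Hypothetical helper: average of the hop counts read from stdin, ignoring
# the zeros that traceroute2mrtg reports for unreachable destinations.
avg_reachable() {
    # LC_ALL=C keeps the decimal point independent of the system locale
    LC_ALL=C awk '$1 > 0 { sum += $1; n++ }
        END { if (n > 0) printf "%.1f\n", sum / n; else print "n/a" }'
}
```

For the samples 15, 15, 0, 0, 16 this prints 15.3, whereas the plain average over all five values would be 9.2.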
Sample Graphs
Here are a few examples from my monitoring system. The first one reveals that the hop count to the destination changes incessantly between 15 and 16 hops. Here, the calculated average value of 15 makes sense:
The second one presents the hop count to a dynamic ISP connection which restarts every night. The hop count switches between 13 and 15 hops:
The last one shows the hop counts to one of the domains of the NTP Pool Project, in this case, 0.de.pool.ntp.org . This DNS name changes its IP addresses by round-robin every 150 seconds. Since my MRTG installations run every 5 minutes, I am getting a new IP address on every run. Of course, all destinations have different hop counts, mainly between 10 and 14 hops. However, some of these nodes do not reply to the ping requests, which results in values of “0” in the graphs. That is, the calculated average values do NOT make sense here.