/usr/lib/mon/mon.d/smtp3.monitor is in mon 1.2.0-4.
This file is owned by root:root, with mode 0o755.
The actual contents of the file can be viewed below.
#!/usr/bin/perl
# Yet another smtp monitor using IO::Socket with timing, logging
# This version looks deeper than the banner to catch milter and other problems
#
# $Id: smtp3.monitor,v 1.2.2.1 2007/06/03 20:07:16 trockij Exp $
#
# Copyright (C) 2001-2006, Jon Meek, meekj at ieee.org
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
=head1 NAME
B<smtp3.monitor> - smtp monitor for mon with timing, logging, optional MX lookup, and diagnostic capability.
=head1 DESCRIPTION
An SMTP monitor using IO::Socket with connection response timing and
optional logging. This test is reasonably complete. Following the
greeting banner from the SMTP server the monitor client issues the
HELO and MAIL commands then closes the session with a QUIT
command. Early versions of this monitor simply looked at the initial
greeting banner, but that did not detect certain temporary failure
conditions.
While configuring mon for this monitor, keep in mind that a busy mail
server may reject new connections.
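To illustrate why looking past the banner matters, here is a minimal sketch (not the monitor's own code) that judges each step of the dialogue by the code on the final line of the reply. The helper name final_reply_code and the canned replies are hypothetical. The monitor itself does a single read per step and matches the start of the response.

```perl
use strict;
use warnings;

# Hypothetical helper: return the three-digit code from the last line of an
# SMTP reply, or undef if the reply is empty or ends on a continuation line.
sub final_reply_code {
    my ($reply) = @_;
    my @lines = split /\r?\n/, (defined $reply ? $reply : '');
    return undef unless @lines;
    # "NNN-text" marks a continuation line; "NNN text" (or bare "NNN") ends the reply.
    return $lines[-1] =~ /^(\d{3})(?: |$)/ ? $1 : undef;
}

# Canned replies (illustrative only) for the banner, HELO, MAIL, QUIT sequence.
my @steps = (
    [ 'banner', "220 mail.example.org ESMTP ready\r\n", 220 ],
    [ 'HELO',   "250 mail.example.org Hello\r\n",       250 ],
    [ 'MAIL',   "451 4.7.1 Please try again later\r\n", 250 ],
    [ 'QUIT',   "221 Bye\r\n",                          221 ],
);

for my $step (@steps) {
    my ($name, $reply, $want) = @$step;
    my $got = final_reply_code($reply);
    my $ok  = defined $got && $got == $want;
    printf "%-6s %s\n", $name,
        $ok ? 'ok' : 'FAILED (' . (defined $got ? $got : 'no code') . ')';
    last unless $ok;    # a real client would send QUIT here and stop
}
```

The canned MAIL reply is a 451, so the loop stops at MAIL even though the banner and HELO steps succeeded. A banner-only check would miss this kind of temporary failure.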
=head1 SYNOPSIS
B<smtp3.monitor> [-d] [-l log_file_YYYYMM.log] [--timeout timeout_seconds] [--alarmtime alarm_time] [--maxfailtime seconds] [--mx] [--esmtp] [--requiretls] [--nofail] [--from user@domain.com] [--to r1@d1.com,r2@d2.edu] [--size nnnnn] [--port nn] host host1 host2 ...
=head1 OPTIONS
=over 5
=item B<-d>
Debug/Diagnostic mode. Useful for manual command line use
for diagnosing mail delivery problems. To determine whether a mail destination
will accept mail, the --mx flag will be useful.
=item B<--timeout timeout>
Connect timeout in seconds.
=item B<--alarmtime alarm_timeout>
Alarm if connect is successful but took longer than alarm_timeout
seconds.
=item B<--maxfailtime seconds>
When a connect fails, alarm only if the failure took longer than this
value. If a Sendmail server is in REFUSE_LA, or similar, state due to
load it will usually reject the connection in a few milliseconds. A
typical value might be 0.050 for servers near the monitoring system.
=item B<-l log_file_template> /path/to/logs/smtp_YYYYMM.log
The current year and month are substituted for YYYYMM; that is the only
supported template at this time.
=item B<--mx>
Look up the MX records for the domains/hosts and test them in
preference order. The first successful test will be considered a
success for that domain. This was originally devised for manual
command line use as a tool to verify that mail stuck in outbound
queues really can not be delivered. It could be used with mon as well,
however you are usually going to want to test ALL of your smtp
servers, not just be sure that one of them is OK. --mx applies to all
of the domains/hosts listed on the command line.
=item B<--esmtp>
Try ESMTP before SMTP.
=item B<--requiretls>
Check that STARTTLS is offered, fail if it is not. This option forces B<--esmtp>.
=item B<--nofail>
Never provide a failure return to mon. Useful in certain testing environments
when logging.
=item B<--port nnn>
Specify a port to use. Defaults to 25.
=back
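The way --alarmtime and --maxfailtime interact can be summarized as a small decision function. This is only a sketch of the behavior documented above. The name classify_result, its named arguments, and the returned labels are hypothetical and do not appear in the monitor.

```perl
use strict;
use warnings;

# Hypothetical helper: decide how one connection attempt should be reported.
#   connected   - true if the TCP connect succeeded
#   elapsed     - seconds spent on the attempt
#   alarmtime   - optional --alarmtime value (seconds)
#   maxfailtime - optional --maxfailtime value (seconds)
sub classify_result {
    my (%arg) = @_;
    if ($arg{connected}) {
        # A successful but slow connect is still reported as a failure.
        return 'slow' if $arg{alarmtime} && $arg{elapsed} > $arg{alarmtime};
        return 'ok';
    }
    # A very fast refusal is treated as load shedding and is not alarmed on.
    return 'ignored' if $arg{maxfailtime} && $arg{elapsed} <= $arg{maxfailtime};
    return 'failed';
}

print classify_result(connected => 1, elapsed => 0.2,   alarmtime   => 30),    "\n";
print classify_result(connected => 1, elapsed => 45,    alarmtime   => 30),    "\n";
print classify_result(connected => 0, elapsed => 0.003, maxfailtime => 0.050), "\n";
print classify_result(connected => 0, elapsed => 2.5,   maxfailtime => 0.050), "\n";
```

Only the 'slow' and 'failed' outcomes would be reported to mon as failures.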
=head1 MON CONFIGURATION EXAMPLE
 hostgroup smtp mail1.mymails.org mail2.mymails.org
                mail3.mymails.org

 watch smtp
     service smtp_check
         interval 5m
         monitor smtp3.monitor --timeout 70 --alarmtime 30 -l /n/na1/logs/wan/smtp_YYYYMM.log
         period wd {Sun-Sat}
             alert mail.alert meekj@mymails.org
             alertevery 1h summary
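The -l argument in the example contains the YYYYMM template. A minimal sketch of how such a template expands into a monthly log file name follows. The function name expand_log_template is hypothetical; the substitution mirrors what the monitor does with the current time.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the first YYYYMM in a template with the
# year and month of the given epoch time (local time zone).
sub expand_log_template {
    my ($template, $epoch) = @_;
    my ($mon, $year) = (localtime($epoch))[4, 5];   # month is 0-based, year is offset from 1900
    my $yyyymm = sprintf('%04d%02d', $year + 1900, $mon + 1);
    (my $name = $template) =~ s/YYYYMM/$yyyymm/;
    return $name;
}

print expand_log_template('/n/na1/logs/wan/smtp_YYYYMM.log', time), "\n";
```

Because the name depends on the month, a new log file starts automatically at each month boundary without any rotation setup.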
=head1 LOG FILE FORMAT
A normal log entry has the format:
measurement_time smtp_host_name connect_time
A failed connection log entry contains:
measurement_time smtp_host_name connect_time smtp_code_and_banner (or connect_error)
Where:
F<measurement_time> - Is the time of the connection attempt in seconds since 1970
F<smtp_host_name> - Is the name of the smtp server that was tested. If
--mx was selected then this field is mail_domain=mx_server, where
mail_domain is the mail domain (host) from the command line and
mx_server is the MX host that was tested.
F<connect_time> - Is the time from the connect request until the SMTP
greeting appeared, in seconds with 100 microsecond resolution. If the
connection failed, the time spent waiting for the connection is logged as a
negative number.
F<smtp_code_and_banner> - Should have the SMTP response code integer
followed by the greeting banner if there was a problem.
F<connect_error> - If present may indicate "Connect failed" meaning
that the connect attempt failed immediately, possibly due to a DNS
lookup error or because the server is not running any service on port
25. The field may also be "Connect timeout" indicating that the
connect failed after the set timeout period.
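For post-processing, a log line can be split back into its fields. The following is a sketch under the format described above. The function name parse_log_line, the returned hash keys, and the sample lines are hypothetical. It splits on single spaces so that an empty connect_time field, which some failure paths may leave blank, does not shift the remaining fields.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one log line into a hash reference.
sub parse_log_line {
    my ($line) = @_;
    chomp $line;
    my ($time, $name, $connect, $detail) = split / /, $line, 4;
    return unless defined $name;
    $connect = '' unless defined $connect;
    $detail  = '' unless defined $detail;
    # With --mx the host field has two parts joined by "=".
    my ($domain, $server) = $name =~ /=/ ? split(/=/, $name, 2) : (undef, $name);
    # Treat a blank or negative time, or any trailing detail, as a failure.
    my $failed = ($connect eq '' || $connect < 0 || $detail ne '') ? 1 : 0;
    return {
        time => $time, domain => $domain, server => $server,
        connect_time => $connect, detail => $detail, failed => $failed,
    };
}

# Sample lines, made up for illustration.
for my $line ("1181908800 mail1.mymails.org 0.0421 ",
              "1181908800 mail2.mymails.org -30.0003 Connect timeout") {
    my $rec = parse_log_line($line);
    printf "%s %s %s\n", $rec->{server}, $rec->{failed} ? 'FAILED' : 'ok', $rec->{detail};
}
```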
=head1 BUGS
It should be possible to specify --esmtp and --requiretls on a per-host basis.
An SMTP temporary failure code could cause the monitor to retry the connection
a certain number of times.
It is not yet possible to specify the username / domain for the HELO and
MAIL commands, but it would be very simple to add.
=head1 REQUIRED NON-STANDARD PERL MODULES
IO::Socket
Time::HiRes
Net::DNS (only if --mx option will be used)
If you do not have Time::HiRes you can choose to comment out the lines
that refer to F<gettimeofday> and F<tv_interval> but several features will be lost.
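As a sketch of what those two functions provide, the fragment below times an arbitrary operation and formats the result the way the log does, with a leading minus sign when the operation fails. The wrapper name timed is hypothetical and is not part of the monitor.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: run a code reference, measure the elapsed wall-clock
# time, and return it formatted to four decimal places (100 microseconds).
# The value is prefixed with "-" if the code died.
sub timed {
    my ($code) = @_;
    my $t0 = [gettimeofday];                      # start the clock
    my $ok = eval { $code->(); 1 } ? 1 : 0;       # true unless $code died
    my $dt = tv_interval($t0, [gettimeofday]);    # seconds as a float
    return sprintf('%s%0.4f', $ok ? '' : '-', $dt);
}

print timed(sub { Time::HiRes::sleep(0.05) }), "\n";   # successful operation
print timed(sub { die "refused\n" }),          "\n";   # failed operation
```

Without Time::HiRes only whole-second timing from the built-in time function would be available, which is too coarse for the millisecond-scale values discussed for --maxfailtime.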
=head1 AUTHOR
Jon Meek, meekj at ieee.org
$Id: smtp3.monitor,v 1.2.2.1 2007/06/03 20:07:16 trockij Exp $
=cut
use English;
use Sys::Hostname;
use Getopt::Long;
use IO::Socket;
use Time::HiRes qw( gettimeofday tv_interval );
$RCSid = q{$Id: smtp3.monitor,v 1.2.2.1 2007/06/03 20:07:16 trockij Exp $ };
$ESMTP = 0;
$RequireTLS = 0;
GetOptions ('mx' => \$UseMX,
'd' => \$opt_d,
'esmtp' => \$ESMTP,
'requiretls' => \$RequireTLS,
'timeout=i' => \$TimeOut,
't=i' => \$TimeOut,
'alarmtime=i' => \$opt_T,
'maxfailtime=f' => \$MaxFailTime,
'T=i' => \$opt_T,
'logfile=s' => \$opt_l,
'l=s' => \$opt_l,
'nofail' => \$NoFail,
'size=i' => \$MessageSize,
'port=i' => \$Port,
'from=s' => \$FromAddress,
'to=s' => \$ToAddresses,
);
$ESMTP = 1 if $RequireTLS;
if ($UseMX) { # Will need Net::DNS Module, but don't require the module if it won't be used
eval "use Net::DNS";
do {
warn "Couldn't load Net::DNS: $@";
undef $UseMX;
} unless ($@ eq '');
$Resolver = new Net::DNS::Resolver;
}
$Port = 'smtp(25)' unless $Port;
$TimeOut = 30 unless $TimeOut; # Default timeout in seconds
$dt = 0; # Initialize connect time variable
@Failures = (); # Initialize failure list
$TimeOfDay = time; # Current time
print "TimeOfDay: $TimeOfDay\n" if $opt_d;
#
# Get the process username and the hostname of the monitor machine
#
$MonitorUsername = getpwuid($UID);
$MonitorHostname = hostname;
$host_address = gethostbyname($MonitorHostname);
$MonitorHostname = gethostbyaddr($host_address, AF_INET);
$FromAddress = qq{$MonitorUsername\@$MonitorHostname} unless $FromAddress;
print " From: $FromAddress\n" if $opt_d;
print " TimeOut: $TimeOut\n" if $opt_d;
#
# Check each host, or MX record
#
foreach $host (@ARGV) {
print "Check: $host\n" if $opt_d;
#
# Get the MX records, if we need them
#
if ($UseMX) {
undef %MXval;
undef @MXorder;
@mx = mx($Resolver, $host);
if (@mx) {
foreach $rr (@mx) {
$preference = $rr->preference;
$mxrecord = $rr->exchange;
$MXval{$mxrecord} = $preference;
}
} else {
print "can't find MX records for $host: ", $Resolver->errorstring, "\n" if $opt_d;
push(@Failures, $host); # Call it a failure
$FailureDetail{$host} = "Can't find MX records";
next;
}
#
# Sort the MX records into preference order
#
print "MX records for $host:\n" if $opt_d;
foreach $k (sort {$MXval{$a} <=> $MXval{$b}} keys %MXval) {
$Arecord = ''; # Clear for this MX
push(@MXorder, $k);
if ($opt_d) { # If in debug/verbose mode lookup A record
$name = $k . '.'; # Append dot for absolute lookup
if ($packet = $Resolver->search($name)) {
@answer = $packet->answer;
foreach $rr (@answer) {
$address = '';
$name = $rr->name;
$type = $rr->type;
$address = $rr->address if ($type eq 'A');
$Arecord .= "$type: $address "; # Append, in case some other records are found
}
} else {
$Arecord = "Could not find A record for $name";
}
}
printf " %3d - %s %s\n", $MXval{$k}, $k, $Arecord if $opt_d;
}
}
#
# Now actually do the smtp check
#
if ($UseMX && @mx) { # Check MX records, stop after first success
foreach $mx (@MXorder) {
$HostPlusMX = "$host=$mx";
push(@HostNames, $HostPlusMX);
$TestTime{$HostPlusMX} = time;
print "Checking $HostPlusMX\n" if $opt_d;
$result = &CheckSMTP($HostPlusMX);
last if ($result);
}
} else { # Regular host check
push(@HostNames, $host);
$TestTime{$host} = time;
$result = &CheckSMTP($host);
}
}
if ($opt_d) {
foreach $host (sort @HostNames) {
print "$TestTime{$host} $host $ConnectTime{$host} $InitialBanner{$host}\n";
# ($shortfail, $rest) = split(/\n/, $InitialBanner{$host}, 2);
# print "$TestTime{$host} $host $ConnectTime{$host} $shortfail\n";
}
}
# Write results to logfile, if -l
if ($opt_l) {
# Determine logfile name, usually based on year/month
$LogFile = $opt_l;
($sec,$min,$hour,$mday,$Month,$Year,$wday,$yday,$isdst) =
localtime($TimeOfDay);
$Month++;
$Year += 1900;
$YYYYMM = sprintf('%04d%02d', $Year, $Month);
$LogFile =~ s/YYYYMM/$YYYYMM/; # Fill in current year and month
open(LOG, ">>$LogFile") || warn "$0 Can't open logfile: $LogFile\n";
foreach $host (sort @HostNames) {
$FailureDetail{$host} =~ s/\n/ /g; # Put it on one line, but result may be too long
$FailureDetail{$host} =~ s/ $//; # Trim final space
# ($shortfail, $rest) = split(/\n/, $FailureDetail{$host}, 2);
# print LOG "$TestTime{$host} $host $ConnectTime{$host} $shortfail\n";
print LOG "$TestTime{$host} $host $ConnectTime{$host} $FailureDetail{$host}\n";
}
close LOG;
}
if (@Failures == 0) { # Indicate "all OK" to mon
exit 0;
}
#
# Otherwise we have one or more failures
#
@SortedFailures = sort @Failures;
print "@SortedFailures\n";
foreach $host (@SortedFailures) {
print "$host $ConnectTime{$host} $FailureDetail{$host}\n";
}
print "\n";
exit 0 if $NoFail; # Never indicate failure if $NoFail is set
exit 1; # Indicate failure to mon
sub CheckSMTP {
my $host = shift;
my ($t1, $t2, $dt, $mx_name, $stripped_host); # Parentheses needed to declare the whole list
my $Failure = 0; # Flag to indicate failure for return code
# return 0 may not be working inside eval
my $buflength = 1024;
if ($host =~ /=/) { # Have MX data
($mx_name, $stripped_host) = split(/=/, $host);
} else {
$stripped_host = $host;
}
#
# Use eval/alarm to handle timeout
#
eval {
local $SIG{ALRM} = sub { die "timeout\n" }; # Alarm handler
alarm($TimeOut); # Do a SIG_ALRM in $TimeOut seconds
$t1 = [gettimeofday]; # Start connection timer, then connect
my $sock = IO::Socket::INET->new(PeerAddr => $stripped_host,
PeerPort => $Port,
Proto => 'tcp');
if (defined $sock) { # Connection succeeded
$in = '';
$bytes = sysread($sock, $in, $buflength); # Handle multi-line banners
$InitialBanner{$host} = $in;
$t2 = [gettimeofday]; # Stop clock
print " Banner: $InitialBanner{$host}\n" if $opt_d;
if ($InitialBanner{$host} !~ /^220/) { # Consider "220 Service ready" to be only valid
push(@Failures, $host); # Note failure
if (length($InitialBanner{$host}) == 0) { # Note empty banner
$InitialBanner{$host} = 'null';
}
$FailureDetail{$host} = "BANNER: " . $InitialBanner{$host}; # Save failure banner
$ConnectTime{$host} = -1;
# last;
$Failure = 1;
print "QUIT\r\n" if $opt_d;
print $sock "QUIT\r\n"; # Shutdown connection
close $sock;
return 0;
}
if ($ESMTP) { # Try EHLO first
print "EHLO $MonitorHostname\r\n" if $opt_d;
print $sock "EHLO $MonitorHostname\r\n";
$in = '';
$bytes = sysread($sock, $in, $buflength); # Handle multi-line banners
$EhloResponse{$host} = $in;
print " EHLO resp: $EhloResponse{$host}\n" if $opt_d;
if ($EhloResponse{$host} !~ /^250/) { # Consider "250 Requested mail action okay, completed" to be only valid
push(@Failures, $host); # Note failure
print "EHLO Failure!\n" if $opt_d;
$FailureDetail{$host} = "EHLO: " . $EhloResponse{$host}; # Save failure banner
#last;
$Failure = 1;
print "QUIT\r\n" if $opt_d;
print $sock "QUIT\r\n"; # Shutdown connection
close $sock;
return 0 if $RequireESMTP;
}
if ($RequireTLS && ($EhloResponse{$host} !~ /STARTTLS/)){ # Check TLS advertisement
push(@Failures, $host); # Note failure
$FailureDetail{$host} = "STARTTLS Not Offered ";
$Failure = 1; # return only leaves the eval, so set the flag for the final return code
print "STARTTLS Not Offered!\n" if $opt_d;
print $sock "QUIT\r\n"; # Shutdown connection
close $sock;
return 0;
}
}
if (!$ESMTP or ($ESMTP && $Failure)) {
print $sock "HELO $MonitorHostname\r\n";
$in = '';
$bytes = sysread($sock, $in, $buflength); # Handle multi-line banners
$HeloResponse{$host} = $in;
print " HELO resp: $HeloResponse{$host}\n" if $opt_d;
if ($HeloResponse{$host} !~ /^250/) { # Consider "250 Requested mail action okay, completed" to be only valid
push(@Failures, $host); # Note failure
print "HELO Failure!\n" if $opt_d;
$FailureDetail{$host} = "HELO: " . $HeloResponse{$host}; # Save failure banner
#last;
$Failure = 1;
print "QUIT\r\n" if $opt_d;
print $sock "QUIT\r\n"; # Shutdown connection
close $sock;
return 0;
}
}
$FromLine = qq{MAIL From:<$FromAddress>};
if ($MessageSize) {
$FromLine .= qq{ SIZE=$MessageSize};
}
$FromLine .= qq{\r\n};
print $FromLine if $opt_d;
print $sock $FromLine;
chomp($MailResponse{$host} = <$sock>);
print " MAIL resp: $MailResponse{$host}\n" if $opt_d;
if ($MailResponse{$host} !~ /^250\s+/) { # Consider "250 Requested mail action okay, completed" to be only valid
push(@Failures, $host); # Note failure
$FailureDetail{$host} = "MAIL: " . $MailResponse{$host}; # Save failure banner
#last;
$Failure = 1;
print "QUIT\r\n" if $opt_d;
print $sock "QUIT\r\n"; # Shutdown connection
close $sock;
return 0;
}
if ($ToAddresses) { # Addresses given on command line
(@to_addrs) = split(/,/, $ToAddresses);
foreach $to (@to_addrs) {
$RcptCommand = qq{RCPT TO:<$to>};
print "$RcptCommand\r\n" if $opt_d;
print $sock "$RcptCommand\r\n";
chomp($RcptResponse = <$sock>);
print " RCPT resp: $RcptResponse\n" if $opt_d;
}
}
print "QUIT\r\n" if $opt_d;
print $sock "QUIT\r\n"; # Shutdown connection
close $sock;
$dt = tv_interval ($t1, $t2); # Compute connection time
$ConnectTime{$host} = sprintf("%0.4f", $dt); # Format to 100us resolution
if ($opt_T) { # Check for slow response
if ($dt > $opt_T) {
push(@Failures, $host); # Call it a failure
$FailureDetail{$host} = "Slow Connect";
$Failure = 1;
return 0;
}
}
} else { # Connection failed
$t2 = [gettimeofday]; # Stop clock
$dt = tv_interval ($t1, $t2); # Compute connection time
$ConnectTime{$host} = sprintf("-%0.4f", $dt); # Format to 100us resolution, -val if failure
print " Connect to $host failed\n" if $opt_d;
if ($MaxFailTime) {
if ($dt <= $MaxFailTime) { # Don't alarm on connection refusals due to server load
$Failure = 0;
return 1;
}
}
push(@Failures, $host); # Save failed host
$FailureDetail{$host} = "Connect failed";
$Failure = 1;
return 0;
}
};
alarm(0); # Stop alarm countdown
if ($@ =~ /timeout/) { # Detect timeout failures
$t2 = [gettimeofday]; # Stop clock
$dt = tv_interval ($t1, $t2); # Compute connection time
$ConnectTime{$host} = sprintf("-%0.4f", $dt); # Format to 100us resolution, -val if timeout
push(@Failures, $host);
print " Connect to $host timed-out\n" if $opt_d;
$FailureDetail{$host} = "Connect timeout";
$Failure = 1;
return 0;
}
if ($Failure) { # Important when an MX record list is being checked
return 0;
} else {
return 1;
}
}
__END__
SMTP Reply Codes From RFC-821 - may use in the future
211 System status, or system help reply
214 Help message
[Information on how to use the receiver or the meaning of a
particular non-standard command; this reply is useful only
to the human user]
220 <domain> Service ready
221 <domain> Service closing transmission channel
250 Requested mail action okay, completed
251 User not local; will forward to <forward-path>
354 Start mail input; end with <CRLF>.<CRLF>
421 <domain> Service not available,
closing transmission channel
[This may be a reply to any command if the service knows it
must shut down]
450 Requested mail action not taken: mailbox unavailable
[E.g., mailbox busy]
451 Requested action aborted: local error in processing
452 Requested action not taken: insufficient system storage
500 Syntax error, command unrecognized
[This may include errors such as command line too long]
501 Syntax error in parameters or arguments
502 Command not implemented
503 Bad sequence of commands
504 Command parameter not implemented
550 Requested action not taken: mailbox unavailable
[E.g., mailbox not found, no access]
551 User not local; please try <forward-path>
552 Requested mail action aborted: exceeded storage allocation
553 Requested action not taken: mailbox name not allowed
[E.g., mailbox syntax incorrect]
554 Transaction failed
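One possible future use of this table is to separate temporary from permanent failures. In RFC 821 the first digit of a reply code gives its class, so a sketch needs only that digit. The function name reply_class and the %CLASS table are hypothetical and not part of the monitor.

```perl
use strict;
use warnings;

# Reply classes by first digit, as defined in RFC 821.
my %CLASS = (
    1 => 'positive preliminary',
    2 => 'positive completion',
    3 => 'positive intermediate',
    4 => 'transient negative',
    5 => 'permanent negative',
);

# Hypothetical helper: classify a reply line by the first digit of its code.
sub reply_class {
    my ($reply) = @_;
    return 'unknown' unless defined $reply && $reply =~ /^([1-5])\d\d/;
    return $CLASS{$1};
}

for my $reply ('250 OK', '421 Service not available', '550 mailbox unavailable') {
    printf "%-28s %s\n", $reply, reply_class($reply);
}
```

A monitor could retry on a 'transient negative' reply and alarm immediately on a 'permanent negative' one.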