/usr/share/check_mk/checks-man/diskstat is in check-mk-server 1.2.8p16-1ubuntu0.1.
This file is owned by root:root, with mode 0o644.
The actual contents of the file can be viewed below.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 | title: Disk throughput
agents: linux, freebsd
catalog: os/storage
license: GPL
distribution: check_mk
description:
This check measures the throughput of block devices (disks) on Linux
hosts. You can either have a single check for every single disk
(which is the default) or a summary check summing up the throughput
of all disks.
For legacy reasons it is also possible (but not advisable) to have all disks
summarized but with a separate check for read and write (this is how this
check worked up to version 1.1.10).
The check also gives info on the IO latency and IOPS (unmerged) aquired
from the kernels information in /proc.
You can apply separate warning and critical levels for the read
and write throughput. Optionally you can have the check compute
average values on a configurable time period and have the levels
applied on the average instead of the current values. This makes
it possible to ignore short "peaks" and only trigger and longer
phases of high disk activity.
Averaging is not applied to IO latency calculations.
The check has to provide many ways of configuration for legacy reasons.
We strongly recommend you switch to the rule-based configuration, which
handles anything you want it to.
In single disk mode the check now also supports being used in a Check_MK
cluster environment. In this case it goes to CRIT if it finds more than
one disk item in the cluster. If it finds only one disk item, it also prints
the node name the disk was found on.
SUMMARY and legacy read/write modes are not supported in a cluster
configuration. The check goes to UNKNOWN and prints an appropriate message
if configured like that.
item:
Either {"SUMMARY"} for a summarized check of all disks or the name of the
disk device, e.g. {"sda"}. Additionally also one service per logical volume
defined in Linux LVM and Veritas VxVM on Linux.
In order to support configurations
up to version 1.1.10 also the items {"read"} and {"write"} are supported.
examples:
# switch inventory behaviour to 1.1.10 mode
diskstat_inventory_mode = "legacy"
# alternative: create one check for all disks
diskstat_inventory_mode = "summary"
# Set default levels for diskstat
diskstat_default_levels = {
"read" : (10, 20), # level for read MB/sec
"write" : (20, 40), # level for write MB/sec
"average" : 15, # averaging in minutes
}
# Alternative: just enable averaging over 10 minutes,
# do not apply levels:
diskstat_default_levels = {
"average" : 15
}
# Settings for certain hosts:
check_parameters += [
( {"write" : (20, 50), "average" : 10 }, [ "oracle" ], ALL_HOSTS, [ "Disk IO" ])
]
# New way:
# Enable and configure the inventory behaviour based with rule-based settings
diskstat_inventory_mode = "rule"
diskstat_inventory += [
( ['summary', 'lvm', 'vxvm'], [], ALL_HOSTS ),
]
# Then configure levels for 50/90MB/s read IO over 10 Minutes and a bit less for
# writes. Next put levels on IO latency exceeding 80ms / 160ms.
checkgroup_parameters.setdefault('disk_io', [])
checkgroup_parameters['disk_io'] += [
( {'read': (50.0, 90.0), 'write': (40.0, 60.0), 'average': 10, 'latency_perfdata': True, 'latency': (80.0, 160.0)}, [], ALL_HOSTS, ALL_SERVICES ),
]
perfdata:
The disk throughput for read and write in bytes per second. If averaging
is turned on, then two additional values are sent: the averaged read and
write throughput.
The IO latency is returned if {"latency_perfdata"} is set to True
In the legacy mode only one variable: the throughput since the last check
in in bytes(!) per second, either for read or for write.
inventory:
The inventory is configured via {diskstat_inventory_mode}. If this is set
to {"single"} (the default), then one check will be created for each
disk. If it is set to {"summary"} then only one service per host will be
created that has at least one hard disk. If set to {"legacy"} then a
separate check for read and write will be created (deprecated).
[parameters]
"read": The levels to be applied to the read throughput. It this entry is
{None} or missing in the dictionary, then no levels are applied. This is
the defaut. The values are in MB per second. Setting {"read"} to {(20, 40)}
will warn if 20 MB/s is exceeded and make the check critical at 40 MB/s.
If averaging is turned on, then the levels are applied to the averaged
values!
"write": The levels for the write throughput.
"average": If this is not {None}, it will be interpreted as a time range
in minutes to do averaging over. Set this to {15} in order to have
the levels applied to a 15-minute average instead of the current
values (approx.). Turning the average on will also create two additional
performance values. Make sure that your graphing tool is setup to
a changing number of variables.
[configuration]
diskstat_defaul_levels(dict): The default parameter used for inventorized
checks. This is preset to the empty dictionary.
That means that no averaging is done and no
levels are applied.
diskstat_inventory_mode(string): By default this is now set to {"rule"} for
fine-grained configuration of blockdevices to monitor.
The actual rule is defined in {"diststat_inventory"}.
The following older style parameters are also still available and mapped
to rules internally. Either {"single"} for one service per disk
or {"summary"} for the throughput of all disks summed up in one service.
Also possible is {"legacy"} for the old style mode (see above). Default
is {"summary"}.
diskstat_inventory(list): This is a list of block device types to track.
Possible values are {"summary"}, which creates a summary item showing the IO
rate of all disks, but not other block devices.
If you want statistics for every single disk, you can use {"phyiscal"}.
Enabling {"lvm"} will generate one item for each LVM volume, including snapshot
volumes.
Enabling {"vxvm"} will generate one item for each volume managed by VxVM,
including layered volumes.
|