Collecting Linux kernel crash core dumps with kdump
kdump is a Linux feature that automatically creates a core dump when the kernel crashes. The core dump includes the entire memory of the system at the time of the crash and can be used to diagnose and troubleshoot kernel bugs. It is not a feature most users need.
Requirements
The two most recent core dumps are kept on the filesystem, each of which
can be as large as the system memory. A new dump is written before the
older ones are pruned at the next boot, so in order to reliably collect
core dumps, at least three times the total system memory should be
available in /var/crash. Furthermore, if you choose to install the
kernel debug packages when enabling kdump, as described in the next
section, expect to need additional disk space for the debug package of
each kernel version you keep installed.
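The sizing rule above can be checked before enabling kdump. The following is a minimal sketch, not part of the package: the function names are made up for illustration, and it assumes /proc/meminfo and a POSIX `df` are available.

```shell
# Sketch: check that a crash directory has at least 3x system memory free.
# All function names here are illustrative, not provided by any package.

# Print the space (in KiB) suggested for $1 KiB of RAM:
# two retained dumps plus one being written.
required_crash_kb() {
    echo $(( $1 * 3 ))
}

# Print total system memory in KiB, as reported by /proc/meminfo.
mem_total_kb() {
    awk '/^MemTotal:/ { print $2 }' /proc/meminfo
}

# Print the available space (in KiB) on the filesystem holding directory $1.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Return 0 if directory $1 has room for dumps on a machine with $2 KiB of RAM.
has_room_for_dumps() {
    [ "$(avail_kb "$1")" -ge "$(required_crash_kb "$2")" ]
}
```

For example, `has_room_for_dumps /var/crash "$(mem_total_kb)" || echo "not enough space"` reports a shortfall. The check does not account for the space taken by kernel debug packages.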
Usage
Two new scripts have been introduced as part of the bonding package in
6.1 – /usr/share/bonding/enable-kdump and
/usr/share/bonding/disable-kdump – which enable and disable kdump,
respectively.
The enable-kdump script will install the required packages and configure
the system so that kdump is enabled next time the kernel boots. It
modifies /etc/default/kdump-tools and /etc/default/grub and will
prompt you to confirm the changes to grub. It will also prompt you about
installing the kernel debug package, which is required to troubleshoot
a core dump. The debug package is relatively large when installed: more
than 3 GB.
After enabling or disabling kdump, a reboot is required before any changes take effect.
If the kernel crashes, a core dump is written under the /var/crash
directory. Afterwards, the machine should reboot.
On bootup, all but the two most recent dumps are removed from
/var/crash.
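To preview which dumps fall outside the two most recent, a dry-run listing can help. This is only a sketch of the retention rule described above, not the cleanup code the system actually runs; `stale_dumps` is a hypothetical helper, and it assumes dump directories carry timestamp names such as 201705180045, so that lexical order matches chronological order.

```shell
# Print the dump directories under $1 that fall outside the $2 most recent
# (default 2). Nothing is deleted; this only lists candidates.
stale_dumps() {
    dir=$1
    keep=${2:-2}
    # List timestamp-named subdirectories, oldest first.
    all=$(find "$dir" -mindepth 1 -maxdepth 1 -type d -name '[0-9]*' | sort)
    [ -n "$all" ] || return 0
    total=$(printf '%s\n' "$all" | wc -l)
    stale=$(( total - keep ))
    [ "$stale" -gt 0 ] || return 0
    printf '%s\n' "$all" | head -n "$stale"
}
```

Running `stale_dumps /var/crash` prints one directory per line, oldest first, or nothing when two or fewer dumps exist.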
Verify that kexec is enabled and loaded
We can verify that kdump is in use by reading the contents of
/sys/kernel/kexec_crash_loaded; a value of 1 means a crash kernel is loaded:
# cat /sys/kernel/kexec_crash_loaded
1
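The same check can be scripted, for instance in a post-reboot health check. The sketch below uses hypothetical function names. It also looks for a `crashkernel=` parameter on the kernel command line, on the assumption that this is what the grub change made by enable-kdump configures; the file paths can be overridden for testing.

```shell
# Return 0 if the file at $1 reports a loaded crash kernel (contains "1").
# Defaults to the sysfs path used above.
crash_kernel_loaded() {
    [ "$(cat "${1:-/sys/kernel/kexec_crash_loaded}" 2>/dev/null)" = "1" ]
}

# Return 0 if the kernel command line in $1 carries a crashkernel= parameter.
# Assumption: memory for the crash kernel is reserved through this parameter.
cmdline_reserves_memory() {
    grep -q -E '(^| )crashkernel=' "${1:-/proc/cmdline}" 2>/dev/null
}

# Print a one-line status; return non-zero if kdump does not look ready.
# $1 and $2 optionally override the sysfs and cmdline paths.
kdump_status() {
    if ! cmdline_reserves_memory "$2"; then
        echo "no crashkernel= parameter on the kernel command line"
        return 1
    fi
    if ! crash_kernel_loaded "$1"; then
        echo "crash kernel not loaded"
        return 1
    fi
    echo "kdump ready"
}
```

Calling `kdump_status` with no arguments inspects the running system. If it reports a missing parameter shortly after enabling kdump, the machine has probably not been rebooted yet.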
Provoking a kernel crash
This step is not necessary for enabling kdump; it is only for testing and should not be attempted while customers are using this node.
We can deliberately provoke a kernel crash. If kdump is enabled, it will create a core dump before rebooting.
Note
In the event of a kernel crash, induced or otherwise, every service on the machine stops working until the core dump is created and the machine reboots.
To crash the kernel, execute the following as root:
# echo 1 > /proc/sys/kernel/sysrq
# echo c > /proc/sysrq-trigger
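Because a crash without a loaded crash kernel produces no dump, it can be worth guarding the two commands above. The wrapper below is a sketch only: `crash_if_kdump_ready` is a hypothetical name, and the optional path arguments exist purely so the logic can be exercised against ordinary files instead of the real kernel interfaces.

```shell
# Trigger a test crash only if a crash kernel is loaded and the caller
# passes the literal argument "yes".
# $2, $3 and $4 optionally override the kernel interface paths.
crash_if_kdump_ready() {
    confirm=$1
    loaded=${2:-/sys/kernel/kexec_crash_loaded}
    trigger=${3:-/proc/sysrq-trigger}
    sysrq=${4:-/proc/sys/kernel/sysrq}
    if [ "$confirm" != "yes" ]; then
        echo "refusing: pass 'yes' to confirm the crash" >&2
        return 2
    fi
    if [ "$(cat "$loaded" 2>/dev/null)" != "1" ]; then
        echo "refusing: no crash kernel loaded, no dump would be written" >&2
        return 1
    fi
    # Same two steps as above: enable SysRq, then request a crash.
    echo 1 > "$sysrq"
    echo c > "$trigger"
}
```

Run as root, `crash_if_kdump_ready yes` crashes the machine immediately when kdump is loaded, so the warning in the note above applies in full.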
If kdump is loaded, a core dump should be written to the /var/crash
directory. We can verify this after the machine reboots:
# ls -lh /var/crash/
total 8.0K
drwxr-xr-x 2 root root 4.0K May 18 00:46 201705180045
-rw-r--r-- 1 root root 306 May 18 00:46 kexec_cmd