Kernel Dump Regression Testing

Student: Ritika Gupta (ritikagupta1998@gmail.com)
Mentor: Mark Johnston (<markj AT SPAMFREE freebsd DOT org>)

Project description

A kernel crash dump is a portion of the contents of volatile memory (RAM) that is copied to disk whenever the execution of the kernel is disrupted. The following events can cause a kernel disruption:

Kernel Panic
Non-Maskable Interrupts (NMI)
Machine Check Exceptions (MCE)
Hardware failure
Manual intervention

Crash dumps are used to find the cause of a kernel crash, so an automated test suite for them needs to be written in FreeBSD. A fully automated test harness enables techniques that are impractical when testing manually, which makes debugging and problem identification easier. Many configuration variables can affect the kernel dump, and each of the events listed above takes a different code path to produce one. With the help of this test suite, we can set these variables up differently and analyze the results. The aim of the project is to create a framework for recovering and testing kernel dumps across different sets of configuration variables and code paths, logging every test case separately in a log file so that the code paths that failed can be analyzed afterwards.

Deliverables

Deliverables for first evaluation:
1. A module written in Python that takes hardware/VM configuration variables and kernel dump configuration variables as input and will:
  - Generate the VM configuration
  - Launch the VM
  - Configure the kernel dumper
  - Debug the kernel dump and verify whether the dump is valid
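As a rough illustration of the "generate VM configuration" step, the sketch below turns a set of VM variables into bhyveload(8) and bhyve(8) command lines. The VMConfig class, its field names and the PCI slot layout are assumptions made for this example only; they are not taken from the project's code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VMConfig:
    """Hypothetical container for the hardware/VM variables of one run."""
    name: str
    cpus: int
    memory: str       # passed to -m unchanged, e.g. "1G"
    disk_image: str   # path to a FreeBSD guest disk image
    tap_device: str   # host tap(4) interface backing the guest NIC


def bhyveload_argv(cfg: VMConfig) -> List[str]:
    """Command that loads the FreeBSD guest kernel before bhyve runs."""
    return ["bhyveload", "-m", cfg.memory, "-d", cfg.disk_image, cfg.name]


def bhyve_argv(cfg: VMConfig) -> List[str]:
    """Command that runs the guest with its serial console on stdio."""
    return [
        "bhyve",
        "-c", str(cfg.cpus),
        "-m", cfg.memory,
        "-H", "-A", "-P",
        # The slot numbers below are an arbitrary choice for this sketch.
        "-s", "0:0,hostbridge",
        "-s", "1:0,lpc",
        "-s", f"2:0,virtio-net,{cfg.tap_device}",
        "-s", f"3:0,virtio-blk,{cfg.disk_image}",
        "-l", "com1,stdio",
        cfg.name,
    ]
```

Returning argument lists rather than running the commands keeps the configuration step easy to inspect and to log before a VM is actually launched.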
Deliverables for second evaluation:
1. A comprehensive, automated test suite that is able to test the kernel dumps generated for a combination of configuration variables, both locally and over a network.
2. Configure a netdump server on the host machine or a second VM and start dumping over a network
3. Identify the set of test cases to be tested
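On the guest side, both local dumps and netdump are configured with dumpon(8). The following sketch builds the dumpon command line for either case. The function name and its parameters are hypothetical, and only the -c, -s, -g, -z and -Z options of dumpon(8) are assumed.

```python
from typing import List, Optional


def dumpon_argv(target: str,
                client: Optional[str] = None,
                server: Optional[str] = None,
                gateway: Optional[str] = None,
                compression: Optional[str] = None) -> List[str]:
    """Build a dumpon(8) command line.

    target is a dump device for a local dump, or a network interface
    when server is given (netdump).
    """
    argv = ["dumpon"]
    if compression == "gzip":
        argv.append("-z")
    elif compression == "zstd":
        argv.append("-Z")
    elif compression is not None:
        raise ValueError(f"unknown compression: {compression}")

    if server is not None:
        # netdump needs the client address as well as the server address.
        if client is None:
            raise ValueError("netdump requires a client address")
        if gateway is not None:
            argv += ["-g", gateway]
        argv += ["-c", client, "-s", server]

    argv.append(target)
    return argv
```

For example, dumpon_argv("vtnet0", client="10.0.0.2", server="10.0.0.1") describes a netdump setup, while dumpon_argv("/dev/ada0p2") describes a local dump device. The addresses and device names here are placeholders.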
Deliverables for last evaluation:
1. Carry out rigorous testing for every combination of the variables and log the failures to a separate file.
2. Analyze the code paths which failed
3. If time permits, a Python gdb script will be written that performs a set of checks and verifies that they run successfully.
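Testing every combination of the variables and logging the failures separately could look roughly like the sketch below. The function names, the run_case callback and the JSON-lines log format are assumptions for illustration; the real variables and log layout are still to be decided.

```python
import itertools
import json
from typing import Callable, Dict, Iterator, List, Sequence


def combinations(variables: Dict[str, Sequence]) -> Iterator[dict]:
    """Yield one dict per combination of the given variable values."""
    names = list(variables)
    for values in itertools.product(*(variables[n] for n in names)):
        yield dict(zip(names, values))


def run_matrix(variables: Dict[str, Sequence],
               run_case: Callable[[dict], bool],
               failure_log: str) -> List[dict]:
    """Run every combination and record the failing ones in failure_log.

    run_case is a hypothetical callback that returns True when the
    combination produced a valid dump.
    """
    failures = []
    with open(failure_log, "w") as log:
        for case in combinations(variables):
            try:
                ok = bool(run_case(case))
                error = None
            except Exception as exc:  # a crashing case counts as a failure
                ok = False
                error = repr(exc)
            if not ok:
                failures.append(case)
                log.write(json.dumps({"case": case, "error": error}) + "\n")
    return failures
```

Writing one JSON object per failing case keeps the log easy to read back when analyzing which code paths failed.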

Milestones

Start	End	Task
4 May	1 June	Set up the dedicated hardware for the project, interact with the FreeBSD community and gather its input
1 June	15 June	Start working on the module, which is expected to use bhyve to host a VM with the necessary specifications
15 June	29 June	Start using pexpect to interact with the VM’s console and provide the necessary values for the configuration variables
29 June	3 July	Improve the code and check for any backlogs, document everything
		Phase 1 Evaluation
3 July	10 July	Configure a netdump server on the host machine and start dumping manually over a network
10 July	27 July	Bring everything together and be ready with the test suite and ensure that every combination can be tested. Ensure that any failure is logged in a separate log file
		Phase 2 Evaluation
31 July	24 Aug	Testing phase: test different combinations of the configuration variables with the help of atf(7) and kyua(1), identify the cases that do not produce a valid dump in basic testing and log them to a separate file. Analyze the cases where the dump produced was not valid.
24 Aug	31 Aug	Refine the code to make it more presentable, add final touches to the documentation and prepare to submit the project
		Final Evaluation

Test Plan

Debugging Kernel Dump
The kgdb(1) utility is a debugger based on gdb that allows debugging of kernel core files.
Example: kgdb /boot/kernel/kernel /var/crash/vmcore.0
Kernel binary - /boot/kernel/kernel
Dump path - /var/crash/vmcore.0
If kgdb is able to open the dump successfully, the dump is considered to be valid. We can then run “bt” to print a backtrace as a further check. This method of checking kernel dumps is considered sufficient for now; more memory-based checks can be added later. The dump can be analyzed either inside the VM or collected on the host and analyzed there. Analyzing it in the VM seems easier.
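The check described above could be automated along the following lines. This is only a sketch: it assumes a kgdb that accepts gdb's -batch and -ex options, and it treats any numbered frame line in the output as evidence of a backtrace. The function names are hypothetical.

```python
import re
import subprocess
from typing import List

# gdb prints backtrace frames as lines starting with "#0", "#1", ...
FRAME_RE = re.compile(r"^#\d+\s", re.MULTILINE)


def kgdb_argv(kernel: str, vmcore: str) -> List[str]:
    """Non-interactive kgdb invocation that prints a backtrace and exits."""
    return ["kgdb", "-batch", "-ex", "bt", kernel, vmcore]


def has_backtrace(output: str) -> bool:
    """Return True if the debugger output contains at least one frame."""
    return FRAME_RE.search(output) is not None


def dump_is_valid(kernel: str, vmcore: str, timeout: int = 300) -> bool:
    """Open the dump with kgdb and report whether a backtrace was printed."""
    try:
        proc = subprocess.run(kgdb_argv(kernel, vmcore),
                              capture_output=True, text=True,
                              timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # kgdb is missing or hung; treat the dump as not verified.
        return False
    return proc.returncode == 0 and has_backtrace(proc.stdout)
```

With the paths from the example above, the call would be dump_is_valid("/boot/kernel/kernel", "/var/crash/vmcore.0").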
About atf(7)
atf(7) is an automated testing framework in which tests can be written as shell scripts or as C code. In this project, most of the tests will be written as shell scripts, since atf-sh(3) is better suited to testing an entire application than atf-c, which is mostly used for testing libraries or kernel-based APIs. The test programs rely on the kyua(1) engine, which is responsible for isolating each test program from the rest of the system and for cleaning up its effects. Requirements of a test case, such as needing root privileges, are declared in its head function, for example with atf_set require.user root. The atf-test-case(4) manual page contains a complete list of such options. One of the most useful helper functions provided by atf(7) for checking results is atf_check. It executes a command and analyzes its results, including the exit code, stdout and stderr. This will help us in logging our test results. After the tests have been written, they can be built by creating a Makefile. The build creates a Kyuafile in the directory, which can be used to run the tests in a deterministic fashion.

The Code

https://github.com/ritika98/freebsd

Useful links

CategoryGsoc

SummerOfCode2020Projects/KernelDumpRegressionTesting (last edited 2020-05-15T01:48:49+0000 by MarkLinimon)