rcorder
An experimental enhancement to the rcorder utility and associated changes to /etc/rc to leverage this feature to reduce startup times (and possibly suspend / resume if /etc/rc.resume is included in scope).
Origins
Luke Mewburn's original proposal for /etc/rc in 1999.
https://groups.google.com/g/mailing.netbsd.tech.userlevel/c/O4GxRawRPAw/m/eOuO3C1Hi7EJ?pli=1
http://www.mewburn.net/luke/papers/rc.d.pdf
Other Attempts
https://github.com/kil/rcorder
https://reviews.freebsd.org/D2102
https://github.com/buganini/rcexecr
https://reviews.freebsd.org/D3715
https://github.com/ultijlam/rcorder.sh
https://github.com/ngie-eign/rcorder3
The most recent attempt, authored by Boris Lytochkin, has been committed:
https://reviews.freebsd.org/D25389
In short, D25389 uses "-p" to enable concurrency, is based on the original rcorder, and does not have any unit tests. With a minor change, the rc script edits below could easily work with the D25389 rcorder.
Summary
The enhancement is a new option, "-p". When set, rcorder may return more than one RC script path per line. The standard /etc/rc script expects one path per line, so if /etc/rc is modified to pass this option, it must also be modified to split each output line into its individual script paths.
Three or four rc scripts contain code similar to the following:
    files=`rcorder /etc/rc.d/* ${local_rc} 2>/dev/null`
    for _rc_elem in $files; do
        run_rc_script ${_rc_elem} ${_boot}
    done
${_rc_elem} is the full path to the daemon script (usually found under /etc/rc.d/* or /usr/local/etc/rc.d/*). The script path is passed to run_rc_script which is described in rc.subr.
The for-loop is modified like so:
    files=`rcorder /etc/rc.d/* ${local_rc} ${rc_parallel} 2>/dev/null`
    IFS=$'\n'
    for _rc_group in $files; do
        IFS=$' '
        for _rc_elem in $_rc_group; do
            run_rc_script ${_rc_elem} ${_boot} &
        done
        wait
        IFS=$'\n'
    done
This also works with the D25389 rcorder (see link above).
The run_rc_script line is now run as a background task. The wait command waits for the group of tasks to complete before continuing.
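The grouping behaviour can be tried outside of /etc/rc with a minimal sketch like the one below. It is not the actual patch: run_rc_script here is a hypothetical stub standing in for the rc.subr function, the sample paths are made up, and a literal newline is used for IFS instead of $'\n' so the sketch also runs in shells without that syntax.

```shell
#!/bin/sh
# Hypothetical stand-in for run_rc_script from rc.subr: it only reports the path.
run_rc_script() {
    echo "start $1"
}

# run_groups TEXT: treat each line of TEXT as one group of script paths.
# Scripts within a group run in the background; the next group starts
# only after every script of the current group has finished.
run_groups() {
    _old_ifs=$IFS
    IFS='
'
    for _rc_group in $1; do
        IFS=' '
        for _rc_elem in $_rc_group; do
            run_rc_script "$_rc_elem" &
        done
        wait
        echo "group done"
        IFS='
'
    done
    IFS=$_old_ifs
}

# Made-up sample in the shape of "rcorder -p" output: three groups, four scripts.
sample='/etc/rc.d/a
/etc/rc.d/b /etc/rc.d/c
/etc/rc.d/d'

run_groups "$sample"
```

The order of the two "start" lines inside the second group may vary from run to run, which is the same effect that reorders log output when scripts start concurrently.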
If rc_parallel_start is not specified in /etc/rc.conf, the script defaults to running RC scripts one-at-a-time. Example:
rc_parallel_start="YES"
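One way the rc.conf knob could be turned into the rcorder flag is sketched below. This is an assumption about the wiring, not the committed code: parallel_flag is a hypothetical helper, and the accepted spellings are meant to mirror what checkyesno in rc.subr treats as affirmative.

```shell
#!/bin/sh
# parallel_flag VALUE: print "-p" when VALUE is an affirmative setting,
# and an empty line otherwise. An unset or empty VALUE counts as NO.
parallel_flag() {
    case "${1:-NO}" in
        [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1)
            echo "-p"
            ;;
        *)
            echo ""
            ;;
    esac
}

# rc_parallel_start would normally come from /etc/rc.conf; here it is
# whatever the environment provides, defaulting to unset.
rc_parallel=$(parallel_flag "${rc_parallel_start:-}")
echo "rcorder flag: '${rc_parallel}'"
```

With the variable unset, rc_parallel stays empty and the loop degenerates to the one-at-a-time behaviour described above.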
Test Run
The changes were tested on modest hardware (Core 2 Duo with spinning disk). A system booting from an SSD, or a diskless system, may give wildly different results.
The table below shows that enabling parallelism saves time only once more than one slow service has to start.
| /etc/rc.conf | rc_parallel_start=NO (seconds) | rc_parallel_start=YES (seconds) | Savings (seconds) |
| system with minimal services | 8 | 8 | 0 |
| system with 1 extra service | 13 | 13 | 0 |
| system with 2 extra services | 17 | 13 | 5 |
| system with 3 extra services | 23 | 13 | 10 |
| system with 4 extra services | 28 | 13 | 15 |
| system with 5 extra services | 33 | 13 | 20 |
The "extra service" is a placebo script that does nothing but sleep for five seconds - a way to simulate a busy service startup.
Savings reported here were 0 to 20 seconds. Notice there was no time savings until more than one service was added. Your system's benefit will vary.
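The shape of these numbers can be reproduced with a small idealized model. It is only a sketch under stated assumptions: the base time of 8 seconds is taken from the "minimal services" row, each placebo adds exactly 5 seconds, and all placebo scripts are assumed to land in the same rcorder group so their sleeps overlap completely. The function names are hypothetical.

```shell
#!/bin/sh
# Idealized boot-time model (an assumption, not how the measurements were taken).
base=8      # seconds, from the "minimal services" row
delay=5     # seconds each placebo script sleeps

# sequential_time N: every extra service adds its full delay.
sequential_time() {
    echo $(( base + $1 * delay ))
}

# parallel_time N: extra services overlap, so at most one delay is paid.
parallel_time() {
    if [ "$1" -gt 0 ]; then
        echo $(( base + delay ))
    else
        echo "$base"
    fi
}

n=0
while [ "$n" -le 5 ]; do
    seq_t=$(sequential_time "$n")
    par_t=$(parallel_time "$n")
    echo "$n extra: sequential=${seq_t}s parallel=${par_t}s savings=$(( seq_t - par_t ))s"
    n=$(( n + 1 ))
done
```

The measured figures track this model to within a second, which suggests the placebo sleeps dominate the difference between the two configurations.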
Not bad for a relatively small change.
Test Plan
The above tests validate the feature as a proof of concept. What follows is a plan for a real-world test on two physical x64 systems running 12.2-RELEASE with a patched /etc/rc and the rcorder from 13.0-BETAx (soon to be -RELEASE).
asterisk
mongodb
nginx
openldap
postgres
salt-minion
sendmail (base)
A salt-master monitors both machines to validate success.
Acceptance criteria:
1. Compare /var/log/messages from the two machines for equivalent output with no errors. Expect the log output to be in a different order. What's important is that each daemon starts successfully.
2. Each daemon answers to a client. A client API error is actually a success for us (daemon works as designed).
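For the first criterion, an order-insensitive comparison could look like the sketch below. It assumes the traditional syslog line layout ("Mmm dd hh:mm:ss host tag[pid]: message") and only compares which programs logged something, not what they logged, so it is a first pass rather than a full check for errors. daemon_tags is a hypothetical helper and the log lines are made up.

```shell
#!/bin/sh
# daemon_tags FILE: print the sorted, unique program names found in the
# fifth field of each syslog line, with any "[pid]" and trailing ":" removed.
daemon_tags() {
    awk '{ tag = $5; sub(/\[[0-9]+\]/, "", tag); sub(/:$/, "", tag); print tag }' "$1" | sort -u
}

# Made-up sample logs from two hosts; same daemons, different order and PIDs.
log_a=$(mktemp)
log_b=$(mktemp)
cat > "$log_a" <<'EOF'
Mar  1 10:00:01 hosta nginx[812]: started
Mar  1 10:00:02 hosta slapd[840]: started
EOF
cat > "$log_b" <<'EOF'
Mar  1 10:00:01 hostb slapd[455]: started
Mar  1 10:00:01 hostb nginx[460]: started
EOF

if [ "$(daemon_tags "$log_a")" = "$(daemon_tags "$log_b")" ]; then
    echo "same daemons logged"
else
    echo "daemon sets differ"
fi
rm -f "$log_a" "$log_b"
```

On the test machines the two arguments would be copies of /var/log/messages from each host, trimmed to the boot in question.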
I'm interested in the boot phase so configuration on each of these services will be minimal.
Code
Note: this patch is now in CURRENT along with a critical bug fix. Thanks to Cy Schubert for fixing this so quickly.
Edit: the patch was later backed out after more problems were reported. This little project is now on hold until I can understand how and why. Bourne shell scripting looks easy - until it isn't.
Build
The new rcorder is available in CURRENT. For 12.X-RELEASE, you can copy the rcorder source files to your /usr/src tree and build only rcorder.
To deploy rcorder on your system, copy the rcorder binary to /sbin/ - AFTER you have made a backup of the standard rcorder.
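The backup-then-copy step can be wrapped in a small helper so the pristine file is never lost. The sketch below is one possible approach, not a prescribed procedure: install_with_backup and the ".original" suffix are illustrative choices, and the demonstration runs in a scratch directory rather than touching /sbin.

```shell
#!/bin/sh
# install_with_backup NEW TARGET: keep TARGET as TARGET.original, then copy
# NEW over TARGET. An existing backup is left alone so reruns keep the
# pristine copy.
install_with_backup() {
    new=$1
    target=$2
    if [ -e "${target}.original" ]; then
        echo "backup ${target}.original already exists, not overwriting" >&2
    else
        cp -p "$target" "${target}.original" || return 1
    fi
    cp -p "$new" "$target"
}

# Demonstration with made-up file contents in a scratch directory.
dir=$(mktemp -d)
echo "old" > "$dir/rcorder"
echo "new" > "$dir/rcorder.new"
install_with_backup "$dir/rcorder.new" "$dir/rcorder"
cat "$dir/rcorder.original" "$dir/rcorder"
rm -rf "$dir"
```

On a real system the call would be run as root with the freshly built binary as the first argument and /sbin/rcorder as the second.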
Deploy
To test rcorder, try it with and without the "-p" option:
    $ rcorder /etc/rc.d/*
    $ rcorder -p /etc/rc.d/*
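Beyond eyeballing the two listings, a quick sanity check is that both name the same set of scripts and differ only in grouping. The sketch below assumes paths on a "-p" line are separated by spaces, as the modified loop in /etc/rc expects. normalize and same_scripts are hypothetical helpers, and the sample data is made up.

```shell
#!/bin/sh
# normalize TEXT: turn rcorder-style output (one or many paths per line)
# into a sorted list with one path per line.
normalize() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed '/^$/d' | sort
}

# same_scripts A B: succeed when both outputs name the same set of scripts.
same_scripts() {
    [ "$(normalize "$1")" = "$(normalize "$2")" ]
}

# Made-up samples in the shape of serial and grouped output.
serial='/etc/rc.d/a
/etc/rc.d/b
/etc/rc.d/c'
grouped='/etc/rc.d/a
/etc/rc.d/b /etc/rc.d/c'

if same_scripts "$serial" "$grouped"; then
    echo "same set of scripts"
else
    echo "script sets differ"
fi
```

On a patched system the two arguments would be the captured output of the two rcorder commands above.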
To deploy the RC scripts, make a backup of /etc/rc (for example, copy it as /etc/rc.original) and apply the patch to /etc/rc.
To verify, reboot. Your system should start up as before - no faster or slower.
To turn on concurrent tasking, add an entry to /etc/rc.conf:
rc_parallel_start=YES
Reboot again. This time startup should be faster than before, provided more than one slow service is enabled.
Limitations
This won't make jails start concurrently. You can use rc.conf(5) variable jail_parallel_start="YES" to enable concurrent jail startup.
Reference
https://en.m.wikipedia.org/wiki/Topological_sorting