Current status: Completed. pkg_patch.tbz
Additional information:
Brainstorming for the pkg_patch project
Per the SoC proposal comment, the scope of the project has been reduced so that it does not deal with the "ABI stability" aspect of the ports tree, though resolving that aspect will probably be necessary to make binary patches actually usable.
The project contains these general steps:
- A tool to create and apply binary package diffs - the pkg_patch tool.
- An infrastructure to detect which packages installed on the local system need patching via the binary package patch mechanism - the update infrastructure.
pkg_patch
pkg_patch is the central tool for creating and applying binary package patches. It creates package patches in which each file to be patched is stored either verbatim or as a bsdiff delta.
In either case, the tool will patch files on the "live" system atomically (back up the file, copy it, patch the copy, rename the copy into place) and update the /var/db/pkg metadata along with them.
Backed up packages will be stored in /var/backups/pkg.
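The backup, copy, patch and rename sequence for a single file could look roughly like the following Python sketch. It is an illustration only, not the pkg_patch implementation: the function name `replace_file_atomically` is hypothetical, `new_data` is assumed to be the already patched file content, and the flat backup directory layout is a simplification.

```python
import os
import shutil
import tempfile

def replace_file_atomically(target, new_data, backup_dir):
    """Back up `target`, write `new_data` beside it, then rename into place."""
    os.makedirs(backup_dir, exist_ok=True)
    # Simplification: the backup is stored flat, keyed by base name only.
    backup_path = os.path.join(backup_dir, os.path.basename(target))
    shutil.copy2(target, backup_path)  # keeps mode and timestamps

    # The temporary file is created in the target's directory so that it is
    # on the same filesystem; only then is the final rename atomic.
    target_dir = os.path.dirname(os.path.abspath(target))
    fd, tmp_path = tempfile.mkstemp(dir=target_dir)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_data)
            tmp.flush()
            os.fsync(tmp.fileno())
        shutil.copymode(target, tmp_path)  # preserve permission bits
        os.replace(tmp_path, target)       # atomic rename on POSIX
    except BaseException:
        # Leave the live file untouched if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return backup_path
```

A reader of the live file sees either the old or the new content, never a partially written one. Restoring from the returned backup path is the rollback for a failed multi-file update.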
Brainstorming:
- It will be built on top of pkg_* infrastructure
- It will operate on binary .tbz package archives (performance will improve when txz gets supported in base)
- Package patch filenames in the form php-5.2.10-patch-5.2.11.tbz ?
- Patch files: binary format, with a small header and the rest of the data as a zlib stream; binary data encoded endian-neutral to allow for cross-builds of packages & patches; zlib will mostly compress metadata, as it is expected that the patch data itself will be of low entropy
- Need to think about handling packages with different cwd...
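As a sketch of the ideas in this list, the following Python fragment parses the proposed patch filename form and packs a patch as a small endian-neutral header followed by a zlib stream. Everything concrete here is an assumption made for illustration: the magic value `PKPT`, the header fields (magic, format version, uncompressed length), and the function names are hypothetical, and the filename pattern assumes that version strings contain no hyphens.

```python
import re
import struct
import zlib

MAGIC = b"PKPT"  # hypothetical magic value
# "!" selects network (big-endian) byte order with no padding, so the
# header reads the same on any build host: magic, version, payload length.
HEADER = struct.Struct("!4sHI")

NAME_RE = re.compile(r"(?P<name>.+)-(?P<old>[^-]+)-patch-(?P<new>[^-]+)\.tbz")

def parse_patch_name(filename):
    """Split e.g. php-5.2.10-patch-5.2.11.tbz into (name, old, new)."""
    m = NAME_RE.fullmatch(filename)
    if m is None:
        raise ValueError("not a package patch name: %r" % filename)
    return m.group("name"), m.group("old"), m.group("new")

def pack_patch(payload, version=1):
    """Prefix the zlib-compressed payload with the fixed-size header."""
    return HEADER.pack(MAGIC, version, len(payload)) + zlib.compress(payload)

def unpack_patch(blob):
    """Validate the header and return (version, decompressed payload)."""
    magic, version, length = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("not a package patch")
    payload = zlib.decompress(blob[HEADER.size:])
    if len(payload) != length:
        raise ValueError("payload length mismatch")
    return version, payload
```

Storing the uncompressed length in the header gives a cheap integrity check after decompression; a real format would likely also carry a checksum.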
The update infrastructure
The update infrastructure maintains an index (PKGPATCHINDEX) of the package patches created via the "mass package patch creation" feature. Clients use this index to determine which packages on their local systems need patching, and the infrastructure provides these patches to the clients.
Depending on how the "ABI stability" aspects of the ports infrastructure are handled (if at all), the binary package diffs might include only patches from a known starting version to each later version (e.g. if php5.2.10 was in 8.0-RELEASE, then the patch repository will contain patches of php5.2.10 to 5.2.11, php5.2.10 to 5.2.12) or patches between all the intermediate versions (e.g. php5.2.10 to php5.2.11, php5.2.11 to php5.2.12). As the ABI stability question is left dangling, this work will not enforce any such mode.
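Because neither mode is enforced, a client would have to cope with both. One way to do that, shown here as a hedged Python sketch, is to treat the available patches as edges between versions and search for the shortest chain from the installed version to the wanted one. The function name `patch_chain` and the representation of patches as `(old, new)` pairs are assumptions for illustration.

```python
from collections import deque

def patch_chain(patches, installed, target):
    """Return the shortest list of (old, new) patches from installed to target.

    Returns [] if nothing needs to be done and None if no chain exists.
    """
    by_old = {}
    for old, new in patches:
        by_old.setdefault(old, []).append(new)

    # Breadth-first search, so the first chain found is the shortest.
    queue = deque([(installed, [])])
    seen = {installed}
    while queue:
        version, path = queue.popleft()
        if version == target:
            return path
        for nxt in by_old.get(version, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(version, nxt)]))
    return None
```

With a repository holding patches from the starting version the search returns a single patch; with a repository holding only intermediate patches it returns the full sequence to apply in order.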
Brainstorming:
- Rely on simple text and HTTP protocols for distributing information about everything (list of updates, package patches)
- Maintain a list of patched packages with dates and versions to be compared with local state to determine what needs to be fetched and patched
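The comparison of a simple text index against local state might be sketched as below. The PKGPATCHINDEX format is not defined at this point, so the pipe-separated layout `name|old version|new version|patch file` used here is purely an assumption, as are the function names; the installed packages are represented as a plain name-to-version mapping rather than being read from /var/db/pkg.

```python
def parse_index(text):
    """Parse the assumed 'name|old|new|file' index lines into dicts."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, old, new, filename = line.split("|")
        entries.append({"name": name, "old": old, "new": new, "file": filename})
    return entries

def applicable_patches(entries, installed):
    """Keep the entries whose old version matches the installed version."""
    return [e for e in entries if installed.get(e["name"]) == e["old"]]

index_text = """\
# name|old|new|file
php|5.2.10|5.2.11|php-5.2.10-patch-5.2.11.tbz
zip|2.32|3.0|zip-2.32-patch-3.0.tbz
"""
installed = {"php": "5.2.10", "zip": "3.0"}
to_fetch = [e["file"] for e in applicable_patches(parse_index(index_text), installed)]
```

Here only the php patch is selected, since the installed zip version does not match the old version the zip patch expects. The selected file names would then be fetched over HTTP.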
The schedule
The total amount of time available for the work is 10 weeks, with the mid-term evaluation after the 6th week. The described project will proceed by the following steps:
- pkg_patch: Project start + patch generation step (for individual packages): 2 weeks
- pkg_patch: Patch application step: 2 weeks
- pkg_patch: wrapper or built-in function for generating patches to multiple packages: 1 week
- Update infrastructure: define the format of the list of updates, generate a test list: 1 week
- -- midterm --
- pkg_patch: wrapper or built-in function for fetching the list of packages needing the updates, fetching the patches and applying them: 4 weeks
Survey of other work
RPMs
- PatchRPM - contains changed files within the package, part of the regular rpm toolset, used mostly for RedHat-derived distributions (?), same file format as rpm
- DeltaRPM - contains patches to changed files (binary diffs) within the package, written in python, used in SuSE (?)
DEBs
- DebDiff - can contain both changed files and patches to changed files