random thoughts


7 July 2020

Zero copy RPM updates


Intro

RPM packages are a basic building block of (open)SUSE’s operating systems. The methods to install and update those packages have been refined over two decades and are considered mature. With today’s copy-on-write file systems, their snapshotting capabilities, and ubiquitous internet or network access, there is potential to speed things up by rethinking and tweaking some of those methods.

Goals

For installing and updating packages this proposal wants to have

At the same time, the data needs to be deliverable via plain HTTP with no special software on the server. This is required in order to leverage the existing CDN and mirror infrastructure.

Current State

Storage on server side

Each new revision of an RPM package is stored as an individual file on the server. The payload of RPM packages is compressed.

That means

Delta RPMs

In updated RPM packages, most of the data in the payload does not actually change. For that reason delta RPMs were invented. The build system precomputes the differences between two given versions of a package to produce a binary delta. The binary delta can then be used to reconstruct the new RPM by taking the unchanged data from the installed system. The result is an RPM that is 100% identical to the full download. That RPM can then be processed by the regular rpm tool.

That means

In other words: the build system takes file system content and compresses it to produce an rpm package. The makedeltarpm tool takes two of those packages, decompresses them, compares the content, and produces a compressed delta rpm. The applydeltarpm tool then takes the delta rpm, decompresses it, and combines the delta information with data from files on the installed system. The payload created that way is compressed again to write an rpm package. Then, upon actual installation, the rpm command decompresses the rpm to write the content to the file system. So the payload gets compressed three times and decompressed three times.

RPM installation

In order to install or update RPM packages, the target system has to download the RPMs in question. To prevent inconsistencies due to unreliable internet connections, usually all packages are downloaded prior to installing them.

Installing packages means the rpm tool decompresses the payload and writes the files under temporary names next to the already existing ones. After all files are placed on disk, they are renamed to their final names.
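The write-then-rename step can be illustrated for a single file with a minimal Python sketch. This is not rpm’s actual code: the function name `install_file` and the temporary naming scheme are made up for illustration, and a real installer would also set ownership, mode, and other attributes, and would rename only after all files of the transaction were written.

```python
import os
import tempfile


def install_file(final_path: str, data: bytes) -> None:
    """Write data under a temporary name next to final_path, then rename it."""
    directory = os.path.dirname(final_path) or "."
    # Hypothetical temporary name; it lives in the same directory so that the
    # rename below stays within one file system.
    prefix = os.path.basename(final_path) + ";"
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=prefix)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk first
        # Atomically replaces an existing file of the same name on POSIX.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Do not leave a half-written temporary file behind.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The point of the sketch is that every byte of every file gets written again, even if the previous version of the file on disk had identical content.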

That means:

Ideas

Optimizing Server Side Storage

RPM payload could be seen as a serialized representation of a file system tree:

rpm

Every such representation is stored separately. Even if the data of just one file changed slightly, the whole payload has to be stored again:

rpm2

Instead of looking at the payload as a concatenation of files, one could also look at it as a concatenation of data chunks. Instead of a file name those chunks could be identified by a hash of the data they contain. So the payload would turn into an index of data chunks.

rpm2

That’s what casync does. It uses content-defined chunking to determine the chunk boundaries dynamically.
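To make the idea of content-defined chunking more concrete, here is a minimal Python sketch. It is not casync’s algorithm or on-disk format: it uses a simple “gear” style rolling hash, SHA-256 as chunk identifier, and tiny chunk sizes, all of which are assumptions chosen for illustration. The names `split_chunks` and `make_index` are hypothetical.

```python
import hashlib
import random

# Table of 256 pseudo-random 64-bit values; the fixed seed makes sure every
# run (and every machine) uses the same table.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1


def split_chunks(data: bytes, avg_bits: int = 6,
                 min_size: int = 64, max_size: int = 1024) -> list:
    """Split data at positions chosen by a rolling hash of the content."""
    # A boundary is declared when the top avg_bits bits of the hash are zero.
    boundary_mask = ((1 << avg_bits) - 1) << (64 - avg_bits)
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Older bytes are shifted out after 64 steps, so the hash only
        # depends on the most recent 64 bytes.
        h = ((h << 1) + GEAR[byte]) & MASK64
        size = i + 1 - start
        if (size >= min_size and (h & boundary_mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # remainder without a boundary
    return chunks


def make_index(chunks: list) -> list:
    """The index is the ordered list of chunk hashes."""
    return [hashlib.sha256(c).hexdigest() for c in chunks]
```

Because boundaries depend on the bytes themselves rather than on fixed offsets, inserting or removing data in one place only changes the chunks around that place. Chunks before the edit stay identical, and chunks after it typically line up again after a short distance.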

That means if RPMs had an uncompressed payload, one could actually store only the data chunks, plus an index that tells casync how to stitch the chunks together to recreate the original rpm. So when the server has to keep several versions of an only slightly changed package, this method would save space, because chunks shared between versions are stored only once.

Storing chunks in separate files still allows fetching them via plain HTTP GET requests and allows mirroring of chunks.
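As a small sketch of why no special server software is needed: if every chunk is stored in a file named after its hash, the client can compute the URL of a chunk on its own. The layout below (a subdirectory named after the first four hex digits, a `.chunk` suffix, SHA-256 as hash) and the function name `chunk_url` are assumptions for illustration, not a description of an existing repository format.

```python
import hashlib


def chunk_url(base_url: str, chunk_hash: str) -> str:
    """Map a chunk hash to the URL of a static file on a mirror."""
    # Fan out into subdirectories so no single directory grows too large.
    return f"{base_url.rstrip('/')}/{chunk_hash[:4]}/{chunk_hash}.chunk"


def chunk_hash_of(data: bytes) -> str:
    """Chunks are identified by a hash of their content."""
    return hashlib.sha256(data).hexdigest()
```

Since the file name is derived from the content, a chunk never changes once published, which makes such files easy to mirror and to cache.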

Reassembling an RPM on client side

With the aforementioned storage on the server, a client could download the index file, followed by all required data chunks, to recreate the rpm locally. Since most files didn’t change much anyway, the majority of the data is already available locally, so the client can borrow those chunks without having to download them. For this method it does not matter if a non-config file got modified locally behind rpm’s back: chunks that no longer match are simply downloaded. So the delta download would be dynamic, based on what data is actually present locally.

rpm-rsyncfromserver
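The client-side logic can be sketched in a few lines of Python. This is an illustration under assumptions, not casync’s or zypp’s implementation: the index is taken to be an ordered list of SHA-256 chunk hashes, `local` is a dictionary of chunks already found on the installed system, and `fetch` is a hypothetical callback that downloads one chunk by its hash.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


def chunk_id(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def reassemble(index: List[str], local: Dict[str, bytes],
               fetch: Callable[[str], bytes]) -> Tuple[bytes, List[str]]:
    """Rebuild the payload described by index.

    Returns the payload and the list of chunk hashes that had to be
    downloaded because they were not available locally.
    """
    available = dict(local)  # do not modify the caller's dictionary
    downloaded = []
    parts = []
    for cid in index:
        data = available.get(cid)
        if data is None:
            data = fetch(cid)
            # Verify the download; the hash doubles as a checksum.
            if chunk_id(data) != cid:
                raise ValueError(f"chunk {cid} is corrupt")
            available[cid] = data  # a repeated chunk is fetched only once
            downloaded.append(cid)
        parts.append(data)
    return b"".join(parts), downloaded
```

Only the chunks that are missing or no longer match locally end up in the download list, which is what makes the delta dynamic.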

At this point the client would have a regular RPM file that could be installed using rpm as usual. The download handling could be implemented in the layer on top, i.e. zypp or dnf.

A slight improvement to reduce disk writes could be made using the %_minimize_writes setting.

On filesystems that support it, reflinks could be used for preparing the new rpm payload without actually copying data.
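For readers unfamiliar with reflinks, the following Python sketch shows the basic operation on Linux: the `FICLONE` ioctl makes the destination file share the data blocks of the source file instead of copying them. The sketch clones a whole file, which is simpler than what is needed to stitch file data into an rpm payload (that would require cloning byte ranges). The function name `reflink_or_copy` and the fallback to a regular copy are choices made for this illustration.

```python
import fcntl
import shutil

# Value of FICLONE from <linux/fs.h>: _IOW(0x94, 9, int)
FICLONE = 0x40049409


def reflink_or_copy(src: str, dst: str) -> bool:
    """Make dst share the data blocks of src if the file system allows it.

    Returns True if a reflink was created, False if the data was copied.
    """
    with open(src, "rb") as s, open(dst, "wb") as d:
        try:
            fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
            return True
        except OSError:
            # Not supported here, e.g. file system without reflink support
            # or source and destination on different file systems.
            pass
    shutil.copyfile(src, dst)
    return False
```

On btrfs or XFS with reflink support the call returns almost immediately regardless of file size, since only metadata is written.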

Tuning rpm

So far, installing an rpm would still work the traditional way, i.e. at least the changed files would have to be copied.

On filesystems that support it, rpm could be extended to use reflinks to place the file content in the file system without actually copying data out of the rpm file. So basically the reverse operation of assembling the new rpm.

rpm-install1

Dealing with the RPM header

The RPM header can be quite big and hasn’t been discussed so far. Normally the rpm header of an installed package is added to a database, which means it is copied as well. Database APIs abstract away where the actual data is stored, so using the reflink method won’t work to avoid duplication.

However, the rpm header could simply be considered part of the installed files of a package, for example by explicitly or implicitly having a %ghost entry for the header in the file list, e.g. /usr/lib/sysimage/rpm/%{NEVRA}. The RPM package installation could install the header of a package there. The presence of the header would then tell that a package is installed.

A side effect of that would be that no “database” is needed anymore. The canonical source of truth would be the installed headers. For efficient rpm queries etc., some cached indexes would still be required.
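A minimal Python sketch of that idea, assuming one header file per installed package named after its NEVRA in a single directory: listing the directory answers “what is installed”, and a lookup index by package name can be rebuilt from it at any time. The function names are hypothetical, and the sketch only looks at file names, not at the header content.

```python
import os
from typing import Dict, List


def installed_packages(header_dir: str) -> List[str]:
    """Every header file present means the package is installed."""
    return sorted(e.name for e in os.scandir(header_dir) if e.is_file())


def build_name_index(header_dir: str) -> Dict[str, List[str]]:
    """Rebuild a cache that maps package names to installed NEVRAs."""
    index: Dict[str, List[str]] = {}
    for nevra in installed_packages(header_dir):
        # NEVRA is name-[epoch:]version-release.arch. Version and release
        # never contain a dash, so the name is everything before the last
        # two dashes.
        name = nevra.rsplit("-", 2)[0]
        index.setdefault(name, []).append(nevra)
    return index
```

Such an index is purely a cache: if it is lost or stale, it can be thrown away and regenerated from the header files.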

Package manager integration

zypp-integration

Assuming the package manager is zypp, it would create the to-be-installed rpm packages in /var/cache/zypp and then reflink the content into the system. However, /var/cache is normally a separate subvolume and therefore also a separate mount point, and reflinks can’t cross mount points.

To make that work, there are two options:

  1. use a temporary mount of the top level subvolume and operate on its subdirectories to avoid crossing mount point boundaries.
  2. re-create the rpm directly in /usr/lib/sysimage/rpm/. So not only the header would be stored there but actually the full content.

Option 2) seems more straightforward. Also, all of the system would be self-contained in /usr.

one-usr-tree.png

Now with this solution everything is in one place. The whole system would unfold from /usr/lib/sysimage/rpm/. The data would have to be written to disk only once; installation is just metadata updates. All in-place updates are actually safe on a transactional system where modifications are applied in a CoW snapshot.

