Previous | Contents | Next

Chapter 7: Maintenance

There are few optional tasks that need to be executed by the administrator from time to time or during the initial configuration.

7.1 Cache cleanup

If a package is no longer downloadable by APT clients then its files are also not referenced in any index file and can be removed. This rule also applies to most volatile files from the distribution metadata. For example, Debian's Release file references some Packages and Sources files or Diff-Index file, and those do reference most other non-volatile files (binary packages, source packages, index diffs, ...).

7.1.1 Manual expiration

To run the cleanup action manually visit the report page in a browser and trigger the Expiration operation there.

There are different flags configuring the parameters of this tracking described below. Usually just the filename is sufficient to consider a file in the cache as a valid (downloadable) file. This is ok in most cases but sometimes leads to false positives, i.e. when another repository in the cache refers to a file with the same name but the reference to the original location is gone. On the other hand there can be cases where the assignment to different repositories happened by mistake and administrator would like to merge repositories later on.

For most files the checksum values are also provided in the index files and so the file contents can be validated as well. This requires reading of the whole cache archive to generate local checksums. It should also not be done when apt-cacher-ng is being used (file locking is not used here).

Usually it's necessary to bring various index files (Release,Sources,Packages,Index) in sync with the repository. This is necessary because apt works around the whole file download by fetching small patches for the original file, and this mode of operation is not supported yet by apt-cacher-ng (and might still be unreliable). When this synchronization fails, the index files might be incomplete or obsolete or damaged, and they might no longer contain references to some files in the cache. Abortion of the cleanup process is advisable in this case.

There is also a precaution mechanism designed to prevent the destruction of cache contents when some volatile index files have been lost temporarily. The results of cache examination are stored in a list with the date when the particular files became orphaned. The removals are only executed after few days (configurable, see configuration file) unless they are removed from this list in the meantime.

Parameters of Expiration:

Skip header checks
By default, header description file of every package is opened and checked for bad data and for obvious inconsistencies (like local file being larger than specified by server). Which means opening reading a few kilobytes from disk for almost every file in the cache, and slightly degrades performance of the process. This option skips that basic checks.
Stop cleanup on errors during index update step
Index files update is done first, on errors the expiration will be interrupted.
Validate by file name AND file directory
This option can be used to remove distribution stages. Example: to remove "oldstable" one just needs to delete the "Release" files in the cache and run Expiration with this option two times. There are some issues with this mode operation, see above for details.
Validate by file name AND file contents (through checksum)
Checking file size and contents where possible against the metadata in the index files. Note: future calls of Expiration process without this option will discard the results of this check and forget about corrupted files. Therefore, an action on this files needs to be done ASAP, like truncating them (see below) or removing via the removal operation (using the checkbox and the Delete button, see process output) or via the "Delete all unreferenced files" operation on the main control page.
Force the download of index files
Sometimes it may be needed to redownload all index files, explicitly replacing the cached versions. This flag enables this behaviour.
Purge unreferenced files after scan
Avoid the use of the orphan list and delete files instead. This option is dangerous and should not be used unless when absolutely no mistakes/problems can happen. Instead, it's possible to view the orphan list later and delete then (see control web interface).
Truncate damaged files immediately
If a file has been identified as damaged, it will be truncated (file size reset to 0). Setting this option is a good compromise for debugging purposes compared to the simple deletion since it will keep the header files on the disk, for further analysis of the problem's cause.
More verbosity
Shows more information, e.g. each scanned file when used with some of the other options. This might result in a very large HTML page, making the watching HTML browser very slow.

In additional to the default scan run, there are some "Direct Action" buttons in the Web frontend. It's possible to see the temporary list of files that have been identified as orphaned (unreferenced), and it's possible to delete all files from that list immediately. To be used carefully!

7.1.2 Automated cache cleanup

A script called expire-caller.pl is shipped with the package. This script effectively implements a HTTP client which operates like a human would do when running the expiration manually (see above). It can also extract the operator password and unix socket file path from the local configuration file. On Debian installations it is called by the file /etc/cron.daily/apt-cacher-ng so it should run automatically as daily cron task. The results are usually not reported unless an error occurs, in which case some hints are written to the standard error output (i.e. sent in cron mails).

The operator script can take some options from the environment, see below. The default operation mode is calling the expiration operation with default parameters and with credentials from local system's apt-cacher-ng installation. However, this can be changed with ACNGREQ variable.

DEBUG=1
If set to non-empty and not 0, the temporary HTML output is reported to the console. For debugging purposes only.
ACNGIP=10.0.1.3
The network address for remote connection may be guessed incorrectly by the operator script. This variable can specify an explicit target to connect to, e.g. the same IP as the one used by the clients (unless this network connection is somehow restricted in the local setup).
HOSTNAME=localOrPublicName
When an error occurs, the operator script most likely adds an URL to be opened for further investigation. The host name of in this URL can be customized, i.e. can be set to a public domain name representing the server as accessible from the administrator's machine.
ACNGREQ=cgiparameters
Override the auto-detected command parameters with a custom set. This is the part of a command URL from the management interface after the ? sign.

7.1.3 Keeping latest versions of expired package files

Sometimes it makes sense to keep a couple of versions of (Debian) packages even after they have been removed from remote source. It is possible to set an exceptional rule for package files which follow the naming and versioning scheme of .deb-packages. This extra handling is configured by the KeepExtraVersions options which tells how many of the top-latest versions shall be kept. The cache system needs the dpkg program and sufficient CPU power (depending on the option value).

7.2 Removal of distribution releases

Sometimes it's needed to remove all files from a distribution, i.e. when a new release became Stable and older package files are still lying around. In perfect conditions the reference tracking described above should take care of it and remove them soon.

However, this solution will fail if the release files are still available on the server AND apt-cacher-ng learned their real location (i.e. the code name instead of not the release state name) and so they are refreshed during regular expiration.

After all, if the old release is no longer used by local cache users then the extra disk usage becomes a problem. This problem will go away after many months when the old release files are finally deleted on the servers, then the package expiration will start complaining for some days (the expiration delay) and only then the finally unreferenced files will be removed.

To speed up this process, the local administrator can remove the traces of the old distribution release from the archive. Either the top-level "Release" files, or even the whole index file trees relevant for certain releases.

To make this task easier, a "brutal" script called distkill.pl is shipped with apt-cacher-ng. It runs interactively, it scans the package directory and presents an overview of index file trees assumed to represent distro releases. Then it provides a command promt to remove some immediately. The script should be used with extreme care! See section 8.2 for example of its output.


Comments to blade@debian.org
[Eduard Bloch, Sat, 08 Oct 2011 23:18:17 +0200]