Digitized files from TP2 go into long-term archiving

In cooperation with the Lower Saxony project “Landesinitiative Langzeitarchivierung” (LiLa), supervised by the TIB-Leibniz Information Center, digitized material from KOSTIMA Subproject 2 (TP2) is currently being selected and prepared for long-term archiving (LTA). A number of technical aspects are being worked out; two of them are presented here as examples.

  • File sizes: Only uncompressed material is suitable for LTA. For audio material, this means WAV or BWF/BWAV files in relatively high resolution (96 kHz sampling rate, 32-bit word depth). One hour of tape material in this format generates around 2.6 gigabytes of data and is therefore poorly suited to delivery via websites. Scans of paper documents in uncompressed TIFF format also quickly add up to 150 megabytes for a single A4 page.
  • Metadata: Descriptive data on the digitized audio and image files is stored in the fylr database in a nested structure in which data records (e.g. an audio file) are linked to associated image data (photos of the audio tape) and their contents are described in as much detail as possible. Several audio tapes are usually combined into larger units. This metadata can be read by both humans and machines and is available in JSON (JavaScript Object Notation) format. Before long-term archiving, this metadata must be converted into a standardized format that can be read permanently by the long-term archiving software Rosetta from the manufacturer Ex Libris.
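
The audio figure above can be checked with a short calculation. The following is only a sketch: the channel count is not stated in the text, so stereo (2 channels) is an assumption, and the function name `pcm_size_bytes` is purely illustrative. The small WAV/BWF header is ignored.

```python
def pcm_size_bytes(sample_rate_hz: int, bit_depth: int, channels: int, seconds: int) -> int:
    """Size of the uncompressed PCM audio payload in bytes (file header ignored)."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds


# Values from the text: 96 kHz sampling rate, 32-bit word depth, one hour of tape.
# Assumption (not stated in the text): stereo, i.e. 2 channels.
one_hour = pcm_size_bytes(sample_rate_hz=96_000, bit_depth=32, channels=2, seconds=3600)

print(f"{one_hour:,} bytes")             # raw byte count
print(f"{one_hour / 2**30:.2f} GiB")     # binary gigabytes (1 GiB = 2**30 bytes)
print(f"{one_hour / 10**9:.2f} GB")      # decimal gigabytes (1 GB = 10**9 bytes)
```

Under the stereo assumption, one hour comes to roughly 2.57 GiB (about 2.76 GB in decimal units), which is consistent with the "around 2.6 gigabytes" quoted above. A mono recording with the same settings would be half that size.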

These processes were initiated a few months ago and are being carried out by the TIB in cooperation with TP2 staff, drawing on the expertise of the colleagues involved.
