RAWcooked
Data is never RAW.
Data is always cooked in one way or another.
Jérôme Martinez
iPRES, September 2019
MediaArea
Open source software company focused on digital media analysis. We work, with varying levels of involvement, on:
- MediaInfo
Convenient unified display of the most relevant technical and tag data for video and audio files
- MediaConch
Implementation checker, policy checker, & reporter
- QCTools
Helps users analyze and understand their digitized video files through use of audiovisual analytics and filtering
- BWF MetaEdit, AVI MetaEdit, MOV MetaEdit
Embedding, validating, and exporting of metadata
- DV Analyzer
Checking presence of technical errors in DV captures
Raw A/V files
Huge size
(4K and above is already here: up to 100 MB per frame, several TB per hour)
1 file per video frame (thousands of files in a directory)
Not playable as-is in several players (VLC...)
So many DPX or TIFF format flavors
(interoperability issues)
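As a rough check of the "several TB per hour" figure, here is a minimal sketch. The frame rate of 24 fps and the use of decimal units are assumptions; the slide gives neither.

```python
# Back-of-the-envelope data volume for uncompressed frames.
# Assumptions (not from the slide): 24 fps, decimal units (1 TB = 1,000,000 MB).
MB_PER_FRAME = 100        # upper figure quoted on the slide
FPS = 24                  # assumed frame rate
SECONDS_PER_HOUR = 3600


def terabytes_per_hour(mb_per_frame: float, fps: float) -> float:
    """Return the uncompressed data volume for one hour of footage, in TB."""
    return mb_per_frame * fps * SECONDS_PER_HOUR / 1_000_000


print(f"{terabytes_per_hour(MB_PER_FRAME, FPS):.2f} TB per hour")  # prints 8.64 TB per hour
```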
FFV1
Lossless video compression format
Open source, patent free
Adopted by several archives
Being standardized (IETF)
Frames are divided into slices, each with a checksum
Compression
Example with 1 second at 24 fps 10-bit HD film on a 6-core (12-thread) Skylake-X CPU:
- 24 DPX files (or in ZIP/TAR uncompressed): 189 MB
- 1 compressed ZIP file: 175 MB in 10 seconds
- 1 compressed LZMA2 file: 154 MB in 30 seconds
- 1 FFV1/MKV Intra 16-slice file: 105 MB in 1.5 seconds
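The figures above can be turned into compression ratios and throughput with a few lines of arithmetic. This sketch only restates the numbers from the slide; the function name `summarize` is illustrative.

```python
# Figures copied from the slide: (label, output size in MB, encoding time in seconds).
ORIGINAL_MB = 189
RESULTS = [
    ("ZIP", 175, 10.0),
    ("LZMA2", 154, 30.0),
    ("FFV1/MKV", 105, 1.5),
]


def summarize(original_mb: float, size_mb: float, seconds: float):
    """Return (compression ratio, space saved in %, input throughput in MB/s)."""
    ratio = original_mb / size_mb
    saved_pct = 100 * (1 - size_mb / original_mb)
    throughput = original_mb / seconds
    return ratio, saved_pct, throughput


for label, size_mb, seconds in RESULTS:
    ratio, saved, mbps = summarize(ORIGINAL_MB, size_mb, seconds)
    print(f"{label:9s} ratio {ratio:.2f}x  saved {saved:.0f}%  {mbps:.1f} MB/s")
```

For this one-second sample, FFV1 reaches a ratio of 1.8x (about 44% saved) while also being the fastest of the three.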
Tests on SD 10-bit content
- Sponsored by VIAA
- On several TB of MXF/JP2k/PCM content
- Different sources and kinds of material
- E5-2698V3 (16 cores+HT, 2.3-3.6 GHz)
Advantages
- Open source and free (FFmpeg)
- Package is 1.5-3x smaller than DPX/TIFF
- Checksum by "Cluster" (usually 1 second) at container level
- Checksum by "Slice" (you choose how many per frame) at video level
- Files are natively playable by several players (VLC, FFplay...)
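To illustrate why per-slice checksums are useful, here is a simplified sketch: damage can be narrowed down to one slice instead of invalidating the whole frame. It is not FFV1's actual bitstream layout. It splits a buffer into equal byte ranges and uses `zlib.crc32`, and the helper names are hypothetical.

```python
import zlib


def slice_checksums(frame: bytes, slice_size: int) -> list:
    """CRC each consecutive byte range of slice_size bytes."""
    return [zlib.crc32(frame[i:i + slice_size])
            for i in range(0, len(frame), slice_size)]


def damaged_slices(frame: bytes, slice_size: int, expected: list) -> list:
    """Return the indexes of slices whose CRC no longer matches."""
    actual = slice_checksums(frame, slice_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


frame = bytes(range(256)) * 64            # stand-in for one frame (16 KiB)
SLICE_SIZE = len(frame) // 16             # 16 slices per frame
stored = slice_checksums(frame, SLICE_SIZE)

corrupted = bytearray(frame)
corrupted[5000] ^= 0xFF                   # flip every bit of one byte
print(damaged_slices(bytes(corrupted), SLICE_SIZE, stored))  # prints [4]
```

Only slice 4 is reported as damaged, so the other 15 slices of the frame can still be trusted.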
Disadvantages of FFV1 alone
- You lose some metadata
(DPX/TIFF header: scan software, some colorimetry info, film type, DPX time code, shutter angle, gamma...)
- Complicated command line
- But...
RAWcooked
- Easy: just a short command line
"rawcooked YourDirectoryName"
- Store DPX/TIFF headers/footers in a specific Matroska attachment
- Store other sidecar files as Matroska attachments
- Output is a single Matroska/FFV1/FLAC file
- Encoding is reversible (bit-by-bit to original files)
"rawcooked YourMatroskaFileName.mkv"
Easy check of integrity
- Check if the file is healthy
"rawcooked --check YourMatroskaFileName.mkv"
- Check if DPX headers conform to the specs
"rawcooked --conch YourMatroskaFileName.mkv"
- Add error correction codes while encoding the file
e.g. with an overhead of 1.5%, you can lose 4 blocks out of every 252 without losing any content
"rawcooked --ecc YourDirectoryName"
- Fix the corrupted file
"rawcooked --fix YourMatroskaFileName.mkv"
Use case
- Archive requests digitization from its supplier
Classic workflow with the scanner
+ "rawcooked --all YourDirectoryName"
- Transport... (files about half the size, so less costly)
- Archive receives content & checks the integrity
(file health, DPX conformance...)
"rawcooked --check YourMatroskaFileName.mkv"
- Archive can visually check the content with
e.g. VLC Media Player
- Storage (cost divided by 2 due to compression)
- Revert to the exact original DPX files if someone needs them
"rawcooked YourMatroskaFileName.mkv"
Supported input formats
- DPX/Raw: 8/10/12/16 bit, RGB/RGBA
- TIFF/Raw: 16 bit, RGB
- WAV/PCM: 16/24 bit, 1/2/6 channel, 44/48/96 kHz
- AIFF/PCM: 16/24 bit, 1/2/6 channel, 44/48/96 kHz
- Based on files from our sponsors
- More formats or format flavors on request
Our sponsors
- AV Preservation by reto.ch (main sponsor)
- National Audiovisual Centre Luxembourg (CNA)
- National Library of Norway
- Irish Film Institute (IFI)
- Northwestern University Library
- National Library of Wales
- Walter J. Brown Media Archives
- The MediaPreserve
- British Film Institute
Financial sustainability
- Open source code is provided to sponsors without a lock
- Builds delivered on our website come with a lock
- DPX 8/10 bit RGB & WAV 2ch 48kHz flavors are usable by default
- We provide a key for other format flavors and features (temporary key possible)
- 1000 € for first flavor/feature
+ 500 € per additional flavor/feature
- To be compared with the storage cost savings
(storage cost divided by 2)
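The key pricing above amounts to a simple formula. A minimal sketch follows; the function name is hypothetical, and it assumes flavors and features are counted together.

```python
def key_price_eur(count: int) -> int:
    """Price in euros for a key covering `count` extra flavors/features."""
    if count < 0:
        raise ValueError("count must not be negative")
    if count == 0:
        return 0  # default flavors need no key
    return 1000 + 500 * (count - 1)


print(key_price_eur(3))  # prints 2000
```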
Potential improvements
- Speed improvement (GPU/SSE/AVX)?
- Support of reels?
- CFA/Bayer/RGGB support?
- Graphical interface?
- More input formats?