FFV1 & RAWcooked
Data is never RAW.
Data is always cooked in one way or another.
No Time to Wait 5, December 2021
Lossless video compression format
Open source, patent free
Adopted by several archives
Frames are divided into slices, with checksums
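A minimal sketch of why per-slice checksums are useful: damage can be located in one slice instead of invalidating the whole frame. Everything here is an assumption made for illustration: the helper names are hypothetical, the frame is cut into consecutive byte ranges (real FFV1 slices are regions of the picture), and `zlib.crc32` stands in for the CRC that the FFV1 specification defines.

```python
import zlib


def split_into_slices(frame: bytes, slice_count: int) -> list[bytes]:
    """Cut a frame buffer into consecutive slices of roughly equal size.

    Illustration only: real FFV1 slices are regions of the picture.
    """
    size = max(1, -(-len(frame) // slice_count))  # ceiling division
    return [frame[i:i + size] for i in range(0, len(frame), size)]


def checksum_slices(slices: list[bytes]) -> list[int]:
    """Compute one checksum per slice (zlib CRC-32 used as a stand-in)."""
    return [zlib.crc32(s) for s in slices]


def damaged_slices(slices: list[bytes], expected: list[int]) -> list[int]:
    """Return the indexes of slices whose checksum no longer matches."""
    return [
        i for i, (s, crc) in enumerate(zip(slices, expected))
        if zlib.crc32(s) != crc
    ]
```

With 4 slices, a single flipped byte is reported as one damaged slice and the other 3 can still be trusted.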
Example with 1 second at 24 fps 10-bit HD film on a 6-core (12-thread) Skylake-X CPU:
Mainly sponsored by the PREFORMA project (2015-2017)
Review of the main (FFmpeg) encoder/decoder
Review and improvement of the pre-existing FFV1 spec draft
It is a slow process
Different people involved, lack of time
Hobby / spare time for a lot of the people involved
IETF (the standardization body) was very supportive
It is very useful!
Some bugs discovered while reviewing the code
Some clarification for corner cases
Derek Buitenhuis wrote an FFV1 decoder "for fun" (Great! Extra point of view on the spec)
FFV1 has been standardized since 2021 (IETF RFC 9043)
Big gap between the main sponsorship and the standardization!
Well... No Time To Wait?
Talk with Reto Kromer about a missing piece between archive constraints and good storage
Source DPX are often required (technical or legal constraints)
But not optimal for storage (thousands of files! Not compressed!)
Reto offered the sponsorship for a proof of concept
Well... No Time To Wait again?
Several other archives learned about the project and well, storage is costly...
These archives sponsor ($ and time) the improvements in RAWcooked
No initial sponsorship?
No further sponsorship?
--> Stronger together!
One issue was that managers don't like to pay for something freely available
We wanted to keep the project open source
We chose an intermediate solution: open source with locked binaries
Advantages of open source are still there (you can fork if you don't like our work)
DPX may have some padding bits
They are expected to be 0 (they are only there to align data on 32-bit boundaries)
FFmpeg (the encoding library) legitimately ignores them
We were also ignoring them when storing reversibility data (focus on DPX header)
If padding bits are not 0, the reversibility promise is broken
Without users doing a lot of tests, we would have missed that some scanners fill padding bits with something
--> Tests by users are important
--> Never bet on the value of an input byte, there will always be someone who decides to do something you didn't expect with it
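A minimal sketch of the kind of test that catches this, under stated assumptions: 10-bit content packed as three components per 32-bit word with the 2 padding bits in the lowest bits of each word, and the pixel data already extracted from the file. The function name and the mask are illustrative and are not taken from RAWcooked; other DPX packings put the padding elsewhere.

```python
import struct

# Assumption: the 2 lowest bits of each 32-bit word are padding.
PADDING_MASK = 0x00000003


def nonzero_padding_words(pixel_data: bytes, big_endian: bool = True) -> list[int]:
    """Return the indexes of 32-bit words whose padding bits are not 0."""
    count = len(pixel_data) // 4
    fmt = (">" if big_endian else "<") + f"{count}I"
    words = struct.unpack(fmt, pixel_data[:count * 4])
    return [i for i, word in enumerate(words) if word & PADDING_MASK]
```

An empty result means the "padding is 0" assumption holds for this file; anything else means the padding bits have to be stored with the reversibility data.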
RAWcooked uses by default Matroska attachments for storing reversibility data
Usually it is very small content (a few KB of DPX header, compressed), but sometimes (0.0001% of processed content?) it becomes big
... and we didn't test that
FFmpeg had a bug with attachments >= 1 GiB --> attachments reduced to 1 GiB, breaking the reversibility
--> Testing the reversibility of ALL files is actually needed
We added an automatic check in RAWcooked
It is 2x slower (the compressed data is decoded and the DPX files are read again)
But we were not conservative enough and our promise was not true for 100% of created files
--> Speed should not be prioritized over checking that all is fine
--> Next version of RAWcooked may have reversibility check enabled by default
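A minimal sketch of the principle behind such a check, done from the outside: restore the files to a second directory and compare every source file with its restored copy, byte for byte (via a hash). This is not RAWcooked's internal implementation; the function names and the two-directory layout are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not fill the memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def irreversible_files(source_dir: Path, restored_dir: Path) -> list[str]:
    """Return the source files that are missing or different after restore."""
    bad = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file() or file_digest(src) != file_digest(restored):
            bad.append(rel.as_posix())
    return bad
```

The check reads every file twice, which is where the extra time goes, but an empty list is the only proof that the promise holds for this package.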
FFmpeg implemented coherency checks on its input
One of these checks is that an attachment should not be >= 256 MiB
Well, it is legitimate (it avoids waiting for the network in case of false-positive probing of the format)
... But it breaks the playback of some RAWcooked files
--> Matroska attachments were helpful for a first implementation, we are now moving to appending the data to the end of the file (and we keep backward compatibility)
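A minimal sketch of the general "append to the end of the file" idea, not of RAWcooked's actual layout: the payload is written after the container, followed by a small footer that a reader can find by looking at the last bytes. The marker `REVDATA1`, the footer structure and the function names are all invented for this illustration.

```python
import struct

MAGIC = b"REVDATA1"  # hypothetical marker, not a RAWcooked identifier
FOOTER = struct.Struct(">Q8s")  # 64-bit payload length, then the marker


def append_payload(container: bytes, payload: bytes) -> bytes:
    """Return container + payload + footer describing the payload."""
    return container + payload + FOOTER.pack(len(payload), MAGIC)


def extract_payload(data: bytes):
    """Return (container, payload), or (data, None) when no footer is found."""
    if len(data) < FOOTER.size:
        return data, None
    length, magic = FOOTER.unpack(data[-FOOTER.size:])
    end = len(data) - FOOTER.size
    if magic != MAGIC or length > end:
        return data, None
    return data[:end - length], data[end - length:end]
```

A 64-bit length field means the payload size is not limited by attachment size checks, and a file without the footer is returned unchanged, which is one way to keep reading older files.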
Jérôme Martinez: firstname.lastname@example.org
License (except images): CC BY