A story: STARDIVA®
Jérôme Martinez
No Time to Wait 6, October 2022
Proprietary software was used to record events with up to 8 simultaneous interpretations.
Content playable only on Windows by a modified version of VLC.
Metadata readable only on Windows by StarDiva tools.
VLC is open source. And copyleft.
So the source code is available, respecting the license?
Well... No.
Location, date, time, pauses, speaker.
Only one tool is able to read them.
No automation, no export: you are locked in.
NSV (Nullsoft Streaming Video)
This format is documented, yeah!
Well... No.
NSV is 1 video and 1 audio only.
But we expect 8 audio tracks.
NSV has a spec for metadata.
But no NSV metadata there.
It is slow, with no guarantee of result, but no other choice was found.
Content looked easy:
AVC for video and AAC for audio.
Well... Was not so easy.
Metadata looked difficult:
completely opaque bytes.
Well... Was even more difficult than anticipated.
AVC at 25 fps in raw stream but stored as 23 fps in container.
We demux it and we fix the raw stream.
Key frame every 10 seconds.
But video stream can start without key frame.
Up to 10 first seconds of video are lost forever.
(and it is not worth it to retrieve some macroblocks)
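Finding where the decodable video starts can be sketched as below. This is only an illustration: it assumes the demuxed AVC stream is in Annex B byte-stream form (NAL units separated by 00 00 01 start codes), which the slides do not state, and the function name is hypothetical.

```python
# NAL unit types from the H.264 specification.
NAL_IDR = 5                      # key frame slice
NAL_PREFIX_TYPES = {6, 7, 8, 9}  # SEI, SPS, PPS, access unit delimiter


def first_decodable_offset(data: bytes) -> int:
    """Return the offset where decoding can start, or -1 if no key frame exists.

    The offset points at the first start code of the run of parameter sets
    (SPS/PPS/SEI/AUD) directly preceding the first IDR slice, or at the IDR
    start code itself when no such run exists.
    Assumption: Annex B framing. With 4-byte start codes the returned offset
    skips the leading zero byte, which is harmless.
    """
    candidate = None
    pos = 0
    while True:
        pos = data.find(b"\x00\x00\x01", pos)
        if pos < 0 or pos + 3 >= len(data):
            return -1
        nal_type = data[pos + 3] & 0x1F
        if nal_type == NAL_IDR:
            return pos if candidate is None else candidate
        if nal_type in NAL_PREFIX_TYPES:
            if candidate is None:
                candidate = pos
        else:
            # A non-key slice: everything seen so far is not decodable.
            candidate = None
        pos += 3
```

Everything before the returned offset is the part that is lost when the stream starts without a key frame.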
AAC 8-channel with custom channel mapping and rare AAC features.
FFmpeg fails to decode them.
Fortunately FAAD does decode them,
if we hack the AAC bitstream a bit (channel mapping),
and if we discard buggy frames (otherwise the decoder stops).
2 possibilities:
- improve FFmpeg playback: awful hacks needed to show 8 tracks, support only in FFmpeg-based players, AAC decoder to improve too.
- transcoding: more versatile, less work, but reencoding means some quality loss.
Decision: decode, split 8-channel to 8-track, reencode.
Silent tracks are discarded
(Thanks to FFmpeg astats RMS level/peak).
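The split and silence check can be sketched in plain Python as below. The project used FFmpeg astats for this; the code is a stand-in to show the idea, and the function names, the 16-bit full scale and the -60 dBFS threshold are assumptions, not values from the project.

```python
import math


def split_channels(interleaved, channels):
    """Split interleaved PCM samples into one list per channel."""
    return [interleaved[c::channels] for c in range(channels)]


def rms_dbfs(samples, full_scale=32768.0):
    """RMS level in dB relative to full scale (-inf for digital silence)."""
    if not samples:
        return -math.inf
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return -math.inf
    return 10 * math.log10(mean_square / (full_scale * full_scale))


def non_silent_tracks(interleaved, channels=8, threshold_db=-60.0):
    """Return {channel index: samples} for channels louder than the threshold.

    threshold_db is a hypothetical value chosen for this sketch.
    """
    tracks = split_channels(interleaved, channels)
    return {i: t for i, t in enumerate(tracks) if rms_dbfs(t) > threshold_db}
```

A threshold slightly above digital silence is used rather than an exact zero test, so that a channel carrying only low-level noise is also dropped.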
Playable by any player supporting MKV+AVC+AAC.
We lose audio quality, but this is mitigated by using fdk_aac HE-AAC at the same bitrate.
Needed a lot of work to guess the logic of the tool.
Also needed to understand the bugs in the files :-D.
Not all is understood, but we have what we wanted.
Converted to Matroska chapters.
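Writing the recovered events as a Matroska chapters XML file (the format accepted by mkvmerge) can be sketched as below. The event list of (start in seconds, title) pairs is a hypothetical structure, since the proprietary metadata layout is not described here, and the "und" language default is an assumption.

```python
import xml.etree.ElementTree as ET


def format_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS.mmm for ChapterTimeStart."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def build_chapters_xml(events, language="und"):
    """Build a chapters XML string from (start_seconds, title) pairs.

    `events` is a hypothetical input shape used only for this sketch.
    """
    chapters = ET.Element("Chapters")
    edition = ET.SubElement(chapters, "EditionEntry")
    for start, title in events:
        atom = ET.SubElement(edition, "ChapterAtom")
        ET.SubElement(atom, "ChapterTimeStart").text = format_timestamp(start)
        display = ET.SubElement(atom, "ChapterDisplay")
        ET.SubElement(display, "ChapterString").text = title
        ET.SubElement(display, "ChapterLanguage").text = language
    return ET.tostring(chapters, encoding="unicode")
```

For example, `build_chapters_xml([(0, "Speaker A"), (3725.5, "Pause")])` gives one chapter per event, which a muxer can then attach to the MKV file.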
80% of files were well reverse engineered in 20% of the time spent on this project.
We had to process all files several times (over several days) to be sure that our algorithm works everywhere.
Never underestimate the time spent on corner cases.
Estimation of work effort is often based on the idea of having good files.
99% of files are usually good.
You don't know the quality of your files before you try to do QA on them.
Found ~1% of files with bad content.
These files usually had 0.01% of audio packets missing or corrupted.
Such data is lost forever, replaced by silence.
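Replacing lost packets by silence, so that audio stays in sync with video, can be sketched as below. This is not the LeaveSD code: the function name is hypothetical, a lost packet is represented by `None`, and 1024 samples per channel per frame is an assumption (the usual AAC-LC frame size).

```python
def fill_gaps(frames, samples_per_frame=1024, channels=8):
    """Concatenate decoded frames, inserting silence for lost ones.

    `frames` is a list of interleaved sample lists, with None marking a
    missing or undecodable packet (hypothetical representation).
    Returns (samples, number_of_lost_frames) so losses can be reported.
    """
    silence = [0] * (samples_per_frame * channels)
    samples = []
    lost = 0
    for frame in frames:
        if frame is None:
            samples.extend(silence)
            lost += 1
        else:
            samples.extend(frame)
    return samples, lost
```

Counting the lost frames alongside the output makes it possible to report which files have damaged audio, rather than hiding the loss.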
A few files are totally undecodable: all is lost. Now you know.
MediaInfo: metadata readout, demux.
FAAD: audio decode.
FFmpeg: channel split, silent track detection.
FFmpeg+fdk_aac: HE-AAC encode.
MKVToolnix: mux with chapters, bitrate stats.
LeaveSD: dedicated command-line tool for automation, fixing buggy packets, tweaking the other tools' command lines, creating the chapters XML, reporting.
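As an illustration of the kind of command line such automation has to build, the sketch below assembles an FFmpeg call that splits one 8-channel file into 8 mono files with the `channelsplit` filter. It is not the actual LeaveSD command: the helper name, the output file pattern and the 7.1 channel layout are assumptions.

```python
def channel_split_command(src, out_pattern="track{index}.wav",
                          channels=8, layout="7.1"):
    """Build (but do not run) an ffmpeg argv splitting `src` into mono files.

    Assumptions: `layout` describes the decoded audio, and its channel
    count equals `channels`. The output pattern is hypothetical.
    """
    labels = [f"c{i}" for i in range(channels)]
    graph = f"channelsplit=channel_layout={layout}" + "".join(
        f"[{label}]" for label in labels
    )
    cmd = ["ffmpeg", "-i", src, "-filter_complex", graph]
    for index, label in enumerate(labels):
        cmd += ["-map", f"[{label}]", out_pattern.format(index=index)]
    return cmd
```

Returning the argument list instead of running it keeps the sketch testable; it could be passed to `subprocess.run` once the layout assumption has been checked on real files.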
It is not only about "just a quick transcoding".
Buying proprietary software has a long-term cost.
You have no idea about the quality of your files before you analyze them.
No time to wait to check the state of your files.
A developer quotes a high cost for such work?
Well, experience...
We want to share our experience: code is open source.
MediaArea: https://mediaarea.net, @MediaArea_net
LeaveSD page: https://MediaArea.net/LeaveSD
Jérôme Martinez: jerome@mediaarea.net
Slides: https://MediaArea.net/Events
License (except images): CC BY