Federico Ponchio

Untrunc: Fix truncated mp4

Fix a truncated mp4: do it yourself. (source code and instructions HERE)

My Samsung camera died while shooting the video of my wedding ceremony, leaving a 600MB mp4 file which no player could read. The problem is that the codec information and frame indexes were missing at the end of the mp4. The whole moov section, actually (as vlc points out and any hex editor can confirm):

[00000417] mp4 demux error: MP4 plugin discarded (no moov box)

I could not find any software to fix the problem. Some arcane parameters for mp4box or ffmpeg might work, but I was unable to find them.

I wrote a small program using Qt, ffmpeg and libfaad which rebuilds the index and, given a complete video as an example, recreates the moov atom, making the file playable again. The code runs under Linux, but it should not be difficult to port to Windows (or Mac).

There is little hope it works for you as it is, but it should not be too difficult to modify to suit your needs (provided you can code in C++ and use a hexadecimal editor).
In any case you can contact me at ponchio@gmail.com (just not to complain about the code quality :)

Source code

The source code (with instructions) is available at the untrunc project on GitHub.

Older source code (no longer updated) is available here. I recently rewrote the code into something decent.

Help / support

Did you recover your video? Help me make other people happy :)

In detail

Right now all you need is libavformat-dev, libavcodec-dev and libavutil-dev, straight from the Ubuntu repositories.

The mp4 (mpeg-4, basically a MOV from Apple) format is roughly described on wiki.multimedia.cx and on the Atomic Parsley documentation page. Atomic Parsley is a program for tagging mp4 files, but it also has the nice feature of displaying the atom structure of an mp4 (option -T). Here is the output for a sane video:

Atom ftyp @ 0 of size: 32, ends @ 32
Atom free @ 32 of size: 32736, ends @ 32768
Atom mdat @ 32768 of size: 6799742, ends @ 6832510
Atom moov @ 6832510 of size: 5901, ends @ 6838411
     Atom mvhd @ 6832518 of size: 108, ends @ 6832626
     Atom udta @ 6832626 of size: 48, ends @ 6832674
         Atom INFO @ 6832634 of size: 40, ends @ 6832674
     Atom trak @ 6832674 of size: 3035, ends @ 6835709
         Atom tkhd @ 6832682 of size: 92, ends @ 6832774
         Atom mdia @ 6832774 of size: 2935, ends @ 6835709
             Atom mdhd @ 6832782 of size: 32, ends @ 6832814
             Atom hdlr @ 6832814 of size: 64, ends @ 6832878
             Atom minf @ 6832878 of size: 2831, ends @ 6835709
                 Atom vmhd @ 6832886 of size: 20, ends @ 6832906
                 Atom dinf @ 6832906 of size: 36, ends @ 6832942
                     Atom dref @ 6832914 of size: 28, ends @ 6832942
                 Atom stbl @ 6832942 of size: 2767, ends @ 6835709
                     Atom stsd @ 6832950 of size: 155, ends @ 6833105
                         Atom avc1 @ 6832966 of size: 139, ends @ 6833105
                             Atom avcC @ 6833052 of size: 33, ends @ 6833085
                             Atom btrt @ 6833085 of size: 20, ends @ 6833105
                     Atom stts @ 6833105 of size: 24, ends @ 6833129
                     Atom stsc @ 6833129 of size: 28, ends @ 6833157
                     Atom stsz @ 6833157 of size: 1220, ends @ 6834377
                     Atom stco @ 6834377 of size: 1216, ends @ 6835593
                     Atom stss @ 6835593 of size: 116, ends @ 6835709
     Atom trak @ 6835709 of size: 2702, ends @ 6838411
         Atom tkhd @ 6835717 of size: 92, ends @ 6835809
         Atom mdia @ 6835809 of size: 2602, ends @ 6838411
             Atom mdhd @ 6835817 of size: 32, ends @ 6835849
             Atom hdlr @ 6835849 of size: 64, ends @ 6835913
             Atom minf @ 6835913 of size: 2498, ends @ 6838411
                 Atom smhd @ 6835921 of size: 16, ends @ 6835937
                 Atom dinf @ 6835937 of size: 36, ends @ 6835973
                     Atom dref @ 6835945 of size: 28, ends @ 6835973
                 Atom stbl @ 6835973 of size: 2438, ends @ 6838411
                     Atom stsd @ 6835981 of size: 102, ends @ 6836083
                         Atom mp4a @ 6835997 of size: 86, ends @ 6836083
                             Atom esds @ 6836033 of size: 50, ends @ 6836083
                     Atom stts @ 6836083 of size: 24, ends @ 6836107
                     Atom stsc @ 6836107 of size: 28, ends @ 6836135
                     Atom stsz @ 6836135 of size: 1140, ends @ 6837275
                     Atom stco @ 6837275 of size: 1136, ends @ 6838411
Atom free @ 6838411 of size: 59635, ends @ 6898046

------------------------------------------------------
Total size: 6898046 bytes; 43 atoms total. AtomicParsley version: 0.9.0 (utf8)
Media data: 6799742 bytes; 98304 bytes all other atoms (1.425% atom overhead).
Total free atom space: 92371 bytes; 1.339% waste.
------------------------------------------------------
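
To check for the missing moov without AtomicParsley, the top-level atoms can be walked by hand: each one starts with a 4-byte big-endian size followed by a 4-character type. The following is a minimal sketch under that assumption, not the code used in untrunc; `AtomInfo`, `listTopLevelAtoms` and `isMissingMoov` are hypothetical names, and the whole file is assumed to be already loaded in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One top-level atom: 4-character type, start offset and declared size.
struct AtomInfo {
    std::string type;
    std::uint64_t offset;
    std::uint64_t size;
};

// Read a 32-bit big-endian integer at pos (atom sizes are big-endian).
static std::uint32_t readBE32(const std::vector<std::uint8_t> &d, std::size_t pos) {
    assert(pos + 4 <= d.size());
    return (std::uint32_t(d[pos]) << 24) | (std::uint32_t(d[pos + 1]) << 16) |
           (std::uint32_t(d[pos + 2]) << 8) | std::uint32_t(d[pos + 3]);
}

// Walk the top-level atoms of an in-memory file (hypothetical helper).
// An atom that claims to run past the end of the data is still reported,
// so a truncated mdat shows up as the last entry.
std::vector<AtomInfo> listTopLevelAtoms(const std::vector<std::uint8_t> &data) {
    std::vector<AtomInfo> atoms;
    std::size_t pos = 0;
    while (pos + 8 <= data.size()) {
        std::uint64_t size = readBE32(data, pos);
        std::string type(data.begin() + pos + 4, data.begin() + pos + 8);
        std::uint64_t header = 8;
        if (size == 1) {                 // 64-bit size follows the type
            if (pos + 16 > data.size()) break;
            size = (std::uint64_t(readBE32(data, pos + 8)) << 32) |
                   readBE32(data, pos + 12);
            header = 16;
        } else if (size == 0) {          // atom extends to the end of the file
            size = data.size() - pos;
        }
        if (size < header) break;        // inconsistent header, give up
        atoms.push_back({type, pos, size});
        if (size > data.size() - pos) break;  // truncated atom, nothing valid after it
        pos += static_cast<std::size_t>(size);
    }
    return atoms;
}

// True when no top-level moov atom is found.
bool isMissingMoov(const std::vector<std::uint8_t> &data) {
    for (const AtomInfo &a : listTopLevelAtoms(data))
        if (a.type == "moov") return false;
    return true;
}
```

On a truncated file the last atom reported would be an mdat whose declared size does not fit in the file, and `isMissingMoov` would return true.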

When the camera is turned off while recording (for whatever reason), the entire moov atom is missing. Fixing the size of mdat and replacing the moov atom with that of another file does not work because:

  • the durations are all wrong
  • the indices (stsz and stco atoms), used to locate audio and video packets, are wrong

Some players (notably vlc) are able to rebuild the missing index provided they can guess the codecs, but since the codec information is in the moov section I had no luck. It might be possible to feed vlc only the codec info without the wrong indexes (just some hex editor work), but I did not manage to. (By the way, I used bless as a hex editor, and found it good.)

Rebuilding the indices involves guessing the start and length of each packet in the mdat atom and, following the instructions in the links above, rebuilding stco and stsz (notice that the sizes of most enclosing atoms need to be recalculated when doing this).
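
Once the packet offsets and lengths are known, writing the two tables is mostly bookkeeping. The sketch below serializes an stsz and an stco atom from such lists. It is an illustration, not the untrunc code: `buildStsz` and `buildStco` are hypothetical names, and it assumes 32-bit offsets and one packet per chunk (which matches the sample file above, where stsz and stco hold the same number of entries).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Append a 32-bit big-endian integer.
static void putBE32(Bytes &out, std::uint32_t v) {
    out.push_back(std::uint8_t(v >> 24));
    out.push_back(std::uint8_t(v >> 16));
    out.push_back(std::uint8_t(v >> 8));
    out.push_back(std::uint8_t(v));
}

// Append a 4-character atom type.
static void putType(Bytes &out, const char *type) {
    for (int i = 0; i < 4; ++i) out.push_back(std::uint8_t(type[i]));
}

// stsz: header (8) + version/flags (4) + fixed sample size (4) + count (4),
// then one 32-bit size per packet.
Bytes buildStsz(const std::vector<std::uint32_t> &sizes) {
    Bytes out;
    putBE32(out, std::uint32_t(20 + 4 * sizes.size()));
    putType(out, "stsz");
    putBE32(out, 0);                          // version and flags
    putBE32(out, 0);                          // 0 = sizes differ, table follows
    putBE32(out, std::uint32_t(sizes.size()));
    for (std::size_t i = 0; i < sizes.size(); ++i) putBE32(out, sizes[i]);
    assert(out.size() == 20 + 4 * sizes.size());
    return out;
}

// stco: header (8) + version/flags (4) + count (4), then one 32-bit
// file offset per chunk (assumed here: one packet per chunk).
Bytes buildStco(const std::vector<std::uint32_t> &offsets) {
    Bytes out;
    putBE32(out, std::uint32_t(16 + 4 * offsets.size()));
    putType(out, "stco");
    putBE32(out, 0);                          // version and flags
    putBE32(out, std::uint32_t(offsets.size()));
    for (std::size_t i = 0; i < offsets.size(); ++i) putBE32(out, offsets[i]);
    assert(out.size() == 16 + 4 * offsets.size());
    return out;
}
```

As a sanity check against the listing above, 300 entries give an stsz of 20 + 4 × 300 = 1220 bytes and an stco of 16 + 4 × 300 = 1216 bytes, the sizes AtomicParsley reports for the video track. Every enclosing atom (stbl, minf, mdia, trak, moov) then has to grow or shrink by the same amount.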

I tried some heuristics: AAC audio packets start with 0x20 and end with the bits 111 followed by padding zeroes, while h264 video packets start with 4 bytes indicating the length (most convenient) and then 0x41. But for large files there is little hope of never making a mistake.
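
The start-of-packet part of these heuristics can be sketched as below. This is only an illustration of the two byte patterns just described, with hypothetical names (`PacketKind`, `Guess`, `guessPacket`). It assumes the 4-byte video length is big-endian and does not count itself, and it does not attempt to find the end of an audio packet, which is reported with length 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

enum class PacketKind { Unknown, Audio, Video };

// Result of a guess; length is 0 when it cannot be derived from the header.
struct Guess {
    PacketKind kind;
    std::uint32_t length;
};

// Guess what starts at pos inside the mdat payload (hypothetical helper).
Guess guessPacket(const Bytes &mdat, std::size_t pos) {
    assert(pos <= mdat.size());
    const std::size_t left = mdat.size() - pos;

    // Video: 4-byte big-endian length, then 0x41. The length must be
    // non-zero and the whole packet must fit in what is left.
    if (left >= 5 && mdat[pos + 4] == 0x41) {
        std::uint32_t len = (std::uint32_t(mdat[pos]) << 24) |
                            (std::uint32_t(mdat[pos + 1]) << 16) |
                            (std::uint32_t(mdat[pos + 2]) << 8) |
                            std::uint32_t(mdat[pos + 3]);
        if (len > 0 && len <= left - 4)
            return Guess{PacketKind::Video, len + 4};
    }

    // Audio: first byte 0x20. The end of the packet is not known here.
    if (left >= 1 && mdat[pos] == 0x20)
        return Guess{PacketKind::Audio, 0};

    return Guess{PacketKind::Unknown, 0};
}
```

A scan would call `guessPacket` at the current offset, advance by the returned length for video packets, and stop on `Unknown`. Audio packets still need another way to find their length.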

Using ffmpeg and libfaad we can parse an audio packet:

int consumed = avcodec_decode_audio2(audio_c, outbuf, &outlen, (const uint8_t *)(stream.data() + offset), 0x320);
where 0x320 is just big enough that any audio packet will fit, and consumed returns the length of the packet. After that it is only a matter of saving offsets and fixing durations.

The code is quite horrible but served its purpose (recovering my beloved file in a couple of days). If you need to adapt it to your needs there are a few hacks you might want to be aware of:

  • The ffmpeg AAC decoder did not work for me (it didn't return consumed correctly), so when picking the AAC codec I pick the last one (which is libfaad), like this:
      AVCodec *audio_codec = avcodec_find_decoder(CODEC_ID_AAC);
      AVCodec *codec = audio_codec->next;
      while(codec) {
        if(codec->id == audio_c->codec_id) 
          audio_codec = codec;
        codec = codec->next;
      }
    
    I am sure there is a better way...
  • For some mysterious reason the duration is expressed as number of frames * timescale, and the timescale is set to 90000 for the video and 96000 for the audio. These numbers are hardcoded but could be read from somewhere in the structure (they might differ between cameras or settings).
  • The end of the mdat might contain junk, and currently this is not fully detected: the index creation routine exits when it detects (heuristically) that the beginning of a frame is not as expected. But I bet there are situations I did not consider which will crash the program.
  • While parsing the mp4 I cheat, parsing only the atoms I need...

The code is released under the GPL (Qt, ffmpeg). I hope someone finds it useful; drop me an email in case: ponchio@gmail.com