Why Encoding For YouTube Sucks
This is something I’ve been meaning to post about for a while, but never really got around to until now.
First, some background. When you watch a YouTube video, how you watch it determines how good the quality will be. For example, if you watch a video in Google Chrome you will probably see a video encoded using the VP9 codec. If you watch a video in Mozilla Firefox you will probably see a video encoded using the H.264 codec.
The VP9 codec is solid and works great, and I have no complaints about it. The H.264 codec, on the other hand, can be very troublesome.
If you upload a video that is too detailed, YouTube’s H.264 version will have extremely inconsistent visuals. To demonstrate this I used a 30-second clip from the video game BeamNG.drive.
I start off with a clip encoded in a lossless format, 2.75GB in total. I then encode that lossless clip with ffmpeg at crf 18 (crf is a quality setting, not a preset). This makes the video look nearly identical to the original clip, but it reduces the file size all the way down to 356MB. I then upload that video directly to YouTube. Once on YouTube, some parts of the video look fine while other parts look significantly worse, with the quality jumping all over the place roughly every 5 seconds.
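For reference, a crf 18 encode like the one above could be scripted roughly like this. It is only a sketch: the encoder and any other options actually used aren’t stated here, so libx264 and the file names are assumptions. The second function just works out the average bitrate implied by a 356MB file that runs for 30 seconds.

```python
def crf_encode_command(src, dst, crf=18):
    """Build an ffmpeg argument list for a constant-quality (crf) encode."""
    # libx264 is an assumption; only "ffmpeg" and "crf 18" are known.
    return ["ffmpeg", "-i", src, "-c:v", "libx264", "-crf", str(crf), dst]


def average_bitrate_kbps(size_megabytes, duration_seconds):
    """Average bitrate of a file, treating 1 MB as 8000 kilobits."""
    # The result covers the whole file, so any audio track is included.
    return size_megabytes * 8 * 1000 / duration_seconds


# File names are hypothetical placeholders.
print(" ".join(crf_encode_command("lossless_clip.mkv", "crf18.mp4")))
print(round(average_bitrate_kbps(356, 30)))  # 94933
```

So the crf 18 upload averages somewhere around 95000kbps, which gives a sense of how much detail YouTube’s encoder is being asked to squeeze down.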
Now for a comparison so you can see what I am on about. I found a frame where YouTube’s H.264 encode looks reasonable, and a frame where it looks ugly.
Watching a video that fluctuates in quality this much is not a pleasant experience, so what is the solution? Encode the video at a lower bitrate so YouTube’s encoder has less detail to work with. So for this next example I encoded the video with an average bitrate of 12000kbps, and uploaded that to YouTube. And here is what it looks like after being encoded by YouTube.
As you can see above, the good-looking screenshot looks a bit worse, but the bad-looking screenshot looks significantly better. And when actually watching the video, the fluctuations in quality are minor enough that they’re usually not distracting.
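The 12000kbps upload could be produced with something like the sketch below. Again, libx264 and the file names are assumptions, and this is a plain single-pass average-bitrate encode; whether the original used one pass or two isn’t stated. The helper at the end estimates the video size that bitrate implies for a 30-second clip.

```python
def bitrate_encode_command(src, dst, kbps=12000):
    """Build an ffmpeg argument list targeting an average video bitrate."""
    # libx264 and single-pass encoding are assumptions, not confirmed settings.
    return ["ffmpeg", "-i", src, "-c:v", "libx264", "-b:v", f"{kbps}k", dst]


def expected_size_megabytes(kbps, duration_seconds):
    """Approximate video size for a given average bitrate (1 MB = 8000 kilobits)."""
    return kbps * duration_seconds / 8 / 1000


# File names are hypothetical placeholders.
print(" ".join(bitrate_encode_command("lossless_clip.mkv", "avg12000.mp4")))
print(expected_size_megabytes(12000, 30))  # 45.0
```

That works out to roughly 45MB of video for the same 30 seconds, compared to 356MB for the crf 18 file.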
I did a lot of tests, running videos from crf 18 to crf 36 and bitrates from 5000kbps to 20000kbps (at 1000kbps intervals), trying to find the highest quality video I could upload without causing dramatic fluctuations in quality. My conclusion was that crf encoding didn’t work as well: in scenes where little movement was happening, bitrate encoding looked better, and in scenes with a lot of movement the two looked about equal.
The 12000kbps figure is, of course, only relevant for this specific video game. Any other type of video I do usually gets a couple of test encodes at high quality, and I slowly tone it down until the quality looks consistent enough to be acceptable.
An unfortunate side effect of reducing the quality of the uploaded video is that the VP9 encoded video’s visuals start to suffer, because there is too little detail present in the uploaded video.
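A test sweep like the one described above (crf 18 to 36, plus 5000kbps to 20000kbps in 1000kbps steps) is tedious by hand, so here is a rough sketch of generating the commands. The crf step of 1, libx264, and the output file names are all assumptions made for illustration.

```python
def sweep_commands(src):
    """Build one ffmpeg argument list per test encode in the sweep."""
    commands = []
    # crf 18 through 36 inclusive; a step of 1 is an assumption.
    for crf in range(18, 37):
        commands.append(["ffmpeg", "-i", src, "-c:v", "libx264",
                         "-crf", str(crf), f"test_crf{crf}.mp4"])
    # 5000kbps through 20000kbps inclusive, at 1000kbps intervals.
    for kbps in range(5000, 20001, 1000):
        commands.append(["ffmpeg", "-i", src, "-c:v", "libx264",
                         "-b:v", f"{kbps}k", f"test_{kbps}k.mp4"])
    return commands


# The input file name is a hypothetical placeholder.
commands = sweep_commands("lossless_clip.mkv")
print(len(commands))  # 35
```

Each list can be handed to `subprocess.run(cmd, check=True)` to actually run the encode. Every result still has to be uploaded and eyeballed on YouTube, which is the slow part.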
It’s a really tough balancing act that you just can’t win.
I should point out most videos probably won’t have this issue. This game in particular though does a really swell job of showing the worst of YouTube’s H.264 encoder.
An alternative solution I previously toyed around with was adding motion blur to the video in post, which helped create more consistent visuals at higher bitrates, but it just didn’t look “right” when you were watching it. I might run that test again just to get viewer feedback though, since that was entirely personal opinion. Maybe it would be more “cinematic” and others would like it. I have no idea.
V2. Screenshots now work as expected.