Posted July 21, 2021

I was trying to find a video about AI generating realistic images from a tunnel-vision source of a car driving through canyons to make a point, but I couldn't find it.

Edit: Forgot the link. https://www.youtube.com/watch?v=uZt8sVvYQVg
The first thing I notice is that frames 555 to 574 show quite a bit of blurring between the lip and the chin. Meanwhile, the parts that aren't moving (or aren't moving much) are incredibly crisp. Frames 589 to 630 show it really well on the forehead. Now, to understand what's going on here, there are two things you need to know: first, video compression is about throwing detail away from sections of an image, not about adding detail back in. Second, there's more than one video compressor being used on this video.
The way a video compressor generally works comes down to picking several "key frames"; the frames after each key frame, rather than being full frames, only contain the bits that change. I don't happen to have a tool handy that I know how to use to show this clearly, but I think you can see it very plainly yourself. If you look at the "LOKI" to his right (your left), you'll notice it doesn't move. Neither does the wall, the dirt on the wall, the CB, the Disney Plus logo, etc. Since they don't move, and since the non-moving pixels make up the majority of each frame, what typically happens is that you have a "key frame" which contains a full frame, and then the several frames after it (until the next key frame) only contain the data for whatever changed since the previous frame. Because well over 50% of the data from frame to frame isn't changing, on frame data alone you can cut the size between key frames roughly in half just by reducing the number of key frames and storing only the changes in between.

The next bit is cutting down on the information between those key frames. Not only do you have a recording-framerate issue (blurring happens as a result of movement, but since certain parts aren't moving, they stay clear), you can further cut down on the size of the frames between key frames by using really, really lossy compression (blurring) so the colors in the blurry bits become more similar, and therefore easier to compress. Saying something like 88 99 72 90 27 90 19 40 17 is bulky, but if blurring turns that into 87 87 87 87 87 87 87 87 87, you can just say "87 x 9" and it takes much, much less space (a simplification, since real codecs use more complex algorithms than repeating a single byte; see the sketch below). Since our eyes are already used to this, it's not as noticeable.
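To make that concrete, here's a rough toy sketch in Python (one-dimensional made-up "frames", nothing to do with how a real codec like H.264 actually does it): a full key frame, a delta that only stores the pixels that changed, and a run-length encoder for the "87 x 9" trick.

# A minimal sketch of the two ideas above, using made-up 1-D "frames":
# store a full key frame, store only the changed pixels for the frames
# after it, and run-length encode runs of identical (blurred) values.

def delta_frame(prev, curr):
    """Keep only the pixels that changed since the previous frame,
    as (index, new_value) pairs."""
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]

def run_length_encode(values):
    """Collapse runs of identical values into (value, count) pairs,
    e.g. nine 87s become (87, 9)."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]

# Toy 16-pixel frames: the "wall" (left half) never changes,
# only the "face" (right half) moves a little each frame.
key_frame  = [87] * 8 + [88, 99, 72, 90, 27, 90, 19, 40]
next_frame = [87] * 8 + [88, 99, 75, 93, 27, 90, 19, 40]

print("key frame (RLE'd):  ", run_length_encode(key_frame))
print("delta to next frame:", delta_frame(key_frame, next_frame))

Run it and the delta for the second frame comes out as just two (index, value) pairs instead of sixteen pixels, which is the whole point of only re-sending what moved.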
Now, here's the rub. The Zoom and Skype video codecs tend to be really, really lossy, and that's the first filter you see. Then, on top of that, we have whatever codec OBS (or whatever the interviewer is using to record the screen) applies. Then, what's more, the video editing program she's using is going to do this yet again. Now, since I downloaded it, I can see what the metadata of the file is, which says... "handler_name : ISO Media file produced by Google Inc." That, in turn, tells me that Google threw another compression pass on top of the three already in play. And that's assuming the camera did no compression of its own. What you see as weird is that the compressors didn't blur anything other than the movement (the parts that change between key frames), so the whole scenery, and even the parts of him that barely move, look really, really crisp, while he himself looks like a smudge. Given that he's the focal point, not the scenery, it looks really, really uncanny to you.
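If you want a feel for why stacking four lossy passes matters, here's another toy sketch in Python. The quantization steps and the pipeline names are just my guesses standing in for each encoder (real codecs are far smarter than rounding pixel values), but the effect is the same: every extra pass throws away a bit more, and the error against the original keeps climbing.

# A minimal sketch of "generation loss": each lossy pass rounds the pixel
# values to a coarser step (a crude stand-in for one encoder's quantization),
# and the damage accumulates as more encoders are stacked on top.

def lossy_pass(pixels, step):
    """Round every value to the nearest multiple of 'step'."""
    return [round(p / step) * step for p in pixels]

original = [88, 99, 72, 90, 27, 90, 19, 40, 17]  # same toy row as above

frame = original
# call codec -> screen recorder -> editor export -> YouTube re-encode
# (hypothetical step sizes, just to illustrate the stacking)
for name, step in [("call codec", 8), ("screen recorder", 12),
                   ("editor export", 16), ("YouTube re-encode", 24)]:
    frame = lossy_pass(frame, step)
    error = sum(abs(a - b) for a, b in zip(original, frame))
    print(f"after {name:18s}: {frame}  (total error {error})")

On these toy numbers the total error grows every single pass, and in the real video that accumulated loss lands almost entirely on the parts that move, which is exactly the smudge you're seeing on his face.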
Post edited July 21, 2021 by kohlrak