One of the challenges with Y'CbCr (what Carmack is calling "YUV") is that there are so many flavors.
He mentions 4:2:0 chroma subsampling. But he doesn't mention chroma siting. Or alternative subsampling schemes. Or matrix coefficients. Or full-range vs video-range (a.k.a. JPEG vs MPEG range). Heck, even how you arrange subsampled data varies by system (many libraries like planar; Apple likes bi-planar; etc.).
I'd love to see more support for rendering subsampled Y'CbCr formats so you don't have to use so much RAM, but it gets complicated quick.
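To make the RAM argument concrete, here's the back-of-the-envelope math for a 4K frame (illustrative numbers, assuming 8-bit samples):

```python
# Memory footprint of one 4K frame: RGBA vs planar/bi-planar 4:2:0.
w, h = 3840, 2160
rgba = w * h * 4                             # 4 bytes per pixel
yuv420 = w * h + 2 * (w // 2) * (h // 2)     # full-res Y + quarter-res Cb and Cr
print(f"RGBA:   {rgba / 2**20:.1f} MiB")     # ~31.6 MiB
print(f"4:2:0:  {yuv420 / 2**20:.1f} MiB")   # ~11.9 MiB
```

That's 1.5 bytes per pixel instead of 4, a ~62% saving, before you even touch compression.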
I don't think he's asking for video cards to natively handle JPEG images. ISTM what he is advocating for is keeping the JPEG decompressed in memory in a YUV format rather than RGB. The savings come from the fact that the UV parts are downsampled, not from any sort of compression.
So since you already have to process the image, it doesn't seem like a big ask to convert from JPEG-flavored YUV to GPU-flavored YUV. But I'm not an expert, so maybe this is hard/lossy?
I'm fully aware that he's not talking about decoding JPEG images, and instead is talking about keeping Y'CbCr formats in memory and rendering with Y'CbCr.
It's not lossy. But Y'CbCr -> RGB is not as simple as you might think; my whole post was about Y'CbCr -> RGB conversion. It's doable; I'm not saying it's impossible. There are just several flavors of Y'CbCr, and correctly handling all of them (or at least the most common ones) gets tedious.
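As one example of a "flavor", here's a sketch of the conversion for just one combination (BT.601 matrix, video/"limited" range). JPEG-style full-range data would skip the range expansion and is otherwise handled differently, which is exactly the kind of variation that piles up:

```python
def bt601_video_range_to_rgb(y, cb, cr):
    """One flavor among many: BT.601 matrix, video ("limited") range.
    Y' is coded 16..235 and Cb/Cr 16..240; full-range (JPEG) data
    would skip the expansion step below."""
    y = (y - 16) * 255.0 / 219.0        # expand luma to 0..255
    cb = (cb - 128) * 255.0 / 224.0     # center and expand chroma
    cr = (cr - 128) * 255.0 / 224.0
    r = y + 1.402 * cr                  # BT.601 coefficients
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    clamp = lambda v: max(0, min(255, round(v)))
    return clamp(r), clamp(g), clamp(b)

print(bt601_video_range_to_rgb(235, 128, 128))  # (255, 255, 255) -- video-range white
print(bt601_video_range_to_rgb(16, 128, 128))   # (0, 0, 0) -- video-range black
```

Feed full-range data through this (or vice versa) and you get crushed blacks or washed-out grays, which is why the metadata matters.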
Why does the GPU need to handle more than one flavor of Y'CbCr? Why can't the jpeg/PNG/whatever decoder be relied on to convert whatever flavor it uses to the Direct X/OpenGL flavors?
And the pipeline is able to convert it to RGB fine, so isn't the information available to use? Why would it be a challenge passing along that information?
The information is there (sometimes... it's amazing how many times you have to make a "reasonable" guess when decoding images/video because metadata is missing). Passing it along is feasible. But it does complicate things (especially the implementation) pretty quickly.
A lot of Y'CbCr -> RGB converters actually disagree with each other. They're all close enough that casual users don't notice or care about the small discrepancies.
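A lot of that disagreement comes from the matrix coefficients alone. A toy illustration (full-range math, hypothetical sample values) of how the same encoded pixel decodes differently under BT.601 vs BT.709 coefficients:

```python
def ycbcr_to_rgb(y, cb, cr, kr, kb):
    """Full-range Y'CbCr -> R'G'B' for a given coefficient pair (Kr, Kb)."""
    cb, cr = cb - 128, cr - 128
    r = y + 2 * (1 - kr) * cr
    b = y + 2 * (1 - kb) * cb
    g = (y - kr * r - kb * b) / (1 - kr - kb)
    return round(r), round(g), round(b)

# The same encoded pixel, decoded with two common coefficient sets:
px = (120, 90, 200)
print(ycbcr_to_rgb(*px, kr=0.299, kb=0.114))    # BT.601: (221, 82, 53)
print(ycbcr_to_rgb(*px, kr=0.2126, kb=0.0722))  # BT.709: (233, 93, 49)
```

Both results are "correct" for their respective standard; pick the wrong one and every color is slightly off, usually not enough for a casual viewer to notice.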
OBS uses the NV12 variant of YUV by default for its rendering pipeline, which is supported as a GPU pixel format by Direct3D. OBS has written various shaders to convert into and out of NV12 on the GPU.
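For anyone unfamiliar with NV12: it's a bi-planar 4:2:0 layout, a full-resolution Y plane followed by a single half-resolution plane of interleaved Cb/Cr pairs. A toy sketch of the memory layout (not OBS's actual code):

```python
# NV12 layout for a tiny 4x2 image.
w, h = 4, 2
y_plane = bytes(range(w * h))                          # 8 luma samples, full res
uv_plane = bytes([128, 128] * ((w // 2) * (h // 2)))   # 2 interleaved Cb/Cr pairs
nv12 = y_plane + uv_plane                              # Y plane, then CbCr plane

assert len(nv12) == w * h * 3 // 2                     # 1.5 bytes per pixel
```

Contrast with fully planar formats (e.g. I420), which store Cb and Cr as two separate planes; same samples, different arrangement, different shader addressing.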
Woah, easy there. I have no qualms with what John said. I don't expect someone to delve into all the complexities on Twitter. Overall I agree with John, and I wish we had more ways to natively support Y'CbCr.
The point of my post was to help casual readers know that this gets complicated fast. Someone might think defining FMT_JPEG_YUV is easier and simpler than it actually is. I don't fault John for that. It's not a fault of anyone, really.
Oh neat, I didn't know about that one! I've never used Vulkan but looking at the API it does indeed cover the most common things. I wish we had more rendering APIs with that kind of support; I'd love to be able to use it.