New Google AI Image Upscaling Makes Science-Fiction a Reality

September 2nd, 2021

The “enhance” feature of sci-fi and police procedurals could soon become reality. Researchers from Google’s Brain Team recently published an article titled “High Fidelity Image Generation Using Diffusion Models” on the Google AI Blog. In it, they claim to upscale images 4 to 8 times with neural networks that are trained using image noise as a foundation. Let’s take a look at what the age of the geek has in store for us!

The Google AI Blog article, published by Research Scientist Jonathan Ho and Software Engineer Chitwan Saharia, goes in-depth on their approach. While diffusion models were first introduced in 2015, they were shelved to give other ideas a chance to flourish. But like the famous tortoise from children’s books, slow and steady might have won the race. Conceptually, the result is no different from the superscale feature in DaVinci Resolve or Nvidia’s DLSS resolution enhancement. But what’s under the hood is a completely different story. While still in early development, this new AI photo upscaling approach could become an incredible resource, not only for filmmakers but for photographers and game developers alike.

Talk nerdy to me

Let’s break down what the Brain Team has done in detail. Jonathan and Chitwan’s approach combined different training methods to work their magic. Initially, they used something called Super-Resolution via Repeated Refinement (SR3), a super-resolution diffusion model that takes a low-resolution image as input and builds a matching high-resolution image starting from pure noise. This model is trained using an image corruption process.

AI Photo Upscaling From Noise – Before. Image Credit: Google AI Blog

How does that work? Well, the team took a high-resolution image and progressively added noise until only pure noise remained. They then trained a neural network to reverse that process and recover the initial image. Finally, the trained network is applied to 64×64 pixel images to create superscaled versions. Jonathan and Chitwan then chained such models in a stack to improve the results further. By stacking a 64×64 → 256×256 model with a 256×256 → 1024×1024 model, they were able to achieve impressive upscaling results.
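To make the "add noise until only pure noise remains" step more concrete, here is a minimal sketch of the noising half of the process in plain Python. It assumes the standard diffusion formulation, in which a noisy image is a weighted mix of the original and Gaussian noise. The linear schedule, its start and end values, the 1,000 steps, the toy four-pixel "image" and all function names are illustrative assumptions, not details taken from the Google AI Blog article. The trained network that reverses the process is not shown.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances rising linearly (illustrative values, assumed)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def signal_fraction(betas):
    """Running product of (1 - beta): how much of the original signal survives."""
    fractions, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        fractions.append(running)
    return fractions

def add_noise(pixels, alpha_bar, rng):
    """Mix the image with Gaussian noise: sqrt(a) * x + sqrt(1 - a) * noise."""
    keep = math.sqrt(alpha_bar)
    spread = math.sqrt(1.0 - alpha_bar)
    return [keep * p + spread * rng.gauss(0.0, 1.0) for p in pixels]

rng = random.Random(0)
image = [0.8, -0.3, 0.5, 0.1]  # toy "image", pixel values scaled to [-1, 1]
alpha_bars = signal_fraction(linear_beta_schedule(1000))

slightly_noisy = add_noise(image, alpha_bars[0], rng)      # first step: barely changed
almost_pure_noise = add_noise(image, alpha_bars[-1], rng)  # last step: image is gone

print("signal kept after first step:", alpha_bars[0])
print("signal kept after last step:", alpha_bars[-1])
```

With these assumed settings, almost all of the signal survives the first step and practically none survives the last one, which is the "pure noise" end state the network learns to walk back from.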

AI Photo Upscaling From Noise – After. Image Credit: Google AI Blog

Alchemy for the rest of us

But Jonathan and Chitwan weren’t finished yet and went beyond their initial SR3 process with Cascaded Diffusion Models (CDM). These models are trained on ImageNet data to generate high-resolution natural images. By using data from ImageNet, a large visual database designed for use in visual object recognition software research, the team was able to create an upscaling model that further enhanced their SR3 approach.
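The cascading idea, where each model's output becomes the next model's input, can be sketched with simple bookkeeping. The example below reuses the 64×64 → 256×256 → 1024×1024 stages mentioned earlier in the article. Each stage here is only a placeholder that repeats pixels; in the real system each stage would be a trained diffusion model. The function names are hypothetical.

```python
def nearest_neighbor_upscale(image, factor):
    """Placeholder stage: repeat every pixel `factor` times in both directions."""
    upscaled = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        upscaled.extend(list(wide_row) for _ in range(factor))
    return upscaled

def run_cascade(image, factors):
    """Feed each stage's output into the next stage, recording sizes on the way."""
    sizes = [(len(image), len(image[0]))]
    for factor in factors:
        image = nearest_neighbor_upscale(image, factor)
        sizes.append((len(image), len(image[0])))
    return image, sizes

# Toy 64x64 grayscale input; two 4x stages stand in for the two stacked models.
low_res = [[(x + y) % 256 for x in range(64)] for y in range(64)]
high_res, sizes = run_cascade(low_res, factors=[4, 4])
print(sizes)  # [(64, 64), (256, 256), (1024, 1024)]
```

Splitting the job into stages means no single model has to bridge the whole gap from 64 to 1024 pixels at once.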

Upscaling with Cascaded Diffusion Models. Image Credit: Google AI Blog

Real-world applications

By using SR3 and CDM models to upscale images, the Brain Team at Google has created a state-of-the-art approach that could reduce the need for high-resolution source images. Thankfully, this new AI photo upscaling process won’t affect Blackmagic Design or RED for now. But it is amazing to see such technology on the horizon. In other words, don’t go selling your shiny new gear just yet.

This approach could be a perfect fit for security cameras. It could also be used in post-production pipelines, not only for single images but also for video and computer graphics. Nvidia is already doing something similar with its DLSS technology, which can output 4K from a sub-HD render, reducing the strain on the GPU. Future cameras could record 1080p images with better color reproduction and dynamic range, and the footage could then be upscaled to 4K in post. Even Netflix isn’t safe: its 4K mandate could one day be a thing of the past. But that could be decades down the road. Until then, crank up that resolution and peep those pixels.

Still science fiction: This kind of stuff from “Enemy of the State” (1998).

What do you think of this research? How do you think it could alter technology in the future? Let us know in the comments!
