Debarshi's den

GNOME Photos: an overview of zooming


I was recently asked about how zooming works in GNOME Photos, and given that I spent an inordinate amount of time getting the details right, I thought I should write it down. Feel free to read and comment, or you can also happily ignore it.

Smooth zooming

One thing that I really wanted from the beginning was smooth zooming. When the user clicks one of the zoom buttons or presses a keyboard shortcut, the displayed image should smoothly flow in and out instead of jumping to the final zoom level, similar to the way the image smoothly shrinks to make way for the palette when editing, and expands outwards once done. See this animated mock-up from Jimmac to get an idea.

For the zooming to be smooth, we need to generate a number of intermediate zoom levels to fill out the frames in the animation. We have to dish out something in the ballpark of sixty different levels every second to be perceived as smooth because that’s the rate at which most displays refresh their screens. This would have been easier with the 5 to 20 megapixel images generated by smart-phones and consumer-grade digital SLRs; but just because we want things to be sleek, it doesn’t mean we want to limit ourselves to the ordinary! There is high-end equipment out there producing images in excess of a hundred megapixels and we want to robustly handle those too.

Downscaling by large factors is tricky. When we are aiming to generate sixty frames per second, there’s less than 16.67 milliseconds for each intermediate zoom level. All it takes is a zoom factor just large enough to stress the CPU and main memory past that budget, and the animation breaks. That’s a lot more likely to happen than a pathological case that crashes the process or brings the system to a halt.
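To make the arithmetic concrete, here is a minimal sketch in plain C of the frame budget and of how the intermediate zoom levels could be spread across an animation. The function names, the fixed 60 Hz rate and the linear interpolation are assumptions for illustration only; this is not how Photos itself schedules its frames.

```c
#include <stddef.h>

/* Assumed refresh rate; real displays vary. */
#define FRAMES_PER_SECOND 60.0

/* Time available to produce one frame, in milliseconds. */
static double
frame_budget_ms (void)
{
  return 1000.0 / FRAMES_PER_SECOND;
}

/* Number of frames in an animation of the given length,
 * rounded to the nearest whole frame. */
static size_t
frame_count (double duration_ms)
{
  return (size_t) (duration_ms / frame_budget_ms () + 0.5);
}

/* Zoom level for frame i out of n, where frame n is the final one.
 * Plain linear interpolation (hypothetical; an easing curve would
 * work just as well). */
static double
zoom_at_frame (double zoom_start, double zoom_end, size_t i, size_t n)
{
  double t;

  if (n == 0)
    return zoom_end;

  t = (double) i / (double) n;
  return zoom_start * (1.0 - t) + zoom_end * t;
}
```

With these assumptions, a quarter-second animation needs about fifteen intermediate levels, each of which has to be ready in under 16.67 milliseconds.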

Mipmaps to the rescue!

A 112.5 megapixel or 12500×9000 image being smoothly zoomed in and out on an Intel Kaby Lake i7 with a HiDPI display. At the given window size, the best fit zoom level is approximately 10%. On a LoDPI display it would’ve been 5%. Note that simultaneously encoding the screencast consumes enough extra resources to make it stutter a bit. That’s not the case otherwise.

Photos uses GEGL to deal with images, and image pixels are held in GeglBuffers. Each GeglBuffer implicitly supports 8 mipmap levels. In other words, a GeglBuffer not only has the image pixels at the original resolution, or level zero, at which they were fed into the buffer, but it also caches progressively lower resolution representations of it. For example, at 50% or level one, at 25% or level two, and so on.

This means that we never downscale by more than a factor of two during an animation. If we want to zoom an image down to 30%, we take the first mipmap level, which is already cached at 50%, and from there on it’s just another 60% to reach the originally intended zoom target of 30%. Knowing that we won’t ever have to downscale by more than a factor of two in a sensitive code path is a relief.
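The level selection described above can be sketched in a few lines of plain C. The MipmapChoice type, the choose_mipmap function and the reading of "8 mipmap levels" as levels zero to seven are assumptions made for this example. GEGL picks the level internally, so this only illustrates the arithmetic.

```c
/* Assumption: 8 levels means levels 0 through 7. */
#define MAX_MIPMAP_LEVEL 7

typedef struct
{
  int level;        /* which cached level to read from */
  double residual;  /* scale still to be applied to that level */
} MipmapChoice;

/* Pick the smallest cached level that still has at least as many
 * pixels as the target zoom needs. Each level halves the resolution,
 * so stepping down one level doubles the remaining scale factor. */
static MipmapChoice
choose_mipmap (double zoom)
{
  MipmapChoice choice = { 0, zoom };

  while (choice.level < MAX_MIPMAP_LEVEL && choice.residual <= 0.5)
    {
      choice.level++;
      choice.residual *= 2.0;
    }

  return choice;
}
```

For a target of 30% this picks level one with a residual of 60%, matching the example in the text. As long as the loop has not run out of levels, the residual stays above 50%, which is the "never more than a factor of two" guarantee.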

But that’s still not enough.

It doesn’t take long to realize that the user barely catches a fleeting glimpse of the intermediate zoom levels. So, we cut corners by using the fast but low quality nearest neighbour sampler for those; and only use a higher quality box or bilinear sampler, depending on the specific zoom level, for the final image that the user will actually see.
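As a rough sketch in plain C, the decision looks something like the following. The enum and the function are hypothetical, and the rule used for the final frame (box when downscaling, bilinear otherwise) is an assumption made for illustration; the actual threshold lives in the Photos code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the samplers mentioned in the text. */
typedef enum
{
  SAMPLER_NEAREST,
  SAMPLER_BOX,
  SAMPLER_BILINEAR
} Sampler;

static Sampler
choose_sampler (bool animating, double zoom)
{
  /* Intermediate frames are only glimpsed, so take the cheapest. */
  if (animating)
    return SAMPLER_NEAREST;

  /* Final frame: assumed rule, box for downscaling and bilinear
   * for everything else. */
  return zoom < 1.0 ? SAMPLER_BOX : SAMPLER_BILINEAR;
}
```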

With this set-up in place, on the Intel Kaby Lake i7 machine used in the above video, it consistently takes less than 10 milliseconds for the intermediate frames, and less than 26 milliseconds for the final high quality frame. On an Intel Sandybridge i7 with a LoDPI display it takes less than 5 and 15 milliseconds respectively, because there are fewer pixels to pump. On average it’s a lot faster than these worst case figures. You can measure for yourselves using the GNOME_PHOTOS_DEBUG environment variable.

A lot of the above was enabled by Øyvind Kolås’ work on GEGL. Donate to his fund-raiser if you want to see more of this.

There’s some work to do for the HiDPI case, but it’s already fast enough to be perceived as smooth by a human. Look at the PhotosImageView widget if you are further interested.

An elastic zoom gesture

While GTK already comes with a gesture for recognizing pinch-to-zoom, it doesn’t exactly match the way we handle keyboard, mouse and touch pad events for zooming. Specifically, I wanted the image to snap back to its best fit size if the user tried to downscale beyond it using a touch screen. You can’t do that with any other input device, so it makes sense that it shouldn’t be possible with a touch screen either. The rationale is that Photos is optimized for photographic content, which is best viewed at its best fit or natural size.

For this elastic behaviour to work, the semantics of how GtkGestureZoom calculates the zoom delta had to be reworked. Every time the fingers change direction, the reference separation between the touch points, relative to which the delta is computed, must be reset to the current distance between them. Otherwise, if the fingers change direction after having moved past the snapping point, the image will abruptly jump instead of sticking to the fingers.
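The reference reset can be sketched in plain C without any GTK types. The ElasticZoom struct and its two functions are hypothetical names for this example, and it only covers the delta calculation, not the snapping itself; PhotosGestureZoom is the real thing.

```c
typedef struct
{
  double reference;  /* separation the delta is measured against */
  double previous;   /* separation seen on the last update */
  int direction;     /* +1 spreading, -1 pinching, 0 not known yet */
} ElasticZoom;

/* Call when the gesture starts. */
static void
elastic_zoom_begin (ElasticZoom *zoom, double distance)
{
  zoom->reference = distance;
  zoom->previous = distance;
  zoom->direction = 0;
}

/* Call on every update. Returns the scale delta relative to the
 * current reference separation. */
static double
elastic_zoom_update (ElasticZoom *zoom, double distance)
{
  int direction = (distance > zoom->previous) - (distance < zoom->previous);

  if (direction != 0)
    {
      /* The fingers turned around: measure from here on. */
      if (zoom->direction != 0 && direction != zoom->direction)
        zoom->reference = distance;

      zoom->direction = direction;
    }

  zoom->previous = distance;

  /* Guard against coincident touch points. */
  if (zoom->reference <= 0.0)
    return 1.0;

  return distance / zoom->reference;
}
```

The caller is expected to apply the returned delta to whatever zoom level was in effect when the reference was last set. After a reversal the delta starts again from 1.0, so the image continues from where it is instead of jumping.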

The image refuses to become smaller than the best fit zoom level and snaps back. Note that simultaneously encoding the screencast consumes enough extra resources to make it stutter a bit. That’s not the case otherwise.

With some help from Carlos Garnacho, we have a custom gesture that hooks into GtkGestureZoom’s begin and update signals to implement the above. The custom gesture is slightly awkward because GtkGestureZoom is a final class and can’t be subclassed, but it’s not too bad for a prototype. It’s called PhotosGestureZoom, in case you want to look it up.

The screencasts feature a 112.5 megapixel or 12500×9000 photo of hot air balloons at ClovisFest, taken by Soulmates Photography / Daniel Street and available under the Creative Commons Attribution-Share Alike 3.0 Unported license.

The touch points were recorded in an X session with a tool written by Carlos Garnacho.

Written by Debarshi Ray

8 February, 2019 at 18:36

Posted in C, Fedora, GEGL, GNOME, GTK+, Photos

2 Responses


  1. Cool! Thanks!

    Lapo Calamandrei

    8 February, 2019 at 20:28

