We’re used to hearing about how applications such as Flickr and Google Earth are providing revolutionary new ways of looking at digital images.
But a technology announced by Microsoft in August at the ACM Siggraph conference (the annual conference of the Association for Computing Machinery's Special Interest Group on Computer Graphics and Interactive Techniques) looks set to take the prize for the most innovative recent development in digital image software.
Photosynth takes a collection of geographically related images and arranges them in a 3D-modelled space so you can navigate through them.
It is, appropriately enough, a synthesis of three software technologies that provides a new kind of environment for browsing photos. Those technologies are image-based modelling, image-based rendering and image browsing.
To put it another, perhaps slightly glib way, Microsoft has rolled together technologies from computer gaming, panorama stitching and photo organising to create an entirely new and original way of looking at digital photos.
Photosynth doesn’t seek to produce a seamless, technically perfect 360° panoramic vista of a scene in the way that panorama stitching software such as Realviz Stitcher or Pano Tools does. Instead, it positions individual images within a 3D model that allows you to navigate between them and take a closer look at whatever interests you.
The software engineers behind Photosynth are Noah Snavely and Steven M Seitz of the University of Washington’s Graphics and Imaging Laboratory (Grail) and Microsoft’s Richard Szeliski.
In their paper ‘Photo Tourism: Exploring Photo Collections in 3D’, they explain that the object is “not to synthesise a photo-realistic view of the world from all viewpoints per se, but to browse a specific collection of photographs in a 3D spatial context.”
How does it work?
Like panorama stitching software, Photosynth computes the location, orientation and geometry of images in a scene by comparing matching features in pairs of overlapping images. It even uses the same Scale-Invariant Feature Transform (SIFT) algorithm that is used in some panorama stitching software.
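To give a feel for what comparing matching features involves, here is a minimal Python sketch. It assumes each feature has already been reduced to a list of numbers (a descriptor), and it uses a nearest-neighbour search with a ratio test, a common companion to SIFT that the article itself does not describe. The function names, the 0.8 threshold and the toy three-value descriptors are illustrative assumptions; real SIFT descriptors have 128 values.

```python
import math

def distance(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_features(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where feature i in image A matches feature j in image B.

    A match is kept only if the closest candidate is clearly closer than the
    second closest (the ratio test); ambiguous features are discarded.
    """
    matches = []
    for i, d in enumerate(desc_a):
        ranked = sorted((distance(d, e), j) for j, e in enumerate(desc_b))
        if len(ranked) < 2:
            continue
        (best, j), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches

# Toy descriptors, invented for illustration only.
image_a = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.5, 0.5, 0.5)]
image_b = [(0.0, 0.9, 0.1), (0.9, 0.1, 0.0), (0.0, 0.0, 1.0)]
matches = match_features(image_a, image_b)
```

In this toy example the first two features in image A each find a clear partner in image B, while the third is roughly equidistant from everything and is dropped as ambiguous.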
Next comes an optimisation process that maps the position of each image relative to its neighbour, starting with a pair of images and incrementally adding more images and re-running the optimisation algorithm.
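The incremental part of that process can be sketched in a few lines of Python. This is only an outline of the ordering logic, under stated assumptions: the optimisation itself is replaced by a hypothetical callback named optimise, the min_matches threshold of 20 is invented, and the match counts are made-up numbers.

```python
def incremental_registration(match_counts, min_matches=20, optimise=None):
    """Register images one at a time, re-running the optimiser after each.

    match_counts maps a frozenset of two image names to the number of
    matching features the two images share. optimise is a hypothetical
    stand-in for the real optimisation step.
    """
    def shared(a, b):
        return match_counts.get(frozenset((a, b)), 0)

    def support(name, model):
        # Total matches between one image and everything already registered.
        return sum(shared(name, other) for other in model)

    images = sorted({name for pair in match_counts for name in pair})
    # Start with the pair of images that shares the most matches.
    seed = max(match_counts, key=match_counts.get)
    registered = sorted(seed)
    remaining = [name for name in images if name not in registered]
    if optimise:
        optimise(list(registered))
    while remaining:
        best = max(remaining, key=lambda name: support(name, registered))
        if support(best, registered) < min_matches:
            break  # nothing left overlaps the model well enough
        registered.append(best)
        remaining.remove(best)
        if optimise:
            optimise(list(registered))
    return registered, remaining

# Invented match counts for five images.
counts = {
    frozenset(("a", "b")): 150,
    frozenset(("b", "c")): 90,
    frozenset(("c", "d")): 40,
    frozenset(("a", "e")): 5,
}
model_sizes = []
registered, unregistered = incremental_registration(
    counts, optimise=lambda model: model_sizes.append(len(model))
)
```

Here the optimiser is called on models of two, three and then four images, and image "e", which shares too few features with the rest, is left out of the model.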
The final step is to align the model with a geo-referenced image, a satellite map for example, or a digital elevation map such as those used by Google Earth.
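As an illustration of what aligning a model with a map can mean, the Python sketch below fits a 2D scale, rotation and shift from just two landmarks whose positions are known in both the model and the map. It assumes the model has already been flattened to a top-down view, and it treats each point as a complex number for brevity. This is a simplification chosen for the example, not a description of how Photosynth performs the alignment; all the coordinates are invented.

```python
import cmath

def fit_similarity(model_a, model_b, map_a, map_b):
    """Fit the transform that takes two model points onto two map points.

    Points are complex numbers (x + yj). Returns (s, t) such that
    map_point = s * model_point + t. abs(s) is the scale factor and
    cmath.phase(s) is the rotation in radians.
    """
    s = (map_b - map_a) / (model_b - model_a)
    t = map_a - s * model_a
    return s, t

def to_map(point, s, t):
    """Convert a model coordinate into a map coordinate."""
    return s * point + t

# Two landmarks identified in both the model and the map (invented values).
s, t = fit_similarity(0 + 0j, 1 + 0j, 100 + 200j, 100 + 202j)
scale = abs(s)
rotation = cmath.phase(s)
camera_on_map = to_map(0 + 1j, s, t)
```

Once s and t are known, every camera and point in the model can be placed on the map with to_map.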
In one sense, at least, Photosynth’s job is easier than a panorama stitcher’s, because it doesn’t have to produce an exact seamless match. In panorama stitching, however, a lot of the variables are eliminated by using a known camera and lens combination and by precisely controlling the movement of the camera between shots.
The material Photosynth has to work with will have been shot handheld on anything from a digital SLR with a telephoto lens to a cameraphone. Matching and accurately positioning these images is a vast computational undertaking.
As you’d expect, this is not a process that happens in real time, or anything like it. The optimisation takes the bulk of the time, as it involves multiple iterations that slow down with the addition of each new image and as more images share matching points.
In tests using images of a section of the Great Wall of China, shot with the same camera and lens over a short period of time, the render time for a set of 120 photos, of which 82 were registered (that is, the software was able to process them), was several hours. A set of 2,635 ‘uncontrolled’ images obtained from Flickr (of which 597 were registered) took several days.
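Some rough arithmetic shows why larger collections are so much more demanding. The Python snippet below counts how many image pairs there are to compare if every image is checked against every other. That exhaustive comparison is an assumption made for illustration (the article does not say whether pairs are pruned), and it concerns the matching stage rather than the optimisation that takes most of the time.

```python
def image_pairs(n):
    """Number of distinct pairs that can be formed from n images."""
    return n * (n - 1) // 2

for n in (120, 2635):
    print(n, "images:", image_pairs(n), "pairs")
# 120 images: 7140 pairs
# 2635 images: 3470295 pairs
```

The Flickr set is about 22 times larger than the Great Wall set, but it contains nearly 500 times as many pairs.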
Now watch the demo
Although there is currently no Photosynth application available, you can view a live Java applet demo of the Washington Grail research group’s Photo Tourism system, on which Photosynth is based.
The demo displays the 3D space as a ‘point cloud’ with the image frusta overlaid. And if you’re wondering what a frustum is, in this case it is a 3D pyramid shape that indicates a camera’s position within the 3D space, the direction in which it is pointed and its angle of view.
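For the curious, here is a small Python sketch of how such a pyramid could be worked out from a camera's position, direction and angle of view. It makes simplifying assumptions that are not taken from the demo: the camera is only rotated about the vertical axis (no tilt or roll), and the function and parameter names are invented for the example.

```python
import math

def frustum_corners(position, yaw_deg, h_fov_deg, aspect, depth):
    """Return the four corners of the base of a camera's viewing pyramid.

    position  -- (x, y, z) of the camera, which is the apex of the pyramid
    yaw_deg   -- heading about the vertical (y) axis; 0 looks along +z
    h_fov_deg -- horizontal angle of view in degrees
    aspect    -- image width divided by image height
    depth     -- distance from the camera to the base of the pyramid
    """
    half_w = depth * math.tan(math.radians(h_fov_deg) / 2)
    half_h = half_w / aspect
    yaw = math.radians(yaw_deg)
    corners = []
    for sx, sy in ((-1, 1), (1, 1), (1, -1), (-1, -1)):
        # Corner in the camera's own frame, looking along +z.
        cx, cy, cz = sx * half_w, sy * half_h, depth
        # Rotate about the vertical axis, then move to the camera position.
        wx = cx * math.cos(yaw) + cz * math.sin(yaw)
        wz = -cx * math.sin(yaw) + cz * math.cos(yaw)
        corners.append((position[0] + wx, position[1] + cy, position[2] + wz))
    return corners

# A camera at the origin with a 90-degree angle of view and a square image.
front = frustum_corners((0.0, 0.0, 0.0), 0, 90, 1.0, 1.0)
# A wider-format camera elsewhere in the scene, turned 90 degrees.
side = frustum_corners((10.0, 0.0, 5.0), 90, 60, 1.5, 2.0)
```

Drawing lines from the camera position to each of the four corners, and joining the corners to each other, gives the pyramid outline. A wider angle of view produces a broader base.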
Clicking on any one of the cameras displays the image at that location. Just as interesting as the final view, if not more so, is the journey, which flies you smoothly through the 3D model, passing other cameras on the way. Transitions from one camera to another are very slick, incorporating smooth movement through the 3D space as well as a dissolve.
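A transition of this kind can be approximated by moving the viewpoint along a straight line between the two cameras while fading one photo into the other. The Python sketch below shows the idea under assumptions of its own: the easing curve (a standard smoothstep) and the function names are illustrative, and a real viewer would also have to blend the cameras' orientation and angle of view.

```python
def smoothstep(t):
    """Ease in and out: 0 at t=0, 1 at t=1, with a gentle start and finish."""
    t = max(0.0, min(1.0, t))
    return t * t * (3 - 2 * t)

def transition_frame(start_pos, end_pos, t):
    """Return (viewpoint position, opacity of the destination photo).

    t runs from 0 (at the starting camera) to 1 (at the destination camera).
    """
    eased = smoothstep(t)
    position = tuple(a + (b - a) * eased for a, b in zip(start_pos, end_pos))
    return position, eased

# Ten evenly spaced steps of a flight between two invented camera positions.
frames = [transition_frame((0.0, 1.5, 0.0), (4.0, 1.5, 8.0), i / 10) for i in range(11)]
```

Each entry in frames gives where to place the viewpoint and how strongly to show the destination photo over the starting one.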
A step-back button does just that, depicting a wider field of view from which you can select alternative cameras. The Photosynth application will feature ‘geometric browsing tools’ which will allow you to move left and right and to view parts of the scene at different scales.
There will also be a ‘similar’ button that will display alternative images of the same scene, for example at different times of the day or year, or even over longer time periods, enabling historical comparisons.
Photosynth’s zooming and its ability to display high-resolution detail will surpass conventional pixel-based viewers. The multi-image composition of scenes makes it possible to drill down to fine detail using new images, rather than enlarging existing ones until the pixels look like breeze blocks.
This process can happen in real time, even on narrowband connections, thanks to a technology called Seadragon, which Microsoft acquired in February this year.
