In conjunction with CVPR 2013, Portland, Oregon, June 23, 2013
The recent availability of large-scale geo-tagged images on social and photo-sharing websites has spurred growing interest in developing automatic methods for image geo-localization. The problem of visual analysis and geo-localization of large-scale imagery arises in a variety of real-world applications. Consumers of imagery may want to determine when and where an image or video was taken, who is in it, what objects appear in the depicted scene, and how they are related to each other. Local government agencies may be interested in using large-scale imagery to automatically extract and index useful geographic and geological features and their distributions in a region of interest. Similarly, local businesses may use content statistics to target their marketing based on the ‘where’, ‘what’, and ‘when’ that can be extracted automatically during visual analysis and geo-localization of large-scale imagery.
We believe research on visual analysis and geo-localization of large-scale imagery can benefit greatly from bringing together researchers from computer vision, computer graphics, photogrammetry, computational optimization, geographic information systems, and other related fields. Such a gathering will lay the foundation for an integrated approach to building earth-scale models and exploiting them for automatic geo-localization. In addition, complementary viewpoints and techniques from these diverse areas will provide further insight into the problem domain and spur new research directions. The focus will be on exchanging ideas on how to develop visual analysis and geo-localization capabilities that make use of the vast amounts of contextual information available on the internet. As a byproduct, the relevant communities will also benefit, as the workshop will lead to improved methods for data-driven modeling and analysis of large-scale imagery. Papers describing novel and original research are solicited in areas related to visual analysis and geo-localization of large-scale imagery. Topics of interest include, but are not limited to:
Papers should describe original and unpublished work on the above or closely related topics. Each paper will receive double-blind reviews, moderated by the workshop chairs. Authors should take the following into account:
The author kit provides a LaTeX2e template for submissions, and an example paper to demonstrate the format. Please refer to this example for detailed formatting instructions.
A paper ID will be allocated to you during submission. Please replace the asterisks in the example paper with your paper's own ID before uploading your file.
March 22, 2013
April 22, 2013
May 3, 2013
June 23, 2013
Luc Van Gool
SRI Sarnoff, USA
B. S. Manjunath
Amir Roshan Zamir
Columbia Univ., USA
SRI Sarnoff, USA
Washington Univ., USA
Univ. of Kentucky, USA
Indiana Univ., USA
Cornell Univ., USA
Georgia Tech, USA
Google Research, USA
BAE Systems, USA
Brown University, USA
Cornell University, USA
More invited talk title and abstract details to be announced.
Visual geo-localization in urban and mountainous regions
Marc Pollefeys, ETH Zurich
Abstract: In this talk I will present our results on geo-localization from images. First, I will focus on urban areas, where facades provide rich visual information that can be used effectively for geo-localization. Then we will move to more challenging terrain, in particular mountainous areas, and show that visual geo-localization can be achieved there as well. I will also discuss some of our efforts in indoor navigation assistance.
Short bio: Marc Pollefeys has been a full professor in the Dept. of Computer Science of ETH Zurich since 2007, where he is the head of the Institute for Visual Computing and leads the Computer Vision and Geometry lab. He also remains associated with the Dept. of Computer Science of the University of North Carolina at Chapel Hill, where he started as an assistant professor in 2002 and became an associate professor in 2005. Before that he was a postdoctoral researcher at the Katholieke Universiteit Leuven in Belgium, where he also received his M.S. and Ph.D. degrees in 1994 and 1999, respectively. His main area of research is computer vision. One of his main research goals is to develop flexible approaches to capture visual representations of real-world objects, scenes, and events. Dr. Pollefeys has received several prizes for his research, including a Marr Prize. He is the author or co-author of more than 200 peer-reviewed publications. He is the General Chair for the European Conference on Computer Vision 2014 (ECCV) and was a Program Co-Chair for the IEEE Conference on Computer Vision and Pattern Recognition 2009 (CVPR) and several 3D conferences. Prof. Pollefeys is on the Editorial Board of the International Journal of Computer Vision and was an associate editor for the IEEE Transactions on Pattern Analysis and Machine Intelligence. He is a Fellow of the IEEE.
Learning representations for visual place recognition
Josef Sivic, INRIA/Ecole Normale Superieure, Paris
Abstract: We consider the problem of visual place recognition: given a query image of a particular street or building facade, the objective is to find one or more images in a geotagged database depicting the same place and to estimate the camera location of the query.
First, we cast the place recognition problem as a classification task and use the available geotags to train a classifier for each location in the database, in a manner similar to per-exemplar SVMs in object recognition.
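The per-exemplar classification idea can be sketched roughly as follows. This is a simplified illustration only, not the implementation from the talk: it trains one linear hinge-loss classifier per geotagged database image by plain subgradient descent, and all function names and parameters here are hypothetical.

```python
import numpy as np

def train_exemplar_svm(pos, negs, lr=0.1, reg=1e-3, epochs=200):
    """Train a linear hinge-loss classifier with a single positive
    exemplar against many negatives (the per-exemplar SVM idea),
    using simple subgradient descent. Illustrative only."""
    w = np.zeros(pos.shape[0])
    b = 0.0
    X = np.vstack([pos[None, :], negs])
    y = np.array([1.0] + [-1.0] * len(negs))
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) < 1:       # inside the margin: hinge update
                w = (1 - lr * reg) * w + lr * yi * xi
                b += lr * yi
            else:                            # correct with margin: only shrink
                w = (1 - lr * reg) * w
    return w, b

def localize(query, classifiers, locations):
    """Score the query descriptor against every exemplar classifier and
    return the geotag of the highest-scoring exemplar."""
    scores = [w @ query + b for (w, b) in classifiers]
    return locations[int(np.argmax(scores))]
```

In this toy setting each database image contributes one (classifier, geotag) pair, and a query inherits the location of the exemplar whose classifier responds most strongly.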
Second, we describe a representation of repeated image structures based on a simple modification of their weights in the bag-of-visual-words model, and show that appropriate weighting of repeated image elements can significantly improve place recognition performance.
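One simple way to realize such down-weighting of repeated elements is to damp the raw visual-word counts (for example with a square root) before normalizing the histogram, so that bursty repeated structures such as facade tiles do not dominate the representation. This is an illustrative sketch under that assumption, not the exact weighting scheme from the talk.

```python
import numpy as np

def bow_with_repeat_damping(word_ids, vocab_size, damp=np.sqrt):
    """Build an L2-normalized bag-of-visual-words histogram, damping
    repeated words (sqrt of the raw count by default) so that highly
    repetitive image structures contribute less."""
    hist = np.bincount(word_ids, minlength=vocab_size).astype(float)
    hist = damp(hist)
    n = np.linalg.norm(hist)
    return hist / n if n > 0 else hist

def cosine(a, b):
    """Cosine similarity of two already-normalized histograms."""
    return float(a @ b)
```

With sqrt damping, a word that fires 100 times (a repeated facade element) contributes weight 10 rather than 100, which keeps a query that sees the element only once from being swamped by the repetition.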
Finally, we describe a compact representation of 3D scenes in which an entire architectural site is represented by a small set of discriminative visual elements that are automatically learnt from rendered views. We demonstrate that the learnt 3D visual elements can be used to match and localize historical and non-photographic imagery where standard image representations based on local invariant features fail.
Results will be shown on collections of Internet images from Google street-view and Flickr as well as historical and non-photographic imagery.
Joint work with M. Aubry, P. Gronat, G. Obozinski, M. Okutomi, T. Pajdla, B. Russell and A. Torii.
Short bio: Josef Sivic received a degree from the Czech Technical University, Prague, in 2002 and the PhD degree from the University of Oxford in 2006. His thesis, on efficient visual search of images and videos, was awarded the British Machine Vision Association 2007 Sullivan Thesis Prize and was shortlisted for the British Computer Society 2007 Distinguished Dissertation Award. His research interests include visual search and object recognition applied to large image and video collections. After spending six months as a postdoctoral researcher in the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology, he now holds a permanent position as an INRIA researcher at the Departement d’Informatique, Ecole Normale Superieure, Paris. He has published over 40 scientific publications and serves as an Associate Editor for the International Journal of Computer Vision.
Noah Snavely, Cornell University
Abstract: At the last workshop on geo-localization, at ECCV 2012, an interesting theme emerged around the representation of places -- from sets of images, to images with relations, to 3D models, to sets of geographically distinctive elements. In our recent work, we have been exploring graphs on images -- where images are nodes joined by visual connections -- as a more structured way to represent locations. We show how combining these graphs with discriminative learning techniques can yield better performance for location recognition.
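A toy version of such an image graph might look like the following sketch. Here the "visual connections" are approximated by thresholded cosine similarity between global image descriptors; that is an assumption made purely for illustration (real connections would come from feature matching), and all names and parameters are hypothetical.

```python
import numpy as np

def build_image_graph(descriptors, thresh=0.9):
    """Image graph: nodes are database images (one global descriptor
    each); an edge links two images whose normalized descriptors have
    cosine similarity above a threshold."""
    D = np.stack([np.asarray(d, float) / np.linalg.norm(d) for d in descriptors])
    sim = D @ D.T
    n = len(descriptors)
    edges = {(i, j) for i in range(n) for j in range(i + 1, n) if sim[i, j] > thresh}
    return D, edges

def locate(query, D, labels):
    """Attach a query to its most similar graph node and inherit the
    node's location label."""
    q = np.asarray(query, float)
    q /= np.linalg.norm(q)
    return labels[int(np.argmax(D @ q))]
```

In this toy form the graph simply clusters visually connected views of the same place; the structured representation in the talk additionally learns which connections are discriminative for recognizing a location.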
Short bio: Noah Snavely is an assistant professor of Computer Science at Cornell University, where he has been on the faculty since 2009. He received a Ph.D. in Computer Science and Engineering from the University of Washington in 2008. Noah works in computer graphics and computer vision, with a particular interest in using vast amounts of imagery from the Internet to reconstruct and visualize our world in 3D. His work was the basis for Microsoft's Photosynth, a tool for building 3D visualizations from photo collections that has been used by many thousands of people. Noah is the recipient of a Microsoft New Faculty Fellowship and an NSF CAREER Award, and has been named to Technology Review's TR35 list.