Second International Workshop on Visual Analysis and Geo-Localization of Large-Scale Imagery

In conjunction with CVPR 2013, Portland, Oregon, June 23, 2013


The recent availability of large-scale geo-tagged images on social and photo-sharing websites has spurred growing interest in automatic methods for image geolocalization. The problem of visual analysis and geo-localization of large-scale imagery arises in a variety of real-world applications. Consumers of imagery may want to determine when and where an image or video was taken, who is in it, what objects appear in the depicted scene, and how those objects relate to each other. Local government agencies may want to use large-scale imagery to automatically obtain and index useful geographic and geological features and their distributions in a region of interest. Similarly, local businesses may use content statistics to target their marketing based on the ‘where’, ‘what’, and ‘when’ that can be extracted automatically during visual analysis and geo-localization of large-scale imagery.

We believe research on visual analysis and geo-localization of large-scale imagery can benefit greatly from bringing together researchers from computer vision, computer graphics, photogrammetry, computational optimization, geographic information systems, and other related fields. Such a gathering will lay a foundation for an integrated approach to building earth-scale models and exploiting them for automatic geolocalization. In addition, complementary viewpoints and techniques from these diverse areas will provide insight into the problem domain and spur new research directions. The focus will be on exchanging ideas on how to develop visual analysis and geo-localization capabilities that make use of the vast amount of contextual information available on the internet. As a byproduct, the relevant communities will also benefit, as this workshop will lead to improved methods for data-driven modeling and analysis of large-scale imagery. Papers describing novel and original research are solicited in areas related to visual analysis and geo-localization of large-scale imagery. Topics of interest include, but are not limited to:

Visual Feature and Information Extraction from Large-Scale Imagery
Understanding and Modeling Uncertainties in Visual and Geospatial Data
Semantic Generalization of Visual and Geospatial Data
Representation, Indexing, Storage, and Analysis of City-to-Earth Scale Models
Automated 3D Modeling Pipelines for Complex Large-Scale Architectures
Integrated Processing of Point Clouds, Image, and Video Data
Multi-Modal Visual Sensor Data Fusion
Control Mechanisms that aid in Visual Analysis and Geo-Localization
Rendering and Visualization of Large-Scale Models, Semantic Labels and Imagery
Applications of Visual Analysis and Geo-Localization of Large-Scale Imagery
Datasets and Model Validation Methods for Analysis and Geo-Localization Research

Call for Papers

SUBMISSION


Papers should describe original and unpublished work on the above or closely related topics. Each paper will receive double-blind reviews, moderated by the workshop chairs. Authors should take into account the following:

All papers must be written in English and submitted in PDF format.
Papers must be submitted online through the CVPR submission CMT system.
The maximum paper length is 6 pages, with the option of purchasing 2 additional pages. The workshop paper format guidelines are the same as those for main conference papers.
Submissions will be rejected without review if they: contain more than 8 pages, violate the double-blind policy or violate the dual-submission policy.
Authors will have the opportunity to submit up to 10MB of supporting material.

The author kit provides a LaTeX2e template for submissions, and an example paper to demonstrate the format. Please refer to this example for detailed formatting instructions.

Author Kit (gzipped tar file): cvpr2013AuthorKit.tgz
Author Kit (zip file): cvpr2013AuthorKit.zip

A paper ID will be allocated to you during submission. Please replace the asterisks in the example paper with your paper's own ID before uploading your file.

Important Dates

Submissions deadline: March 22, 2013
Author notification: April 22, 2013
Camera-ready: May 3, 2013
Workshop: June 23, 2013

COMMITTEES


General Chairs

Mubarak Shah, UCF, USA
Luc Van Gool, ETH, Switzerland

Program Chairs

Asaad Hakeem, ObjectVideo, USA
Jan-Michael Frahm, UNC, USA
Alexei Efros, CMU, USA
Khurram Shafique, ObjectVideo, USA
Omar Javed, SRI Sarnoff, USA

Program Committee

Shih-Fu Chang, Columbia Univ., USA
Rama Chellappa, UMD, USA
Saad Ali, SRI Sarnoff, USA
Robert Pless, Washington Univ., USA
Andrew Bagnell, CMU, USA
Yaser Sheikh, CMU, USA
Nathan Jacobs, Univ. of Kentucky, USA
Himaanshu Gupta, ObjectVideo, USA
Arslan Basharat, Kitware, USA
Serge Belongie, UCSD, USA
David Crandall, Indiana Univ., USA
Zeeshan Rasheed, ObjectVideo, USA
Anthony Hoogs, Kitware, USA
Jana Kosecka, GMU, USA
Daniel Huttenlocher, Cornell Univ., USA
B. S. Manjunath, UCSB, USA
Antonio Torralba, MIT, USA
Alper Yilmaz, OSU, USA
Grant Schindler, Georgia Tech, USA
Ram Nevatia, USC, USA
Mei Han, Google Research, USA
Yanlin Guo, SAIC, USA
Victor Tom, BAE Systems, USA
Michael Tarnowski, ARA, USA
Amir Roshan Zamir, UCF, USA

INVITED TALKS


Marc Pollefeys, ETH, Switzerland
Cordelia Schmid, INRIA, France
James Hays, Brown University, USA
Josef Sivic, INRIA, France
Noah Snavely, Cornell University, USA
Yang Song, Google Research

Titles and abstracts for the remaining invited talks are to be announced.

Visual geo-localization in urban and mountainous regions
Marc Pollefeys, ETH Zurich

Abstract: In this talk I will present our results on geo-localization from images. First, I will focus on urban areas, where facades provide rich visual information that can be used effectively for geo-localization. Then, we will move to more challenging terrain, in particular mountainous areas, and show that visual geo-localization can be achieved there as well. I will also talk about some efforts in indoor navigation assistance.

Short bio: Marc Pollefeys has been a full professor in the Dept. of Computer Science of ETH Zurich since 2007, where he is the head of the Institute for Visual Computing and leads the Computer Vision and Geometry lab. He also remains associated with the Dept. of Computer Science of the University of North Carolina at Chapel Hill, where he started as an assistant professor in 2002 and became an associate professor in 2005. Before this he was a postdoctoral researcher at the Katholieke Universiteit Leuven in Belgium, where he also received his M.S. and Ph.D. degrees in 1994 and 1999, respectively. His main area of research is computer vision. One of his main research goals is to develop flexible approaches to capture visual representations of real-world objects, scenes and events. Dr. Pollefeys has received several prizes for his research, including a Marr prize. He is the author or co-author of more than 200 peer-reviewed publications. He is the General Chair for the European Conference on Computer Vision 2014 (ECCV) and was a Program Co-Chair for the IEEE Conference on Computer Vision and Pattern Recognition 2009 (CVPR) and several 3D conferences. Prof. Pollefeys is on the Editorial Board of the International Journal of Computer Vision and was an associate editor for the IEEE Transactions on Pattern Analysis and Machine Intelligence. He is a Fellow of the IEEE.

Learning representations for visual place recognition
Josef Sivic, INRIA/Ecole Normale Superieure, Paris

Abstract: We consider the problem of visual place recognition: given a query image of a particular street or building facade, the objective is to find one or more images in a geotagged database depicting the same place and to estimate the camera location of the query.

First, we cast the place recognition problem as a classification task and use the available geotags to train a classifier for each location in the database, in a similar manner to per-exemplar SVMs in object recognition.

Second, we describe a representation of repeated image structures based on a simple modification of their weights in the bag-of-visual-word model and show that appropriate weighting of repeated image elements can significantly improve place recognition performance.

Finally, we describe a compact representation of 3D scenes, where an entire architectural site is represented by a small set of discriminative visual elements that are automatically learnt from rendered views. We demonstrate the learnt 3D visual elements can be used to match and localize historical and non-photographic imagery where the standard image representations based on local invariant features fail.

Results will be shown on collections of Internet images from Google Street View and Flickr as well as historical and non-photographic imagery.

Joint work with M. Aubry, P. Gronat, G. Obozinski, M. Okutomi, T. Pajdla, B. Russell and A. Torii.

Short bio: Josef Sivic received a degree from the Czech Technical University, Prague, in 2002 and the PhD degree from the University of Oxford in 2006. His thesis, dealing with efficient visual search of images and videos, was awarded the British Machine Vision Association 2007 Sullivan Thesis Prize and was shortlisted for the British Computer Society 2007 Distinguished Dissertation Award. His research interests include visual search and object recognition applied to large image and video collections. After spending six months as a postdoctoral researcher in the Computer Science and Artificial Intelligence Laboratory at the Massachusetts Institute of Technology, he currently holds a permanent position as an INRIA researcher at the Departement d’Informatique, Ecole Normale Superieure, Paris. He has authored over 40 scientific publications and serves as an Associate Editor for the International Journal of Computer Vision.

Place Graphs
Noah Snavely, Cornell University

Abstract: At the last workshop on geolocation at ECCV 2012, an interesting theme emerged on the representation of places -- from sets of images, to images with relations, to 3D models, to sets of geographically distinctive elements. In our recent work, we have been exploring graphs on images -- where images are nodes with visual connections -- as a more structured way to represent locations. We show how combining these graphs with discriminative learning techniques can yield better performance for location recognition.

Short bio: Noah Snavely is an assistant professor of Computer Science at Cornell University, where he has been on the faculty since 2009. He received a Ph.D. in Computer Science and Engineering from the University of Washington in 2008. Noah works in computer graphics and computer vision, with a particular interest in using vast amounts of imagery from the Internet to reconstruct and visualize our world in 3D. His work was the basis for Microsoft's Photosynth, a tool for building 3D visualizations from photo collections that has been used by many thousands of people. Noah is the recipient of a Microsoft New Faculty Fellowship and an NSF CAREER Award, and has been recognized by Technology Review's TR35.

PROGRAM


09:00-09:05 Opening remarks from Workshop Organizers

09:05-09:40 Invited Speaker - Noah Snavely (Cornell)

09:40-10:15 Invited Speaker - Marc Pollefeys (ETH)

10:15-10:45 Break

10:45-11:20 Invited Speaker - James Hays (Brown)

11:20-11:55 Invited Speaker - Cordelia Schmid (INRIA)

12:00-13:30 Lunch

13:30-13:50 Oral Presentation - 3D Point Cloud Reduction using Mixed-integer Quadratic Programming

13:50-14:10 Oral Presentation - User-Driven Geolocation of Untagged Desert Imagery Using Digital Elevation Models

14:10-14:45 Invited Speaker - Yang Song (Google Research)

14:45-15:20 Invited Speaker - Josef Sivic (INRIA)

15:25-15:55 Break

16:00-17:00 Panel Discussion - Rick Szeliski, Avideh Zakhor, Jiri Matas, Jana Kosecka

PAST EVENTS


First Workshop on Visual Analysis and Geo-Localization of Large Scale Imagery (with ECCV 2012)
