Plenary Speakers

Monday, December 9, 09:00 - 10:00

Compression for Visual Search and Augmented Reality

Bernd Girod, Stanford University

Visual search and augmented reality afford a host of intriguing new research problems for the picture coding community. For object recognition on mobile devices, a visual database is typically stored in the cloud. Hence, for a visual comparison, information must be either uploaded from, or downloaded to, the mobile device over a wireless link. The response time of the system critically depends on how much information must be transferred in both directions, and efficient compression is the key to a good user experience. We review recent advances in mobile visual search, using compact feature descriptors, and show that dramatic speed-ups and power savings are possible by considering recognition, compression, and retrieval jointly. For augmented reality applications, where image matching is performed continually at video frame rates, interframe coding of SIFT descriptors achieves bit-rate reductions of 1-2 orders of magnitude relative to advanced video coding techniques. We will use real-time implementations for different example applications, such as recognition of landmarks, media covers, or printed documents, to show the benefits of implementing computer vision algorithms on the mobile device, in the cloud, or both.

Bernd Girod has been Professor of Electrical Engineering in the Information Systems Laboratory of Stanford University, California, since 1999. Previously, he was a Professor in the Electrical Engineering Department of the University of Erlangen-Nuremberg. His current research interests are in the area of networked media systems. He has published over 500 conference and journal papers and 6 books, receiving the EURASIP Signal Processing Best Paper Award in 2002, the IEEE Multimedia Communication Best Paper Award in 2007, the EURASIP Image Communication Best Paper Award in 2008, the EURASIP Signal Processing Most Cited Paper Award in 2008, as well as the EURASIP Technical Achievement Award in 2004 and the Technical Achievement Award of the IEEE Signal Processing Society in 2011. As an entrepreneur, Professor Girod has been involved in several startup ventures, among them Polycom, Vivo Software, 8x8, and RealNetworks. He received an Engineering Doctorate from the University of Hannover, Germany, and an M.S. degree from the Georgia Institute of Technology. Prof. Girod is a Fellow of the IEEE, a EURASIP Fellow, and a member of the German National Academy of Sciences (Leopoldina). He currently serves Stanford’s School of Engineering as Senior Associate Dean for Online Learning and Professional Development.

Tuesday, December 10, 09:00 - 10:00

Data-adaptive Filtering and the State of the Art in Image Processing

Peyman Milanfar, University of California, Santa Cruz and Google X

Recent approaches to processing and restoration of images and video have brought together several powerful data-adaptive methods from different fields of work. Examples include Moving Least Squares (from computer graphics); the Bilateral Filter and Anisotropic Diffusion (from computer vision); Boosting and Spectral Methods (from machine learning); Non-local Means (from signal processing); Bregman Iterations and Graph Theory (from applied mathematics); and Functional Gradient Descent, Kernel Regression, and Iterative Scaling (from statistics). These approaches are deeply connected, and in this talk I will present a framework for understanding some common underpinnings of these methods. This framework yields new insights and a broader understanding of how these diverse methods interrelate, leading to state-of-the-art results.

Peyman Milanfar received the BS degree in electrical engineering and mathematics from the University of California, Berkeley, and the MS and PhD degrees in electrical engineering from the Massachusetts Institute of Technology. Until 1999, he was at SRI International, Menlo Park, California, and a consulting Professor of CS at Stanford. He won a US National Science Foundation CAREER award in 2000, and the best paper award from the IEEE Signal Processing Society in 2010. He is on leave from his position as Professor of Electrical Engineering at the University of California, Santa Cruz. He is currently a scientist at Google X, where he works on Google Glass and, more generally, on computational photography. He is a Fellow of the IEEE.

Wednesday, December 11, 09:00 - 10:00

Scalable Media Coding: Applications and New Directions

David Taubman, University of New South Wales

Scalable media compression algorithms are desirable because they allow the content to be compressed without prior knowledge of the set of bit-rates and/or resolutions at which it is to be distributed and decoded. Scalable image compression technologies are well understood and highly competitive with non-scalable variants. In fact, applications that rely upon the scalability and interactive accessibility features of the highly scalable image compression standard JPEG2000 have been expanding recently, and these include some video applications. Highly scalable video compression itself is fundamentally more challenging than scalable image compression; one major reason for this is the lack of highly scalable motion fields. In this presentation, the speaker will describe scalable representations for motion and depth fields, unified through a generic, fully scalable description of the boundary discontinuities that typically dominate such auxiliary media types. The talk will highlight what has been achieved so far, but also draw attention to some of the open research problems that are of particular interest, building towards a highly scalable media coding framework that is as free as possible from artificial constructs such as blocks.

David Taubman is with the School of Electrical Engineering and Telecommunications at the University of New South Wales, where he heads the Telecommunications Research Group. Before joining UNSW at the end of 1998, he spent 4 years at Hewlett-Packard’s research laboratories in Palo Alto, California. He received the B.S. and B.E. (Electrical) degrees in 1986 and 1988 from the University of Sydney, Australia, and the M.S. and Ph.D. degrees in 1992 and 1994 from the University of California at Berkeley. He has contributed extensively to the JPEG2000 standard for image compression and the JPIP standard for interactive image communication, and continues to contribute to these technologies. He is the author, with Michael Marcellin, of the book “JPEG2000: Image Compression Fundamentals, Standards and Practice” and the author of the popular Kakadu software for JPEG2000 developers. He is the recipient of two IEEE Best Paper awards: for the 1996 paper “A Common Framework for Rate and Distortion Based Scaling of Highly Scalable Compressed Video” and for the 2000 paper “High Performance Scalable Image Compression with EBCOT”. His research interests include scalable image and video compression, robust communication of scalable media over unreliable channels, interactive multimedia communication, perceptual modeling of video, and statistical inverse problems in imaging.