Jon Barron

I am a staff research scientist at Google Research, where I work on computer vision and computational photography. At Google I've worked on Lens Blur, HDR+, Jump, Portrait Mode, and Glass.

I did my PhD at UC Berkeley, where I was advised by Jitendra Malik and funded by the NSF GRFP. I've spent time at Google[x], MIT CSAIL, Captricity, NASA Ames, Google NYC, the NYU MRL, and Novartis, and I did my bachelor's at the University of Toronto.

Email  /  CV  /  Biography  /  Google Scholar  /  LinkedIn


I'm interested in computer vision, machine learning, statistics, optimization, image processing, virtual reality, and computational photography. Much of my research is about inferring the physical world (shape, depth, motion, paint, light, color, etc.) from images. I have also worked in astronomy and biology. Representative papers are highlighted.

Unprocessing Images for Learned Raw Denoising
Tim Brooks, Ben Mildenhall, Tianfan Xue, Jiawen Chen, Dillon Sharlet, Jonathan T. Barron
arXiv Preprint, 2018
project page

We can learn a better denoising model by processing and unprocessing images the same way a camera does.

Learning to Synthesize Motion Blur
Tim Brooks, Jonathan T. Barron
arXiv Preprint, 2018
project page / video

Frame interpolation techniques can be used to train a network to directly synthesize linear motion blur.

A General and Adaptive Robust Loss Function
Jonathan T. Barron
arXiv Preprint, 2018

A single robust loss function is a superset of many other common robust loss functions, and allows training to automatically adapt the robustness of its own loss.
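The general form of this loss can be sketched in a few lines. The function and parameter names below are illustrative; the key idea is a single family, with scale c and a shape parameter alpha whose special cases recover several classic robust losses:

```python
import numpy as np

def general_robust_loss(x, alpha, c):
    """A sketch of the general robust loss family.

    alpha controls robustness (alpha = 2 gives a scaled L2 loss,
    alpha = 1 gives a Charbonnier/pseudo-Huber-like loss, alpha = 0
    gives a Cauchy/Lorentzian loss); c is a scale parameter.
    """
    sq = (x / c) ** 2
    # The removable singularities at alpha = 2 and alpha = 0 are
    # handled explicitly.
    if alpha == 2:
        return 0.5 * sq
    if alpha == 0:
        return np.log1p(0.5 * sq)
    b = abs(alpha - 2)
    return (b / alpha) * ((sq / b + 1) ** (alpha / 2) - 1)
```

Because the loss is differentiable in alpha, a network can treat robustness as a learnable parameter rather than a fixed hyperparameter.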

Depth from Motion for Smartphone AR
Julien Valentin, Adarsh Kowdle, Jonathan T. Barron, Neal Wadhwa, and others
SIGGRAPH Asia, 2018

Depth cues from camera motion allow for real-time occlusion effects in augmented reality applications.

Synthetic Depth-of-Field with a Single-Camera Mobile Phone
Neal Wadhwa, Rahul Garg, David E. Jacobs, Bryan E. Feldman, Nori Kanazawa, Robert Carroll, Yair Movshovitz-Attias, Jonathan T. Barron, Yael Pritch, Marc Levoy
SIGGRAPH, 2018
arXiv / blog post / bibtex

Dual pixel cameras and semantic segmentation algorithms can be used for shallow depth of field effects.

This system is the basis for "Portrait Mode" on the Google Pixel 2 smartphones.

Aperture Supervision for Monocular Depth Estimation
Pratul P. Srinivasan, Rahul Garg, Neal Wadhwa, Ren Ng, Jonathan T. Barron
Computer Vision and Pattern Recognition (CVPR), 2018
code / bibtex

Varying a camera's aperture provides a supervisory signal that can teach a neural network to do monocular depth estimation.

Burst Denoising with Kernel Prediction Networks
Ben Mildenhall, Jonathan T. Barron, Jiawen Chen, Dillon Sharlet, Ren Ng, Robert Carroll
Computer Vision and Pattern Recognition (CVPR), 2018   (Spotlight)
supplement / code / bibtex

We train a network to predict linear kernels that denoise noisy bursts from cellphone cameras.
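The merging step can be illustrated with a deliberately simplified toy (per-pixel 1x1 kernels rather than the spatial kernels a real kernel prediction network outputs; all names here are illustrative):

```python
import numpy as np

def merge_burst(burst, weights):
    """Toy sketch of kernel-predicted burst merging.

    burst:   (N, H, W) aligned noisy frames.
    weights: (N, H, W) per-pixel, per-frame linear weights, as a network
             might predict. Weights are normalized across frames, then
             each output pixel is a weighted average over the burst.
    """
    weights = weights / np.sum(weights, axis=0, keepdims=True)
    return np.sum(burst * weights, axis=0)
```

With uniform weights this reduces to simple frame averaging; the point of learning the weights (and, in the full method, spatial kernels) is to downweight misaligned or unreliable frames per pixel.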

A Hardware-Friendly Bilateral Solver for Real-Time Virtual Reality Video
Amrita Mazumdar, Armin Alaghi, Jonathan T. Barron, David Gallup, Luis Ceze, Mark Oskin, Steven M. Seitz
High-Performance Graphics (HPG), 2017
project page

A reformulation of the bilateral solver can be implemented efficiently on GPUs and FPGAs.

Deep Bilateral Learning for Real-Time Image Enhancement
Michaël Gharbi, Jiawen Chen, Jonathan T. Barron, Samuel W. Hasinoff, Frédo Durand
SIGGRAPH, 2017
project page / video / bibtex / press

By training a deep network in bilateral space we can learn a model for high-resolution and real-time image enhancement.

Fast Fourier Color Constancy
Jonathan T. Barron, Yun-Ta Tsai
Computer Vision and Pattern Recognition (CVPR), 2017
supplement / video / bibtex / code / output / blog post / press

Color space can be aliased, allowing white balance models to be learned and evaluated in the frequency domain. This improves accuracy by 13-20% and speed by 250-3000x.
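The core trick can be illustrated in isolation: once the chroma histogram is aliased onto a torus, applying a learned filter is a circular convolution, which the FFT computes in O(n log n). This is a minimal sketch, not the full pipeline:

```python
import numpy as np

def filter_histogram(hist, filt):
    """Circular convolution of a toroidally aliased log-chroma histogram
    with a learned filter, evaluated in the frequency domain.

    hist, filt: 2D arrays of the same shape.
    """
    return np.real(np.fft.ifft2(np.fft.fft2(hist) * np.fft.fft2(filt)))
```

In the full method the argmax (or soft-argmax) of the filtered histogram on the torus yields the estimated illuminant chroma.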

This technology is used by Google Photos and Google Maps.

Jump: Virtual Reality Video
Robert Anderson, David Gallup, Jonathan T. Barron, Janne Kontkanen, Noah Snavely, Carlos Hernández, Sameer Agarwal, Steven M Seitz
SIGGRAPH Asia, 2016
supplement / video / bibtex / blog post

Using computer vision and a ring of cameras, we can make video for virtual reality headsets that is both stereo and 360°.

This technology is used by Jump.

Burst Photography for High Dynamic Range and Low-Light Imaging on Mobile Cameras
Samuel W. Hasinoff, Dillon Sharlet, Ryan Geiss, Andrew Adams, Jonathan T. Barron, Florian Kainz, Jiawen Chen, Marc Levoy
SIGGRAPH Asia, 2016
project page / supplement / bibtex

Mobile phones can take beautiful photographs in low-light or high dynamic range environments by aligning and merging a burst of images.

This technology is used by the Nexus HDR+ feature.

The Fast Bilateral Solver
Jonathan T. Barron, Ben Poole
European Conference on Computer Vision (ECCV), 2016   (Best Paper Honorable Mention)
arXiv / supplement / bibtex / video (they messed up my slides, use →) / keynote (or PDF) / code / depth super-res results / reviews

Our solver smooths things better than other filters and faster than other optimization algorithms, and you can backprop through it.

Geometric Calibration for Mobile, Stereo, Autofocus Cameras
Stephen DiVerdi, Jonathan T. Barron
Winter Conference on Applications of Computer Vision (WACV), 2016

Standard techniques for stereo calibration don't work for cheap mobile cameras.

Semantic Image Segmentation with Task-Specific Edge Detection Using CNNs and a Discriminatively Trained Domain Transform
Liang-Chieh Chen, Jonathan T. Barron, George Papandreou, Kevin Murphy, Alan L. Yuille
Computer Vision and Pattern Recognition (CVPR), 2016
bibtex / project page / code

By integrating an edge-aware filter into a convolutional neural network we can learn an edge-detector while improving semantic segmentation.

Convolutional Color Constancy
Jonathan T. Barron
International Conference on Computer Vision (ICCV), 2015
supplement / bibtex / video (or mp4)

By framing white balance as a chroma localization task we can discriminatively learn a color constancy model that beats the state-of-the-art by 40%.

Scene Intrinsics and Depth from a Single Image
Evan Shelhamer, Jonathan T. Barron, Trevor Darrell
International Conference on Computer Vision (ICCV) Workshop, 2015

The monocular depth estimates produced by fully convolutional networks can be used to inform intrinsic image estimation.

Fast Bilateral-Space Stereo for Synthetic Defocus
Jonathan T. Barron, Andrew Adams, YiChang Shih, Carlos Hernández
Computer Vision and Pattern Recognition (CVPR), 2015   (Oral Presentation)
abstract / supplement / bibtex / talk / keynote (or PDF)

By embedding a stereo optimization problem in "bilateral-space" we can very quickly solve for an edge-aware depth map, letting us render beautiful depth-of-field effects.

This technology is used by the Google Camera "Lens Blur" feature.


Multiscale Combinatorial Grouping for Image Segmentation and Object Proposal Generation
Jordi Pont-Tuset, Pablo Arbeláez, Jonathan T. Barron, Ferran Marqués, Jitendra Malik
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2017
project page / bibtex / fast eigenvector code

We produce state-of-the-art contours, regions and object candidates, and we compute normalized-cuts eigenvectors 20× faster.

This paper subsumes our CVPR 2014 paper.

Shape, Illumination, and Reflectance from Shading
Jonathan T. Barron, Jitendra Malik
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2015
supplement / bibtex / keynote (or powerpoint, PDF) / video / code & data / rant / kudos

We present SIRFS, which can estimate shape, chromatic illumination, reflectance, and shading from a single image of a masked object.

This paper subsumes our CVPR 2011, CVPR 2012, and ECCV 2012 papers.


Multiscale Combinatorial Grouping
Pablo Arbeláez, Jordi Pont-Tuset, Jonathan T. Barron, Ferran Marqués, Jitendra Malik
Computer Vision and Pattern Recognition (CVPR), 2014
project page / bibtex

This paper is subsumed by our journal paper.

Volumetric Semantic Segmentation using Pyramid Context Features
Jonathan T. Barron, Pablo Arbeláez, Soile V. E. Keränen, Mark D. Biggin, David W. Knowles, Jitendra Malik
International Conference on Computer Vision (ICCV), 2013
supplement / poster / bibtex / video 1 (or mp4) / video 2 (or mp4) / code & data

We present a technique for efficient per-voxel linear classification, which enables accurate and fast semantic segmentation of volumetric Drosophila imagery.


3D Self-Portraits
Hao Li, Etienne Vouga, Anton Gudym, Linjie Luo, Jonathan T. Barron, Gleb Gusev
SIGGRAPH Asia, 2013
video / bibtex

Our system allows users to create textured 3D models of themselves in arbitrary poses using only a single 3D sensor.

Intrinsic Scene Properties from a Single RGB-D Image
Jonathan T. Barron, Jitendra Malik
Computer Vision and Pattern Recognition (CVPR), 2013   (Oral Presentation)
supplement / bibtex / talk / keynote (or powerpoint, PDF) / code & data

By embedding mixtures of shapes & lights into a soft segmentation of an image, and by leveraging the output of the Kinect, we can extend SIRFS to scenes.

TPAMI journal version / bibtex


Boundary Cues for 3D Object Shape Recovery
Kevin Karsch, Zicheng Liao, Jason Rock, Jonathan T. Barron, Derek Hoiem
Computer Vision and Pattern Recognition (CVPR), 2013
supplement / bibtex

Boundary cues (like occlusions and folds) can be used for shape reconstruction, which improves object recognition for humans and computers.

Color Constancy, Intrinsic Images, and Shape Estimation
Jonathan T. Barron, Jitendra Malik
European Conference on Computer Vision (ECCV), 2012
supplement / bibtex / poster / video

This paper is subsumed by SIRFS.

Shape, Albedo, and Illumination from a Single Image of an Unknown Object
Jonathan T. Barron, Jitendra Malik
Computer Vision and Pattern Recognition (CVPR), 2012
supplement / bibtex / poster

This paper is subsumed by SIRFS.


A Category-Level 3-D Object Dataset: Putting the Kinect to Work
Allison Janoch, Sergey Karayev, Yangqing Jia, Jonathan T. Barron, Mario Fritz, Kate Saenko, Trevor Darrell
International Conference on Computer Vision (ICCV) 3DRR Workshop, 2011
bibtex / "smoothing" code

We present a large RGB-D dataset of indoor scenes and investigate ways to improve object detection using depth information.


High-Frequency Shape and Albedo from Shading using Natural Image Statistics
Jonathan T. Barron, Jitendra Malik
Computer Vision and Pattern Recognition (CVPR), 2011

This paper is subsumed by SIRFS.


Discovering Efficiency in Coarse-To-Fine Texture Classification
Jonathan T. Barron, Jitendra Malik
Technical Report, 2010

We introduce a model and feature representation for joint texture classification and segmentation that learns how to classify accurately and when to classify efficiently. This allows for sub-linear coarse-to-fine classification.


Blind Date: Using Proper Motions to Determine the Ages of Historical Images
Jonathan T. Barron, David W. Hogg, Dustin Lang, Sam Roweis
The Astronomical Journal, 136, 2008

Using the relative motions of stars we can accurately estimate the date of origin of historical astronomical images.
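The underlying idea admits a tiny toy illustration (a 1-D least-squares setup with illustrative names; the paper's actual pipeline is far richer): each star's observed position is its catalog position plus proper motion times elapsed time, so the image epoch falls out of a least-squares fit.

```python
import numpy as np

def estimate_epoch(observed, catalog_pos, proper_motion):
    """Find the time offset t (relative to the catalog epoch) minimizing
    sum ||observed - (catalog_pos + t * proper_motion)||^2.

    observed, catalog_pos, proper_motion: arrays of matching shape,
    e.g. (num_stars, 2) positions and per-star motion rates.
    """
    r = observed - catalog_pos
    # Closed-form solution of the 1-D least-squares problem in t.
    return np.sum(r * proper_motion) / np.sum(proper_motion ** 2)
```

High-proper-motion stars dominate the estimate, which matches the intuition that fast-moving stars are the most informative clocks.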


Cleaning the USNO-B Catalog Through Automatic Detection of Optical Artifacts
Jonathan T. Barron, Christopher Stumm, David W. Hogg, Dustin Lang, Sam Roweis
The Astronomical Journal, 135, 2008

We use computer vision techniques to identify and remove diffraction spikes and reflection halos in the USNO-B Catalog.


Course Projects

Parallelizing Reinforcement Learning
Jonathan T. Barron, Dave Golland, Nicholas J. Hay, 2009

Markov Decision Problems which lie in a low-dimensional latent space can be decomposed, allowing modified RL algorithms to run orders of magnitude faster in parallel.


Teaching

CS188 - Fall 2010 (GSI)

CS188 - Spring 2011 (GSI)

Feel free to steal this website's source code; just add a link back to my website. Send me an email when you're done and I'll link to your new page from here: