Meet the Researcher: Raja Bala talks about computer vision, deep learning and explainable technology
A career spent at the forefront of scientific exploration can bring with it many accolades and achievements. Raja Bala, Principal Scientist in PARC’s computer vision competency, is proof of that. Alongside holding a Ph.D. from Purdue University and a fellowship at IS&T, Raja is among the top patent holders at PARC (a competitive field).
Today, Raja’s work is driving our research into digital imaging, computer vision and machine learning. Raja joins us to discuss his background, career highlights, current work and the exciting future he sees for computer vision.
Raja, tell us about the beginning of your career and how you came to work for PARC.
My education was in the field of electrical engineering leading to specialization in digital image processing, at Purdue University, where I got my Ph.D. I explored the science, mathematics and – if you will – the art of digital images, and how they are represented, processed and enhanced.
After university, I joined Xerox in 1993 as a color imaging scientist. At that time Xerox, along with the whole graphic arts industry, was making the transition from black-and-white to color publishing and printing. I led several exciting projects developing color management solutions for Xerox’s flagship printers and scanners. The experience I gained in that period continues to be useful in my work today.
In 2011, I transitioned into the field of computer vision, which enables machines to analyze, interpret, and extract useful information from images and video. I developed techniques for analyzing images and video from mobile and wearable devices, working briefly in an annex of PARC in Rochester, NY.
But this wasn’t your only stint with Xerox or PARC, right?
No. After 22 years with Xerox I went to Samsung to work in their smartphone camera imaging group, specifically on the Galaxy models, which gave me valuable first-hand experience in developing mobile imaging solutions for a leading consumer smartphone product.
Then, around a year and a half ago, I came back to the Xerox fold, but this time with PARC. After a few months of being a part of a machine learning group, I started leading a computer vision team, where I am today.
What does your team do? And how do you make sure your work is different to other labs or research centers?
My team has broad expertise in computer vision, spanning image and video analytics. Everything from tracking skin health to moving cars to facial expressions to augmented reality. We also specialize in developing innovative mobile computer vision applications for smartphones and tablets, and analyzing images from varied sensors, such as depth and hyperspectral cameras.
Our work is different because of our holistic systems approach to computer vision, which optimizes the interplay between scene, camera, algorithms, human operator and business application.
Right now, a major research theme for us is artificial intelligence, deep learning, and its application to computer vision.
Why is deep learning in computer vision such a focus today?
Deep learning is a really effective way to extract useful patterns from images, which is often our goal. It works by training a neural network with lots of example images along with associated patterns or some underlying “ground truth” about the images. The network learns a set of connections and weights that enable it to identify the same type of pattern or truth from new images.
One of the challenges with deep learning is that it requires a huge number of training examples to work well. But in many practical applications we don’t have the luxury of collecting and annotating thousands of images. Also, deep learning generally doesn’t perform very well when asked to perform tasks outside of the specific training context.
So, I’m interested in developing new ways to bring prior knowledge about the task and environment into deep learning methods, making them less fragile and less dependent on specific training data. We are revisiting some of the first-principles models we used before deep learning became our standard way of working, and marrying these with deep learning techniques. So you have the power of deep learning, combined with smart constraints shaped by the physics, biology or geometry of the task.
We call it domain-aware deep learning. One example is in retinal diagnostics, where we train a computer vision system to take in a retinal scan and put out a map of the blood vessels in people’s eyes, to be used by clinicians for medical evaluation.
Traditional deep learning would require a million retinal scans and their vessel maps – a luxury we do not have in the medical community today. So we build in domain intelligence that provides clues and constraints to the network, telling it that it should be looking for thin curvy structures that branch out like a tree. This allows us to train the deep network with comparatively few images and yet outperform today’s best deep learning methods.
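To make the idea of "clues and constraints" concrete, here is a minimal sketch of how a shape prior could be added to a segmentation loss. It is an illustration only and not PARC's method: the function names, the 3x3 erosion penalty and the `prior_weight` parameter are all assumptions introduced for this example. The prior charges nothing for thin structures and penalizes thick blobs, which is one simple way to encode "look for thin structures". In practice, a term like this would be written in a deep learning framework so it can be differentiated during training; NumPy is used here only to keep the sketch self-contained.

```python
import numpy as np


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean per-pixel cross-entropy between predicted probabilities and a 0/1 mask."""
    p = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(p) + (1.0 - target) * np.log(1.0 - p)))


def thickness_penalty(pred):
    """Hypothetical shape prior: mean of a 3x3 min-filter (erosion) of the prediction.

    Structures one or two pixels wide vanish under a 3x3 erosion and cost nothing;
    thicker blobs survive the erosion and are penalized.
    """
    h, w = pred.shape
    padded = np.pad(pred, 1, mode="constant", constant_values=0.0)
    eroded = np.full((h, w), np.inf)
    for dy in range(3):
        for dx in range(3):
            eroded = np.minimum(eroded, padded[dy:dy + h, dx:dx + w])
    return float(np.mean(eroded))


def domain_aware_loss(pred, target, prior_weight=0.5):
    """Data term plus a weighted prior term (prior_weight is an assumed value)."""
    return binary_cross_entropy(pred, target) + prior_weight * thickness_penalty(pred)


# A thin horizontal "vessel" versus a thick square blob on a 7x7 grid.
thin_line = np.zeros((7, 7))
thin_line[3, :] = 1.0
thick_blob = np.zeros((7, 7))
thick_blob[2:5, 2:5] = 1.0
```

With these toy inputs, `thickness_penalty(thin_line)` is zero while `thickness_penalty(thick_blob)` is positive, so the combined loss steers a network toward thin, vessel-like outputs even when few labeled examples are available. A realistic prior would also need to capture curvature and tree-like branching, which this sketch does not attempt.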
Where can we see this at work today?
You can find this approach and technology applied in the health and beauty domain with the Procter & Gamble skincare brand, Olay. PARC’s computer vision technologies power the Olay Skin Advisor, a mobile platform that captures smartphone selfie images of consumers’ faces, ensures the images are of good quality, analyzes them for wrinkles, pores or imperfections, or the health of the skin in general, and delivers product recommendations. This solution wouldn’t work with deep learning alone, as we don’t have the data or computing power. But with our methodology it can – you can read a case study about this project.
What else will you be looking at?
We recognize that humans and computers are really good at different types of visual tasks. So, we’re looking at computer vision with humans in the loop. The idea is to develop methods where machine and human decisions complement and strengthen each other. This theme ties closely with other areas of research at PARC, like the AI and Human-Machine Collaboration Focus Area that looks at making machine learning and AI explainable and understandable to humans.
Before we let you go, tell us about the patents.
Yes. I have over 160. In the early days a lot of them were in digital color imaging. But now they span broader areas of image analysis and computer vision. Capturing IP has always been an important part of the innovation culture at Xerox and PARC. I have been fortunate to be in the company of many innovative colleagues, and the patents I am most proud of are truly collaborative efforts.