Data for Training Models – How Much Data Do You Need? (panel)


October 25, 2019; Boston, MA


Raj Minhas

Deep learning networks have revolutionized the field of artificial intelligence in recent years. They have enabled rapid advances in diverse areas such as medical diagnosis, autonomous driving, financial forecasting, material design, and drug discovery. But these networks need very large quantities of training data (often millions of labeled data points) to achieve their high levels of prediction accuracy. In this panel, we will discuss work being done to reduce this data requirement by a few orders of magnitude. This includes building systems that emulate human cognitive processes (e.g., using visual cues to hasten language acquisition), networks that incorporate knowledge about physical laws (e.g., using symmetries and first principles to constrain the parameter space), and paradigms that combine symbolic and data-driven approaches (e.g., incorporating knowledge representation into network design).
  • What amount of data is really required?
  • The emergence of advanced analytics that require less data
  • Ways to reduce the data requirement by orders of magnitude
  • Small data
  • Synthetic data
Moderator: Ritu Jyoti, Program Vice President, Artificial Intelligence Strategies, IDC
Panelists:
  • Raj Minhas, PhD, Vice President, Director of Interaction and Analytics Laboratory, PARC
  • Karen Myers, PhD, Lab Director, SRI International's Artificial Intelligence Center
  • Lucas Show, Co-Founder, ProteinQure
