Data Science Expert Bader Looks to Fed Funding for Info Analysis
Data science has reached a point where techniques such as deep learning can beat humans at recognizing objects, although experts are still figuring out how to make explainable predictions from massive data, NJIT distinguished professor David Bader said.
Bader leads the university's Institute for Data Science in collaboration with the Ying Wu College of Computing, Newark College of Engineering, Martin Tuchman School of Management, and College of Science and Liberal Arts. He also advises the National Strategic Computing Initiative, founded in 2015 under President Obama through the White House's Office of Science and Technology Policy, which released a report today reaffirming past goals and emphasizing the need for federal funding of new hardware types; partnerships between academia, government, and industry; and creative approaches to data science and cybersecurity.
"In 2021 the U.S. plans to achieve exascale and we have to start planning now as to how we're going to address global grand challenges post-exascale," Bader noted.
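Today's largest data sets are measured in terabytes (1,000 gigabytes) or even petabytes (1,000 terabytes). Exascale refers to the next step up: an exabyte is 1,000 petabytes, or 1 billion gigabytes, and an exascale computer performs at least a billion billion (10^18) calculations per second.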
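The unit arithmetic above can be checked in a few lines of Python. This is a minimal sketch that assumes decimal (SI) units, where each step up is a factor of 1,000; the `UNITS` list and the `convert` helper are illustrative names, not part of any library.

```python
# Decimal (SI) storage units, each 1,000 times the previous one.
# UNITS and convert() are hypothetical names used only for illustration.
UNITS = ["gigabyte", "terabyte", "petabyte", "exabyte"]


def convert(value, from_unit, to_unit):
    """Convert a size between decimal units using powers of 1,000."""
    # Positive when converting to a smaller unit, negative to a larger one.
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * 1000 ** steps


# 1,000 petabytes (one exabyte) expressed in smaller units.
print(convert(1, "exabyte", "petabyte"))
print(convert(1, "exabyte", "gigabyte"))
```

Under these assumptions, one exabyte works out to 1,000 petabytes and 1 billion gigabytes, matching the figures in the text.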
"The landscape of applications in computing and data science for real-world grand challenges has changed since the original report in 2015," Bader said. "Data now incorporates every sector and aspect of life. ... Companies are now built around algorithms and the data that drives those. That's a fundamental shift from where high-performance computing was a number of years ago more narrowly focused on scientific computation for uses in weather prediction, manufacturing, and energy."
"What's next is to understand the new opportunities that come through support from the federal agencies, for instance National Science Foundation, National Institutes of Health, Department of Energy, among others, where we collaborate in partnership with industry and government within strategic priorities," he continued.
It's important for people to understand what data science has accomplished, but also to understand where it needs to be better, Bader acknowledged. The technology is good at sorting large amounts of information for humans to analyze. "Where we are still learning is how to better protect the nation from cyberattack, and how to better discover rare patterns in data sets that may have huge ramifications to organizations," he added.
Bader said a milestone will be when data science can suggest answers to questions we didn't yet know to ask. He calls it a Carnac moment, in honor of Johnny Carson's Carnac the Magnificent character, who comically provided answers before questions were asked.
It's a seriously difficult challenge, even for the top experts and top computing systems available today. Bader posed several questions now being asked: "How do we let data speak to us and be able to help us inform how to ask the right questions, and what is the power that we get out of that? What is the power of putting data sets together with each other and what is the possibility of innovation we can have across many sectors? How do we put data together to make our lives better, safer, and more interesting?" he said.
Forthcoming research opportunities for the NJIT Institute for Data Science and the data science community generally may lead to answers.
For more information or interview requests, please contact Tanya Klein at tanya.m.klein@njit.edu or (973) 596-3433.