NJIT's Basu Roy Integrates Humans and Computers to Optimize Tasks and Learn Facts
In the early days of the SARS-CoV-2 pandemic, researchers scrambled to decipher the novel virus — its transmission pathways, its effects on the body, its vulnerabilities. Senjuti Basu Roy, a computer scientist, wondered in turn how lay people absorbed the reams of emerging information they received from social media, separating fact from fiction.
“I was struck by the amount of information — and disinformation — that was propagated over platforms such as Facebook and Twitter, so I decided to study the conditions in which people best learned these potentially life-saving facts,” says Basu Roy, who develops computational frameworks that integrate humans and machines to optimize tasks. Her recent National Science Foundation CAREER award focuses on methods to crowdsource gig jobs, and she seized on a timely application.
She recruited a virtual community through the platform Mechanical Turk and divided it into three cohorts: workers with varying degrees of knowledge about the disease; a control group of professionals with relevant expertise, such as nurses and emergency medical technicians; and a randomly assembled control group. Participants each received 10 questions — how the disease spreads and how many days to quarantine, for example — with two days to answer them. Members of the cohorts were required to interact with one another at least three times, and they earned bonuses for going beyond that.
“I focused on the first group of people with varying degrees of knowledge, because I’m interested in how people learn things from their peers in these settings and become more skilled. I wanted to see how they make conversation and solicit, share and process that information to get the job done,” she recounts.
While the groups were leaderless, highly skilled people drove the conversation. But all members appeared motivated to learn more and receptive to updated information. Participants provided evidence for their answers, summations of group discussions and, after the test, reports on what they learned. In some cases, their knowledge went beyond the parameters: One reported learning about the history of the pandemic, including when and how information about its ferocity first emerged from Wuhan.
Her takeaway? “The ideal group for this sort of crowdsourced worker training is one that mixes people with different skill levels in small, compatible communities. Together, they are able to complete a task — in this case successfully answer my questions — while improving their skills.”
More generally, she is investigating ways to train machines to better specify task and worker criteria and then to organize and distribute the work in the most efficient and economical way. To date, this is a painstaking, manual process, as crowdsourcing platforms offer little guidance.
“A publishing house, for example, needs translators to produce a brochure in Arabic, quickly and inexpensively, at 80% of the proficiency of a domain expert. But it’s on the requester to post the job successfully. The platform only finds you workers — it doesn’t tell you how many you need or how to organize them to complete the task efficiently.”
Basu Roy developed what she calls a “middleware system” — a middle layer that sits between multiple stakeholders across platforms in a crowdsourcing ecosystem — that helps requesters set goals and constraints. These include parameters such as thresholds on quality, latency and cost, and an analysis of the workforce that is available to undertake the task. “The algorithm would recommend the number of translators to hire and how they should collaborate, simultaneously or sequentially, for example, and how the work should be broken up,” she explains. “As it learns from job to job, it will refine this guidance while also assessing the availability of workers and their skills.”
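The kind of recommendation such a middleware layer makes can be sketched as a small constrained search. The model below is purely illustrative — it is not Basu Roy's actual algorithm — and assumes each worker independently produces an acceptable result with some fixed probability, so a parallel team's chance of success grows with its size until the budget runs out:

```python
def recommend_team_size(worker_quality: float,
                        target_quality: float,
                        cost_per_worker: float,
                        budget: float):
    """Return the smallest parallel team whose best-of-n output meets
    the quality threshold without exceeding the budget, or None if no
    feasible team exists.

    Illustrative assumption: each worker independently produces an
    acceptable result with probability `worker_quality`.
    """
    n = 1
    while n * cost_per_worker <= budget:
        # P(at least one acceptable result among n independent workers)
        if 1 - (1 - worker_quality) ** n >= target_quality:
            return n
        n += 1
    return None  # infeasible under this budget

# Hypothetical numbers: translators at 60% proficiency, a 95% chance
# of an acceptable brochure required, $20 per worker, $100 budget.
print(recommend_team_size(0.6, 0.95, 20, 100))  # → 4
```

A real system would also weigh latency, sequential versus simultaneous collaboration, and how to split the document among workers, and — as the quote notes — would refine these estimates as it learns from job to job.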
For speed and accuracy, many tasks require input from both humans and computers. Machines operate quickly and comprehensively, but fail on some knowledge-intensive tasks. People work slowly, but discriminate in ways machines cannot. Basu Roy, who explores optimization opportunities in human-in-the-loop systems, determines which tasks are best assigned to each and how to coordinate them in processes ranging from cleaning and labeling data — a key quality-control step in training artificial-intelligence systems — to deriving new metrics to improve these models.
Early on in her career, she worked with cardiologists to build machine-learning models to predict which patients were most likely to be readmitted to the hospital within 30 days of discharge. A key finding was that machines detected patterns, but humans came up with useful new metrics to improve the models. A computer measured both end-diastolic volume — the amount of blood in the left ventricle — and ejection fraction — the amount it pumps out with each contraction. But it took a doctor to see that the relationship between them was a strong indicator of heart function.
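The pattern the doctors supplied — combining two machine-measured values into one more informative feature — can be illustrated in a few lines. This is a toy sketch, not the study's actual metric; it uses the standard clinical relationship that stroke volume is end-diastolic volume times ejection fraction:

```python
def derived_features(edv_ml: float, ef_pct: float) -> dict:
    """Toy illustration of a human-suggested derived metric: combine
    two machine-measured values into a new model feature.
    (Hypothetical feature engineering, not the cardiologists' formula.)

    edv_ml: end-diastolic volume, mL (blood in the left ventricle)
    ef_pct: ejection fraction, percent (share pumped out per beat)
    """
    stroke_volume = edv_ml * ef_pct / 100  # mL pumped per contraction
    return {
        "edv_ml": edv_ml,
        "ef_pct": ef_pct,
        "stroke_volume_ml": stroke_volume,  # the derived feature
    }

print(derived_features(120, 55))  # stroke_volume_ml = 66.0
```

In a readmission model, such a derived column would simply be appended to the feature set the machine already extracts.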
“Crowdsourcing will be a leading component in the future of work with more and more gig jobs,” she says. “To do this thoughtfully, we must continue to assign humans higher levels of intellectual work.”