Smartphone Users Can Keep Privacy During Opt-In Behavior Research
Informatics department Assistant Professor Hai Phan is studying how to protect privacy in mobile applications that track a user's behavior.
Phan received a $100,000 grant from a major technology company for the project, Detecting Human Behaviors from Smartphones Using Federated Machine Learning in the Wild.
For example, he said, developers of applications that book taxi services or provide pizza delivery might want to use algorithms to determine optimal driver locations and times. To do so, the applications need to track device owners' movements and activities to learn when they are most likely to want a ride or a bite. However, the data might pass through many hands, such as the phone company and others, well beyond the control of the device owner.
"We give up our data privacy. Everything is about our activity, where we travel. ... We send the data to the service provider and we have no rights, no control over how they use our data," Phan observed. "If we have two companies, they want to jointly work on a machine-learning project, but they could not share the data because it's very sensitive. How could we solve that problem?"
"That is where federated machine learning comes in. The data will stay locally on our mobile devices. It'll go nowhere, but we'll still be able to train our machine-learning model for predictive tasks," Phan explained. He enlisted Cristian Borcea, who teaches computer science and is co-principal investigator, along with doctoral students Han Hu and Xiaopeng Jiang, to help develop the models and prototype. Borcea has presented keynotes on federated learning for mobile sensing data at IEEE conferences on intelligent and service-oriented systems engineering, and mobile data management.
"We have to ensure that, from the model parameters, no one can really infer back what happened in our data. ... The end goal is to have a complete and federated framework for multiple applications," Phan said.
The best way to do that, he said, is to perform machine learning on the device itself rather than on a cloud server controlled by third parties. The advantage is that users keep control: only aggregate results are sent to developers, and nothing private leaves the device.
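To make the idea concrete, here is a minimal sketch of federated averaging, a widely used form of federated learning. It is an illustration only, not the research team's code: the function names, the toy linear model and the simulated devices are all assumptions made for this example. Each simulated device trains on its own data and reports back only model parameters and a sample count, which a server then averages. The sketch also leaves out the further protections Phan describes for keeping anyone from inferring private data from the shared parameters.

```python
import random


def local_update(weights, data, lr=0.3, epochs=5):
    """Run a few gradient-descent steps on one device's private data.

    Only the updated parameters and the sample count are returned;
    the raw (x, y) pairs never leave this function.
    """
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x / len(data)
            grad_b += 2 * err / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b), len(data)


def federated_average(updates):
    """Server step: average parameters, weighted by each device's sample count."""
    total = sum(n for _, n in updates)
    w = sum(params[0] * n for params, n in updates) / total
    b = sum(params[1] * n for params, n in updates) / total
    return (w, b)


def train(devices, rounds=100):
    """Alternate on-device training and server-side averaging."""
    weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(weights, data) for data in devices]
        weights = federated_average(updates)
    return weights


def make_devices(num_devices=5, seed=0):
    """Hypothetical stand-in for real phones: noisy samples of y = 3x + 1."""
    rng = random.Random(seed)
    devices = []
    for _ in range(num_devices):
        xs = [rng.uniform(0.0, 1.0) for _ in range(rng.randint(10, 30))]
        devices.append([(x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.05)) for x in xs])
    return devices


if __name__ == "__main__":
    w, b = train(make_devices())
    print(f"learned w={w:.2f}, b={b:.2f}")
```

In this toy setup the server recovers a shared model close to the underlying trend even though it never sees an individual data point, only the parameters each device chooses to report.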
The grant funds a one-year pilot project, which Phan hopes will turn into a long-term collaboration. Beta tests were just beginning this spring when the COVID-19 pandemic hit. Phan was about to pause his research, which relies on large numbers of people going out and doing things.
Instead, he decided to continue the research in order to have points of comparison. It is not yet known exactly what data and insights may be gained, but the hope is that the results will be useful now or during some future pandemic.
Phan said he hopes to have initial observations and results within a few months.