Computer Science and Data Science Tame AI at NJIT
Research from NJIT’s Ying Wu College of Computing was on full display at Artificial Intelligence Exploration Day, where faculty and dozens of students presented their timely work.
A recurring theme was the emphasis on how AI works: what we collectively understand, what we don’t and what remains mysterious.
Senjuti Basu Roy, associate professor of computer science, and her doctoral student Subhodeep Ghosh discussed two approaches to mitigating bias in large language models.
“The first approach is adversarial representation learning. Here, we integrate bias mitigation directly into the training process. … We achieve this by introducing an adversary that tries to predict sensitive attributes — such as gender — from the embedding space,” Basu Roy explained. “The embedding model and the adversary are trained in opposition: while the adversary learns to detect bias, the embedding model learns to hide it, especially for neutral words. This results in representations that preserve contextual semantics while minimizing the encoding of protected attributes. The key advantage of this approach is that fairness is built into the model from the ground up.”
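The adversarial setup Basu Roy describes can be sketched on a toy problem. The code below is a minimal illustration, not the authors’ implementation: a linear “embedding” is trained against a logistic-regression adversary that tries to predict a binary sensitive attribute, while a reconstruction term stands in for “preserving contextual semantics.” All variable names and hyperparameters are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
s = rng.integers(0, 2, n).astype(float)               # sensitive attribute (e.g. gender)
x = np.stack([rng.normal(0, 1, n),                    # neutral "content" feature
              (2*s - 1)*1.5 + rng.normal(0, .3, n)],  # feature that strongly leaks s
             axis=1)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-np.clip(t, -30, 30)))

def probe_accuracy(z, s, steps=300, lr=0.5):
    """Train a fresh logistic-regression adversary on z; return its accuracy."""
    a, b = np.zeros(z.shape[1]), 0.0
    for _ in range(steps):
        p = sigmoid(z @ a + b)
        a -= lr * z.T @ (p - s) / len(s)
        b -= lr * np.mean(p - s)
    return np.mean((sigmoid(z @ a + b) > 0.5) == (s > 0.5))

W = np.eye(2)                                 # linear stand-in for the embedding model
acc_before = probe_accuracy(x @ W, s)

# Minimax training: the adversary descends its cross-entropy loss, while the
# embedding ascends it (hiding s) and simultaneously descends a reconstruction
# loss ||xW - x||^2 that stands in for "preserve contextual semantics".
a, b = np.zeros(2), 0.0
lam, lr_w, lr_a = 3.0, 0.05, 0.3
for _ in range(2000):
    z = x @ W
    p = sigmoid(z @ a + b)
    a -= lr_a * z.T @ (p - s) / n             # adversary step: learn to detect s
    b -= lr_a * np.mean(p - s)
    g_rec = 2 * x.T @ (x @ W - x) / n         # pulls W toward the identity
    g_adv = x.T @ np.outer(p - s, a) / n      # dCE/dW via the chain rule
    W -= lr_w * (g_rec - lam * g_adv)         # descend reconstruction, ascend CE

acc_after = probe_accuracy(x @ W, s)          # s should be harder to recover if it worked
print(f"probe accuracy before: {acc_before:.2f}, after: {acc_after:.2f}")
```

A fresh probe is trained at the end rather than reusing the training-time adversary, since only a newly fitted classifier shows how much sensitive information actually remains in the embedding.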
The second approach in their talk, Bias In, Bias Out? Rethinking Fairness in LLM Embeddings, is a post-processing method. “We take outputs produced by LLMs and minimally modify them to ensure fairness. This approach is model-agnostic and easy to apply to existing systems,” Basu Roy stated.
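The “minimally modify existing outputs” idea can be illustrated with the classic projection trick: removing the component of each embedding along an estimated bias direction. This is a generic sketch of post-processing debiasing, not necessarily the method from the talk; the embeddings and `bias_dir` here are random stand-in data.

```python
import numpy as np

def project_out(emb, direction):
    """Remove the component of each embedding along `direction`.
    A classic post-hoc debiasing step: model-agnostic, applied after the fact."""
    d = direction / np.linalg.norm(direction)
    return emb - np.outer(emb @ d, d)

rng = np.random.default_rng(1)
emb = rng.normal(size=(5, 8))        # stand-in for LLM embeddings
bias_dir = rng.normal(size=8)        # stand-in for an estimated bias direction,
                                     # e.g. mean("he"-type) - mean("she"-type) vectors

debiased = project_out(emb, bias_dir)
# every debiased vector is now orthogonal to the bias direction
print(np.abs(debiased @ (bias_dir / np.linalg.norm(bias_dir))).max())
```

Because it only touches the outputs, this kind of correction can be bolted onto an already-deployed system without retraining, which is the advantage Basu Roy highlights.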
Her computer science colleague Keiran Murphy, an assistant professor who also teaches in the data science department, listens to AI differently. In AI Communicates through Representations. Can we listen in?, he explained how scientists try to understand what AI is doing by examining the underlying logic of its processes.
“Just trying to shed some light on how neural networks store information, how they organize it, what information is selected, how we can interpret why information is thrown out and which information is thrown out. All of that comes down to representations,” Murphy said. “The analysis of representations is really just like, how do we store the same something that can look very different on the input side, it can be stored then as a series of numbers that have no intelligibility to us, but can clearly be reconstructed back to the original thing perfectly?”
“The change with deep learning is that we are giving off the design of representations to this optimization process. That is so expressive that it can be looking through possibilities that before would have had to be handcrafted. So the design of data representations was in the human domain, until optimization was able to just take it from us,” he continued — but it’s important to consider what might be lost in translation. “I think that the very universal theme is just that any computing deals with changing the representation of data. Really, any science takes something that is external and then has to record it down in a representation of some sort, and that process then is often overlooked or not thought about to where we just take it for granted.”
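Murphy’s point that a representation can look unintelligible yet be perfectly reconstructable is familiar from ordinary lossless encodings. A trivial, non-neural Python analogy (not how networks store data, just the same invertibility idea):

```python
# A string becomes "a series of numbers that have no intelligibility to us" ...
text = "representation"
codes = list(text.encode("utf-8"))      # e.g. 114, 101, 112, ... (UTF-8 byte values)

# ... yet it "can clearly be reconstructed back to the original thing perfectly".
restored = bytes(codes).decode("utf-8")
print(codes, "->", restored)
```

The difference Murphy emphasizes is that learned representations are designed by an optimization process rather than by hand, and they may be lossy in ways we don’t choose or inspect.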
Student presentations covered equally insightful topics about AI in computer science and data science. These included John Celorio, The AI Contributor Problem, about the decreasing quality of academic journals when faced with AI-developed submissions; Bayan Divaaniaazar, AI-Assistants for Decision-Making via Algorithmic Rankers, regarding fairness in algorithms; Thomas Gammer and Abdul Shaik, Contextral - Data Slop, Privacy Collapse, and the Benefits of Local AI, considering how small language models might hold off so-called AI slop; and Mohith Oduru, Structured Perception for Intelligent Assistants, on the topic of giving AI some worldly context.
Amogh Dalal, Priyansh Patel and Adityasinh Rathod collaborated on Does it Add Up? Characterizing Mathematical Skills in AI models, providing insight into how AI models appear to draw on many different skills at once when evaluating increasingly complex problems.