Safety of Artificial Intelligence Is Focus for CompSci and Applied Math Student
NJIT makes entrepreneurs and scientists, but junior Nidhi Sakpal is obsessed with something else — she makes AI safer.
Sakpal, an Albert Dorman Honors College member from Boonton double-majoring in applied math and computer science, explained that artificial intelligence safety encompasses the analysis, prevention and rectification of anything that causes AI systems to give users incorrect, harmful or unethical information.
If an AI system were laser-focused on efficiency and not focused enough on safety, Sakpal joked, then it would tell you the fastest way to get your grandmother out of a burning building is to eject her. “It doesn’t understand the constraints of the human world,” but simply passes off raw logic under the guise of reasoning, she said.
She shared advice for all users of large language models: “I think it’s really important that users understand how the model itself is outputting. I think it is really important that people are aware of it, especially now when the average person uses LLMs for their emotional needs, or needs where they would [turn] to a human.”
“AI capabilities have increased a lot to where it’s integrated into every single workflow, every single company,” Sakpal said. “Not a lot of people on the teams of these big LLM companies, or companies that work on creating really powerful AIs, work on the safety aspect of it to make sure that these models are aligned.”
As a teen, Sakpal envisioned a career applying technology for some good cause. But when mainstream artificial intelligence and its associated risks materialized during her high school years, she learned that the good cause can be technology itself.
Working toward that goal, she received a competitive AI safety fellowship at the research bootcamp Algoverse in summer 2025 and stayed on for two mentored cohorts. In the first, her team examined how language models can be hacked into allowing banned discussions by flooding the model’s input capacity, known as the context window, with extra-long prompts. In the second, they showed that models with larger context windows, measured in units called tokens, do not necessarily perform better than models with smaller capacity, and in some cases perform worse because they lose focus.
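The context window is easy to picture with a few lines of code. The sketch below is a simplified illustration, not anything from Sakpal’s paper: it uses the open-source tiktoken tokenizer to show how padding a short question with filler can fill, or overflow, context windows of different sizes. The window sizes and the filler text are hypothetical examples.

```python
# Illustrative sketch: measure how much of a model's context window a
# prompt consumes. The window sizes below are hypothetical examples.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # a common byte-pair encoding

def context_usage(prompt: str, window_tokens: int) -> float:
    """Return the fraction of a context window the prompt fills."""
    return len(enc.encode(prompt)) / window_tokens

# Pad a short question with filler, as a long-context stress test might.
question = "How do I reset my password?"
filler = "Here is some earlier conversation history. " * 5_000
prompt = filler + question

for window in (8_192, 128_000):  # hypothetical small vs. large windows
    print(f"{window:>7}-token window: {context_usage(prompt, window):.1%} full")
```

Anything past 100% gets truncated or rejected, and that boundary is precisely where long-context experiments like these probe for unstable behavior.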
To get into Algoverse, Sakpal had to not only apply with essays but also make it through a week-long intensive training program that fewer than half of the applicants completed. The research phase then began. Her work was included in a research paper, When Refusals Fail: Unstable Safety Mechanisms in Long-Context LLM Agents, presented at the Association for the Advancement of Artificial Intelligence conference in Singapore and also shared on the arXiv preprint server.
As her projects at Algoverse wound down, Sakpal wanted more. She was accepted into a similar program at BlueDot Impact, a nonprofit organization, where she earned a certificate in technical AI safety and now does research, too. She’ll also work as an AI software engineering intern at New York-based Ariel Partners this summer. The company specializes in AI software for sensitive fields such as government and healthcare.
Sakpal decided to share her knowledge with peers in the spring 2026 semester. She organized a workshop, AI Safety Fundamentals + AI on Campus, on behalf of the university’s Women In Computing Society and the OpenAI ChatGPT lab, but open to everyone. The workshop included her presentation about the fundamentals of AI safety, followed by a discussion among all attendees about how they use AI for school and life.
Specific examples of AI risks include hallucinations, where a model makes up information and presents it as fact; reward hacking, where a model games its training signal, for instance by becoming a yes-man that agrees even when you’re clearly wrong; and deceptive alignment, where a model appears to behave safely under training and evaluation while pursuing different goals. Bias is another, as when models trained on skewed data favor certain age ranges or ethnicities in employment predictions.
Sakpal said her favorite way to find or even solve such issues is a method called mechanistic interpretability: the art of reverse-engineering a model’s internal computations to understand how it arrived at its answer. “A lot of AI researchers argue that it’s not possible or not feasible,” but she feels the topic deserves further research. An AI can’t lie about its internal workings, she stated.
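In practice, the entry point to this kind of work is reading out a model’s internal activations. The sketch below illustrates the general technique rather than Sakpal’s research code: it uses PyTorch and the Hugging Face transformers library (both assumed installed) to attach a forward hook to one block of the open-source GPT-2 model and record the hidden states that block produces.

```python
# Illustrative sketch: capture internal activations from GPT-2 with a
# forward hook, the raw material for interpretability analyses.
import torch
from transformers import GPT2Tokenizer, GPT2Model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
model.eval()

captured = {}

def save_activation(name):
    def hook(module, inputs, output):
        # GPT-2 blocks return a tuple; the hidden states come first.
        captured[name] = output[0].detach()
    return hook

# Attach a hook to one transformer block, chosen arbitrarily here.
model.h[5].register_forward_hook(save_activation("block_5"))

inputs = tokenizer("The fastest way out of the building is", return_tensors="pt")
with torch.no_grad():
    model(**inputs)

acts = captured["block_5"]  # shape: (batch, tokens, hidden_dim)
print(acts.shape)           # e.g. torch.Size([1, 8, 768])
```

Tensors like this one are what interpretability researchers probe to map which internal features contribute to a given output.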
The safety challenge keeps growing, Sakpal said, because new AI features are developed and released much faster than new safety mechanisms. But she is determined to help in any way possible. “I’m pretty sure I want AI safety to be a part of my future career. So right now, I’m planning on applying for a master’s or M.S./Ph.D. program” and ultimately joining a research laboratory, she said.
Her parting advice for everyday users: “They should be very mindful about how much they let AI in their lives, and through which systems. Even though I work in CS, and I work for AI safety, I limit how much AI connects to my everyday apps and everything. And I feel like, if people understood how much data it uses from you, they would do the same thing. And so I feel it’s really important that they understand what the companies are using your data for, or what you’re doing when you’re connecting to these different applications.”