Four YWCC Data Science Faculty Present at Prestigious NeurIPS 2025
Assistant Professors Yingcong Li, Thanh Nguyen-Tang, Lingxiao Wang and Shuai Zhang, four recent additions to the Department of Data Science in the Ying Wu College of Computing (YWCC), had the honor of presenting their research papers at the 2025 Conference on Neural Information Processing Systems (NeurIPS), one of the three primary high-impact conferences in machine learning and artificial intelligence research.
With an acceptance rate of 24.5% across 21,575 submissions, having the work of four new faculty members, two with multiple accepted papers, appear at the conference at once is a notable achievement for the department and a testament to YWCC’s growing impact on the rapidly advancing fields of data science and artificial intelligence.
Professor Jim Geller, chair of the Department of Data Science, had this to say about how Li, Nguyen-Tang, Wang and Zhang have already contributed to the depth and breadth of the department’s educational offerings, as well as to technological breakthroughs that will benefit the institution and society:
“All four young faculty members have joined in the last three years, and two of them last year. We, at the Department of Data Science, are giving ourselves a pat on the back for recognizing the brilliance of these young researchers and bringing them on board at NJIT. Recently, NJIT has been climbing up in diverse rankings, and these outstanding new faculty members should help us to continue our ascent.”
Yingcong Li’s paper, “When and How Unlabeled Data Provably Improve In-Context Learning,” written in collaboration with colleagues from the University of Michigan, the University of California, Riverside and Bilkent University, builds on recent research showing that in-context learning (ICL) can be effective even when demonstrations have missing or incorrect labels. Extensive evaluations on real-world datasets show that the team’s method significantly improves semi-supervised tabular performance over standard single-pass inference.
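The contrast with single-pass inference can be made concrete with a generic pseudo-labeling loop, in which a model’s confident predictions on unlabeled rows are folded back into the prompt as demonstrations over several passes. This is only an illustrative sketch: the `llm_classify` helper and the 0.9 confidence cutoff are hypothetical, not taken from the paper.

```python
def multi_pass_icl(llm_classify, labeled, unlabeled, passes=3):
    """Iteratively pseudo-label unlabeled rows and reuse them as
    in-context demonstrations.

    llm_classify(demos, x) is a hypothetical helper that prompts a
    language model with demonstrations `demos` and returns a
    (label, confidence) pair for row x.
    """
    demos = list(labeled)                      # (row, label) pairs
    for _ in range(passes):
        scored = [(x, *llm_classify(demos, x)) for x in unlabeled]
        # Keep only confident pseudo-labels as extra demonstrations.
        confident = [(x, y) for x, y, conf in scored if conf > 0.9]
        demos = list(labeled) + confident
    return demos
```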
Her second paper, “BREAD: Branched Rollouts from Expert Anchors Bridge SFT & RL for Reasoning,” with colleagues from the University of Michigan, investigates the fundamental limitations of the standard supervised fine-tuning (SFT) plus reinforcement learning (RL) training paradigm and proposes methods to overcome them.
Research by Thanh Nguyen-Tang and colleagues from Washington State University, the University of Minnesota and Oregon State University, “Online Optimization for Offline Safe Reinforcement Learning,” studies the problem of offline safe reinforcement learning (OSRL), in which the goal is to learn a reward-maximizing policy from a fixed dataset under a cumulative cost constraint.
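In its generic form (standard constrained-RL notation, not necessarily the paper’s), the OSRL objective is

\[
\max_{\pi} \ \mathbb{E}_{\pi}\Big[\sum_{t=0}^{T} r(s_t, a_t)\Big]
\quad \text{subject to} \quad
\mathbb{E}_{\pi}\Big[\sum_{t=0}^{T} c(s_t, a_t)\Big] \le \kappa,
\]

where \(r\) is the reward, \(c\) the cost, \(\kappa\) the allowed cost budget, and both expectations must be estimated from the fixed offline dataset rather than from fresh interaction.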
“Revisiting Consensus Error: A Fine-Grained Analysis of Local SGD under Second-Order Heterogeneity” by Lingxiao Wang and collaborators from Yale University and the CISPA Helmholtz Center for Information Security confirms the conjecture that a second-order heterogeneity assumption may suffice to justify the empirical gains of local SGD, establishing new upper and lower bounds on its convergence.
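For readers unfamiliar with the algorithm under analysis, local SGD has each worker take several gradient steps before iterates are averaged; the spread among workers just before averaging is the consensus error of the paper’s title. Below is a minimal sketch on a hypothetical toy quadratic problem (for illustration only, not the paper’s experimental setup), where heterogeneity comes from each worker having a different optimum:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy heterogeneous problem: worker m minimizes
# f_m(x) = 0.5 * (x - b_m)^2, so its gradient is x - b_m.
num_workers, local_steps, rounds, lr = 4, 8, 50, 0.1
b = rng.normal(size=num_workers)      # per-worker optima (heterogeneity)
x = np.zeros(num_workers)             # each worker's local iterate

for _ in range(rounds):
    for _ in range(local_steps):      # K local stochastic gradient steps
        noise = 0.01 * rng.normal(size=num_workers)
        x -= lr * ((x - b) + noise)
    consensus_error = x.std()         # spread just before communication
    x[:] = x.mean()                   # periodic averaging across workers

print(f"final iterate {x.mean():.3f} vs. global optimum {b.mean():.3f}")
```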
Shuai Zhang contributed four papers to NeurIPS 2025:
“Contrastive Learning with Data Misalignment: Feature Purity, Training Dynamics, and Theoretical Generalization Guarantees” (collaborators: NJIT, University of Louisiana at Lafayette, Rensselaer Polytechnic Institute) presents the first theoretical analysis of contrastive learning in the presence of misaligned modality pairs.
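In the standard formulation, such methods train on paired embeddings \((u_i, v_i)\) with an InfoNCE-style objective (generic notation, not the paper’s):

\[
\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}
\log \frac{\exp(u_i^{\top} v_i / \tau)}
{\sum_{j=1}^{N} \exp(u_i^{\top} v_j / \tau)},
\]

and misalignment means that some pairs treated as positives do not in fact correspond, which is the failure mode the analysis addresses.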
“Self-Training with Dynamic Weighting for Robust Gradual Domain Adaptation” (collaborators: University of Electronic Science and Technology of China, Southwest Jiaotong University, Shanghai University, Tsinghua University) proposes self-training with dynamic weighting (STDW), a new method designed to improve robustness and stability in gradual domain adaptation (GDA).
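Gradual domain adaptation self-trains along a sequence of intermediate domains between source and target. The sketch below uses a simple confidence-based sample weighting as a stand-in; the actual dynamic-weighting scheme in STDW differs, and the data setup is hypothetical:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_domain(shift, n=200):
    """Hypothetical two-class data whose means drift with `shift`."""
    X = np.vstack([rng.normal(shift, 1.0, (n, 2)),
                   rng.normal(shift + 2.0, 1.0, (n, 2))])
    return X, np.array([0] * n + [1] * n)

X_src, y_src = make_domain(0.0)                    # labeled source
intermediates = [make_domain(s)[0] for s in (0.5, 1.0, 1.5)]

model = LogisticRegression().fit(X_src, y_src)
for X in intermediates:                            # source -> target path
    proba = model.predict_proba(X)
    pseudo = proba.argmax(axis=1)                  # hard pseudo-labels
    weights = proba.max(axis=1)                    # confidence as weight
    model = LogisticRegression().fit(X, pseudo, sample_weight=weights)
```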
“On the Training Dynamics of Contrastive Learning with Imbalanced Feature Distributions: A Theoretical Study of Feature Learning” (collaborators: NJIT, University of Louisiana at Lafayette, Rensselaer Polytechnic Institute) develops a theoretical framework to analyze the training dynamics of contrastive learning with Transformer-based encoders in the presence of data imbalance.
“Theoretical Analysis of the Selection Mechanism in Mamba: Training Dynamics and Generalization” (collaborators: NJIT, University of Louisiana at Lafayette, Rensselaer Polytechnic Institute) provides a first-step theoretical analysis of the selection mechanism in Mamba.
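Schematically, the selection mechanism makes the state-space parameters functions of the current input rather than fixed matrices (a simplified generic statement of the recurrence, not the paper’s notation):

\[
h_t = \exp\!\big(\Delta(x_t)\,A\big)\,h_{t-1} + \Delta(x_t)\,B(x_t)\,x_t,
\qquad
y_t = C(x_t)\,h_t,
\]

so the step size \(\Delta\) and projections \(B, C\) depend on \(x_t\), letting the model selectively retain or discard inputs as it scans a sequence.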