First Author Manuscript in Proceedings of EMNLP 2025 | Supervisor: Dr. Ruoxuan Xiong | Camera-Ready Version
Developed a leakage-free pipeline with dual-graph fusion, Transformer temporal encoding, and representation balancing.
Integrated robust training objectives (relative-error, CVaR, and Tweedie losses) with quantile calibration and AR(1) correction for stable deployment.
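To make the robust-training objective above concrete, here is a minimal PyTorch sketch of a CVaR-over-relative-error loss; the tail level `alpha`, the epsilon floor, and the tensor shapes are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def cvar_relative_error_loss(pred, target, alpha=0.9, eps=1e-6):
    """CVaR_alpha of per-sample relative errors: mean of the worst (1 - alpha) tail."""
    rel_err = (pred - target).abs() / target.abs().clamp_min(eps)  # per-sample relative error
    k = max(1, int((1.0 - alpha) * rel_err.numel()))               # size of the worst tail
    tail, _ = torch.topk(rel_err, k)                               # largest relative errors
    return tail.mean()

# Illustrative usage on random tensors.
pred, target = torch.rand(128) * 10, torch.rand(128) * 10 + 1
print(cvar_relative_error_loss(pred, target).item())
```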
First Author Manuscript in Proceedings of SMM4H-HeaRD 2025 | Supervisor: Dr. Azra Ismail | Camera-Ready Version
Applied data-centric techniques including GPT-4–based augmentation, class-weighted loss, and ensemble inference.
Developed multi-label classification models for insomnia detection from MIMIC-III discharge summaries.
Implemented span-based and rule-based extraction of medical and public health entities.
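As a concrete illustration of the rule-based extraction above, a toy Python sketch follows; the lexicon, regex pattern, and entity labels are hypothetical placeholders rather than the system's actual rules.

```python
import re

# Hypothetical mini-lexicon and pattern; real entity lists and rules are task-specific.
SLEEP_TERMS = ["insomnia", "sleep disturbance", "difficulty sleeping"]
MEDICATION_PATTERN = re.compile(r"\b(zolpidem|trazodone|melatonin)\b", re.IGNORECASE)

def extract_entities(text):
    """Return (start, end, label, surface) spans found by dictionary and regex rules."""
    spans = []
    lowered = text.lower()
    for term in SLEEP_TERMS:
        for m in re.finditer(re.escape(term), lowered):
            spans.append((m.start(), m.end(), "SLEEP_ISSUE", text[m.start():m.end()]))
    for m in MEDICATION_PATTERN.finditer(text):
        spans.append((m.start(), m.end(), "MEDICATION", m.group()))
    return sorted(spans)

print(extract_entities("Patient reports insomnia; started trazodone 50 mg at bedtime."))
```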
First Author Manuscript Under Review at KDD 2026 | Supervisor: Dr. Ruoxuan Xiong
Designed a cross-modal attention mechanism with learnable missingness pattern encoding and temporal features to improve interaction representation.
Proposed sparse-category transfer learning via adaptive category embeddings and cross-category similarity for long-tail performance enhancement.
Incorporated missing-not-at-random (MNAR) debiasing through multi-modality selection prediction and inverse probability weighting.
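A compact sketch of the selection-prediction and inverse-probability-weighting idea above, on synthetic data; the logistic selection model, clipping threshold, and loss are illustrative simplifications (for brevity the toy selection depends only on observed covariates).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical setup: x is an always-observed modality, "observed" flags whether the
# second modality is present, and y_err is a per-sample loss correlated with selection.
x = rng.normal(size=(1000, 5))
p_true = 1 / (1 + np.exp(-x[:, 0]))                     # selection favours large x[:, 0]
observed = (rng.random(1000) < p_true).astype(int)
y_err = p_true + 0.1 * rng.random(1000)

# 1) Selection model: predict the probability that the modality is observed.
propensity = LogisticRegression().fit(x, observed).predict_proba(x)[:, 1].clip(0.05, 1.0)

# 2) Inverse probability weighting: reweight observed samples by 1 / p to debias the average.
mask = observed == 1
naive = y_err[mask].mean()
ipw = np.average(y_err[mask], weights=1.0 / propensity[mask])
print(f"population mean={y_err.mean():.3f}  naive={naive:.3f}  ipw={ipw:.3f}")
```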
Co-Author Manuscript Under Review at ICLR 2026 | Supervisor: Dr. Ida Momennejad | Supported by the Institute for Pure and Applied Mathematics at UCLA and Microsoft Research
Co-developed AlgoTrace, a framework linking reasoning traces of large language models (LLMs) with internal representational primitives for interpretable analysis of multi-step reasoning.
Investigated geometric composition of primitives through vector arithmetic, validating cross-task and cross-model generalization between base and reasoning-fine-tuned models.
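A schematic illustration of the vector-arithmetic composition test described above, using synthetic pooled hidden states; AlgoTrace's actual interface, layer choices, and task conditions are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

d = 256
h_base = rng.normal(size=d)                           # mean hidden state on baseline prompts
v_a, v_b = rng.normal(size=d), rng.normal(size=d)     # shifts induced by primitives A and B
h_a, h_b = h_base + v_a, h_base + v_b
h_ab = h_base + v_a + v_b + 0.1 * rng.normal(size=d)  # prompts that compose A and B

# Composition test: does (h_a - h_base) + (h_b - h_base) align with h_ab - h_base?
composed = (h_a - h_base) + (h_b - h_base)
print(f"cosine(composed, observed) = {cosine(composed, h_ab - h_base):.3f}")
```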
Co-Author Manuscript; Abstract Submitted to JMM 2026 | Supported by the Institute for Pure and Applied Mathematics at UCLA and Microsoft Research
Analyzed graph navigation and Traveling Salesman Problem tasks to study how LLMs represent and compose algorithmic primitives.
Demonstrated that function-vector interventions improve reasoning accuracy in LLMs.
Characterized hidden-state geometry and distance-aware attention across different LLMs.
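An illustrative version of the hidden-state-geometry analysis above, correlating representational distances with graph distances; the ring graph and hidden states are synthetic stand-ins for real LLM activations.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(2)

# Shortest-path distances on a 6-node ring graph, and one pooled hidden-state vector
# per node token (synthetic here in place of actual layer activations).
graph_dist = np.array([[0, 1, 2, 3, 2, 1],
                       [1, 0, 1, 2, 3, 2],
                       [2, 1, 0, 1, 2, 3],
                       [3, 2, 1, 0, 1, 2],
                       [2, 3, 2, 1, 0, 1],
                       [1, 2, 3, 2, 1, 0]])
hidden = rng.normal(size=(6, 128))

# Pairwise Euclidean distances between node representations.
rep_dist = np.linalg.norm(hidden[:, None, :] - hidden[None, :, :], axis=-1)

# Geometry check: correlate representational distance with graph distance (upper triangle).
iu = np.triu_indices(6, k=1)
rho, p = spearmanr(rep_dist[iu], graph_dist[iu])
print(f"Spearman rho = {rho:.3f} (p = {p:.3f})")
```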
First Author Manuscript Under Review at IEEE Transactions on Knowledge and Data Engineering
Proposed a novel GDLGA model combining BERT-based global features with local sliding-window attention.
Designed a gated fusion mechanism to dynamically balance global and local sentiment cues in short texts.
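A minimal PyTorch sketch of a gated global-local fusion block in the spirit of the design above; the window size, head count, and pooling choices are assumptions for illustration, not the GDLGA model's exact architecture.

```python
import torch
import torch.nn as nn

class GatedGlobalLocalFusion(nn.Module):
    """Illustrative gated fusion of a global [CLS] feature with a locally pooled feature."""

    def __init__(self, hidden_dim=768, window=5):
        super().__init__()
        self.window = window
        self.local_attn = nn.MultiheadAttention(hidden_dim, num_heads=4, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(2 * hidden_dim, hidden_dim), nn.Sigmoid())

    def forward(self, token_states):                    # (batch, seq_len, hidden_dim)
        seq_len = token_states.size(1)
        # Band mask: each token may only attend within a +/- window neighbourhood.
        idx = torch.arange(seq_len)
        band_mask = (idx[None, :] - idx[:, None]).abs() > self.window  # True = blocked
        local, _ = self.local_attn(token_states, token_states, token_states, attn_mask=band_mask)
        local_feat = local.mean(dim=1)                  # pooled local representation
        global_feat = token_states[:, 0]                # BERT-style [CLS] global feature
        g = self.gate(torch.cat([global_feat, local_feat], dim=-1))
        return g * global_feat + (1 - g) * local_feat   # gated balance of global vs. local cues

fused = GatedGlobalLocalFusion()(torch.randn(2, 32, 768))
print(fused.shape)  # torch.Size([2, 768])
```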
Second Author Manuscript Under Review at Transportation Research Record
Developed a hierarchical dynamic game model capturing interactions among governments, enterprises, and workers under autonomous taxi adoption.
Modeled strategic, behavioral, and feedback mechanisms incorporating psychological biases such as loss aversion and fairness perception.
Conducted simulations and sensitivity analysis to identify optimal policy interventions balancing AI innovation and employment stability.
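A toy sensitivity sweep in the spirit of the simulations above; every functional form, parameter, and the welfare readout is a hypothetical stand-in for the paper's game-theoretic model.

```python
import numpy as np

def worker_utility(wage, reference_wage, loss_aversion=2.25):
    """Prospect-theory-style value: losses relative to the reference loom larger than gains."""
    gain = wage - reference_wage
    return gain if gain >= 0 else loss_aversion * gain

def simulate(subsidy, reference_wage=1.0):
    # Toy enterprise best response: automation falls as the policy lever (levy/subsidy) rises.
    automation = float(np.clip(0.8 - 0.6 * subsidy, 0.0, 1.0))
    wage = reference_wage * (1.0 - 0.5 * automation)    # displaced earnings for workers
    employment = 1.0 - automation                       # retained employment share
    welfare = employment * worker_utility(wage, reference_wage) + 0.4 * automation - subsidy
    return automation, employment, welfare

# Sensitivity analysis over the policy lever.
for subsidy in np.linspace(0.0, 0.5, 6):
    a, e, w = simulate(subsidy)
    print(f"subsidy={subsidy:.2f}  automation={a:.2f}  employment={e:.2f}  welfare={w:+.3f}")
```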
Data Analyst | Supervisor: Dr. Azra Ismail | Oct 2024 – Present
Conducted large-scale data collection and ensured accuracy across diverse datasets.
Applied topic modeling to over 20,000 entries, extracting key themes and insights.
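A small illustration of the topic-modeling step above using scikit-learn's LDA on a four-document toy corpus; the actual analysis covered over 20,000 entries.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Tiny illustrative corpus standing in for the real entries.
docs = [
    "clinic staff shortage and long wait times",
    "community outreach improved vaccination uptake",
    "wait times at the clinic frustrated patients",
    "vaccination campaign relied on community volunteers",
]

vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# Top words per topic as a quick readout of the extracted themes.
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[::-1][:4]]
    print(f"topic {k}: {', '.join(top)}")
```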