I am an undergraduate researcher in Electronics and Communications Engineering at the Delta Higher Institute for Engineering and Technology (DHIET). My research focuses on interpretable and reliable visual learning, with particular interest in generative and foundation-model–era approaches for auditable visual reasoning, uncertainty-aware decision-making, and robustness under distribution shift.
I am grateful to work with Prof. Junsong Yuan at the Visual Computing Lab, State University of New York at Buffalo, where I am developing Latent Visual Diffusion Reasoning (LVDR)—a diffusion-autoencoder–based framework that learns latent reasoning trajectories over multi-slice scans to support disc-/volume-level ordinal grading and presence detection. LVDR is designed to move beyond single-shot predictions by exposing step-by-step internal reasoning states, which can be inspected through trajectory analyses and diffusion-based reconstructions.
Previously, I was fortunate to join Prof. Min Xu’s lab at Carnegie Mellon University, where I contributed to representation learning for Cryo-electron Tomography (Cryo-ET), focusing on foundation-model components for low signal-to-noise settings using equivariant transformers and contrastive learning.
I also serve as Research Lead at Brownian Labs, where I co-founded the Applied Machine Learning Lab. One focus of the lab is computer vision, including data-efficient frameworks for dermoscopic image analysis aimed at improving early detection of skin cancer in underrepresented populations.
At DHIET, I am grateful to be mentored by Prof. El-Sayed M. El-Kenawy (Senior Member, IEEE; DHIET), Dr. Nima Khodadadi (UC Berkeley), and Prof. Marwa M. Eid (Senior Member, IEEE; DHIET) on optimization for applied machine learning, including metaheuristic-driven feature/model selection and efficiency-oriented training frameworks such as the Dynamic Binary Swordfish Movement Optimization Algorithm (DBSMOA).
I am currently seeking PhD opportunities for Fall 2026 in Computer Vision and Machine Learning (generative/foundation models, interpretability, robustness, and evaluation), and I would be grateful to connect with researchers and labs pursuing related directions.
Interests
Interpretable & Reliable Vision
Generative Models for Vision
Vision-Language & Multimodal Learning
Optimization for Efficient ML
Education
B.Eng. in Communications and Electronics Engineering
Delta Higher Institute for Engineering and Technology (DHIET)
Recent Publications
(*) indicates equal contribution. A full, up-to-date list is available on my Google Scholar page.