I’m an undergraduate engineer and researcher in the Department of Communications and Electronics Engineering at the Delta Higher Institute for Engineering and Technology (DHIET). Broadly, I’m interested in building vision and multimodal systems whose internal computations can be inspected, stress-tested, and trusted, especially in the foundation-model era, where fluent outputs can hide brittle evidence pathways.
I currently work with Prof. Junsong Yuan at the Visual Computing Lab, University at Buffalo, where I’m developing Latent Visual Diffusion Reasoning (LVDR): a diffusion-latent framework for interpretable volumetric visual inference in medical imaging. Rather than producing single-shot predictions, LVDR exposes trajectory-like intermediate states that can be analyzed and reconstructed to study how evidence accumulates across a scan.
Previously, I worked with Prof. Min Xu at Carnegie Mellon University on representation learning for Cryo-electron Tomography (Cryo-ET), exploring noise-resilient pretraining and equivariant components for low signal-to-noise scientific imaging.
I also serve as Research Lead at Brownian Labs, where I co-founded the Applied Machine Learning Lab. At DHIET, I’m mentored by Prof. El-Sayed M. El-Kenawy, Dr. Nima Khodadadi (UC Berkeley), and Prof. Marwa M. Eid on optimization for applied machine learning, including metaheuristic-driven feature/model selection and efficiency-oriented training pipelines.
PhD interests (Fall 2026): reliable and interpretable computer vision and multimodal learning, i.e., building models with inspectable intermediate representations, principled uncertainty/abstention, and strong robustness and evaluation under noise, artifacts, and distribution shift, often leveraging generative and foundation-model paradigms. Contact: Faris.Hamdi.Rizk@gmail.com
Interests
Interpretable & Reliable Vision
Generative Models for Vision
Vision-Language & Multimodal Learning
Optimization for Efficient ML
Education
B.Eng. in Communications and Electronics Engineering
Delta Higher Institute for Engineering and Technology (DHIET)
Recent Publications
(*) indicates equal contribution. A full, updated list is available on my Google Scholar page.
Imagine a bustling city street captured by a surveillance camera: pedestrians crossing paths, vehicles maneuvering through traffic, cyclists weaving between lanes, and street vendors interacting with customers. For a computer vision system to make sense of this scene, it must not only detect the humans and objects present but also understand how they interact. This complex task is known as Human-Object Interaction (HOI) detection, a critical component of applications like autonomous driving, robotic assistance, and advanced surveillance systems.