I am an associate research professor within the Robotics Institute at Carnegie Mellon University, where I am part of the Computer Vision Group and leader of the CI2CV Laboratory. Before returning to CMU I was a Principal Research Scientist at the CSIRO (Australia's premier government science organization) for 5 years. A central long-term goal of mine is to stay true, in terms of research and industry engagement, to the grand goals of computer vision as a scientific discipline. I believe these goals to be strongly correlated with the founding philosophy of the Robotics Institute itself (i.e. of deep understanding for systems that work). Specifically, although we want to develop seeing machines that work and have societal/economic impact, we do not want to replace “understanding” with “data”. Instead we want to draw inspiration from vision researchers of the past to attempt to unlock the computational and mathematical models that underlie the processes of visual perception, using machine learning as a tool, not a replacement.
A list of my research interests can be found below:
- Mobile Computer Vision: Computer vision is a discipline that attempts to extract information from images and videos. Nearly every smart device on the planet has a camera, and people are increasingly interested in how to develop apps that use computer vision to perform an ever-expanding list of tasks, including 3D mapping, photo/image search, people/object tracking, augmented reality, etc. Notable examples of our work in this space include our recent work at ECCV'14 and a recent keynote I gave at the SUAS ACCV'14 workshop. Notable commercial applications of our work in this space include the Glasses.com "virtual try on" App.
- Model-Based Vision: Modeling the 3D geometry of objects is an onerous task for computer vision, but one which holds many benefits: arbitrary viewpoints and occlusion patterns can be rendered and recognized, and reasoning about interactions between objects and scenes is more readily facilitated. This higher-level reasoning about the 3D position and placement of objects has myriad applications in fields beyond computer vision: aiding the blind in understanding and interacting with the world around them, autonomous navigation, visual search and query, as well as assisting progress on more general cognitive problems in artificial intelligence concerning geometric reasoning and inference. Notable papers in this space include our work at CVPR'14 and in PAMI'15.
- The Role of Alignment and Learning: The use of increasingly complex representations, either hand-tuned (e.g. SIFT) or learned (e.g. convolutional neural networks), for detection, tracking and classification has resulted in substantial improvements in vision systems over the last few years. We have been exploring the link between geometric alignment and these complex representations. See our recent work from ECCV'12 and a recent keynote I gave at the CV4AC ACCV'14 workshop.
- Facial and Physical Behavior: The last two decades have seen an escalating interest in methods for automating the coding of facial and body behavior. Applications for such systems extend from the legal and business fields to national security and mental health. More generally, the expansion of behavior research holds tremendous promise for advancing our understanding of basic social and emotional processes. This deepening understanding is key to heralding a long-awaited new era in artificial intelligence (AI) where machines (robots, computers, mobile devices, etc.) interact, anticipate and plan seamlessly with humans. Yet, despite this keen interest, the promise of computer vision systems that efficiently and accurately code behavior (such as facial action codes or body intent) in naturally occurring circumstances remains elusive. Notable works in this space include our seminal work in real-time facial landmark alignment (see IJCV'11) and our OpenSource Face Analysis SDK.
- Hatem Alismail (co-advised with Brett Browning)
- Chen Kong
- Christopher Ham
- Ashton Fagg
- Iman Abbasnejad
- Chen-Hsuan Lin
Former Graduate Students
- Jack Valmadre (now at Oxford University in the U.K.)
- Hilton Bristow (now at Uber)
- Yingying Zhu (now at University of Maryland)
- Mark Cox (now at CSIRO in Australia)
- Ahmed Bilal Ashraf (now at University of Pennsylvania)
- Jesus Nuevo Chiquero (Co-Founder of True Vision Solutions)
- Jason Saragih (now at Oculus)
- Yang Wang (now at Siemens)
- Fall 2015: 16-432: Designing Computer Vision Apps
Cool Videos of our Work
Check out this video depicting our recent commercial work on the Glasses.com "virtual try on" App.