Image recognition with deep learning

Project Information

Computer Science (401) 

Every day, millions of people across the world take photos and upload them to social media websites. Their goal is to share photos with friends and others, but collectively they are creating vast repositories of visual information about the world. Each photo is an observation of how the world looked at a particular point in time and space. Aggregated together, these photos could provide new sources of observational data for disciplines like biology, earth science, social science, and history. This project investigates the algorithms and technologies needed to mine these large collections of photographs and their noisy metadata in order to draw inferences about the physical world. The project has four research thrusts: (1) investigating techniques for identifying and correcting noise in metadata like geo-tags and timestamps, (2) developing algorithms for extracting semantic information from images and metadata, (3) creating methods for robust aggregation of noisy evidence from multiple photos, and (4) validating these techniques on interdisciplinary applications in biology, sociology, and earth science.

Intellectual Merit

We are investigating the feasibility of using large-scale social image collections for automated observation of the world. To do so, we are creating new algorithms for visual social media mining that combine analysis of visual evidence in photographs with non-visual metadata. We will investigate statistical and learning-based approaches to understand and mitigate the effects of noise and bias on the accuracy of crowd-sourced observations. We will also investigate algorithms that leverage large-scale data to improve performance on classic computer vision problems like scene recognition and 3D reconstruction. These techniques will be validated on applications from biology, sociology, and ecology by comparing observational estimates derived from social media against ground-truth data, both to produce quantitative assessments of accuracy and to characterize the advantages and limitations of these approaches.

Broader Impacts

Our project has the potential to create fundamentally new sources of observational data for a variety of scientific disciplines, which will be validated through interdisciplinary collaborations. The project is training students in computer vision and data mining at both the graduate and undergraduate levels.

Project Contact

Project Lead
David Crandall (djcran) 
Project Manager
David Crandall (djcran) 
Project Members
Stefan Lee, Jingya Wang, Eman Hassan, Bardia Doosti, Kai Zhen, Ishtiak Zaman, Xuan Dong, Rakibul Hasan, Shujon Naha, Mingze Xu, Zehua Zhang, Achyut Sarma Boggaram, Mridul Birla, Sven Bambach, Ali Varamesh, Ramya Rao, Satoshi Tsutsui, Jacob Beauchamp, Siddarth Jayamoorthy, Shashi Shankar, Gurleen Dhody, Yiwei Wei, Luke Lovett, He He, Tasllima Akter, Tousif Ahmed, Katie Spoon, Yuchen Wang, Chuhua Wang, Javier Fuentes-Rohwer, Charles Tostaine, Yingnan Ju, Manjulata Chivukula, Paritosh Morparia, B Fang

Resource Requirements

Hardware System
Use of FutureGrid

We will primarily use GPU nodes for deep learning applications.

Scale of Use

Generally just a handful of GPUs; more for relatively rare large-scale experiments.

Project Timeline

07/19/2016 - 09:49