We are studying various neural network architectures for activity recognition. Specifically, we want to evaluate a model that learns a motion representation based on optical flow while using significantly less computation time and fewer parameters, with equivalent or better performance. We also want to compare video CNN models that use our temporal Gaussian mixture layers, which we have found to provide better performance with significantly fewer parameters. Michael Ryoo is my advisor for these projects.
Use of FutureSystems
We need many GPU hours to train our video CNN models.
Scale of Use
We estimate that we will need at least 15,000 GPU hours to complete our projects.