Image and Video Understanding with Little or No Supervision
Dec 2, 2015
On December 2, 2015, Professor Jean Ponce gave a lecture titled "Image and Video Understanding with Little or No Supervision".
Professor Ponce began by addressing the problem of understanding the visual content of images and videos using only a weak form of supervision, such as the knowledge that multiple images contain instances of the same objects, or the textual information available in television or film scripts. He then discussed several instances of this problem, including multi-class image cosegmentation, the joint localization and identification of movie characters and their actions, and the assignment of action labels to video frames using temporal ordering constraints. All of these problems can be tackled within a discriminative clustering framework. Professor Ponce presented the underlying models, appropriate relaxations of the combinatorial optimization problems that arise in learning these models, and efficient algorithms for solving the resulting convex optimization problems. He also showed experimental results on standard image benchmarks and feature-length films, and concluded with recent work on completely unsupervised object discovery in image collections.