TalkMiner: A Lecture Video Search Engine

The design and implementation of a search engine for lecture webcasts are described. A searchable text index is created that allows users to locate material within lecture videos found on a variety of websites, such as YouTube and Berkeley webcasts. The index is built from the text of presentation slides appearing in the video, along with other associated metadata such as the title and abstract when available.
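As an illustration of the indexing idea, the following minimal sketch builds an inverted index from slide text and optional metadata and answers keyword queries with (video, time offset) hits. It is not the TalkMiner implementation: the record layout (`id`, `title`, `abstract`, `slides`) and the function names `tokenize`, `build_index`, and `search` are assumptions made for this example, and the slide text is assumed to have been extracted already.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(videos):
    """Map each term to the set of (video id, offset in seconds) where it occurs.

    Hypothetical input layout: each video is a dict with 'id', optional
    'title' and 'abstract', and 'slides' as a list of (seconds, text) pairs.
    """
    index = defaultdict(set)
    for video in videos:
        # Metadata terms point at the start of the video (offset 0).
        for field in ("title", "abstract"):
            for term in tokenize(video.get(field) or ""):
                index[term].add((video["id"], 0))
        # Slide terms point at the time the slide appears.
        for seconds, text in video["slides"]:
            for term in tokenize(text):
                index[term].add((video["id"], seconds))
    return index


def search(index, query):
    """Return sorted hits whose slide (or metadata) contains every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(hits)
```

With an index like this, a query returns positions inside a video rather than only the video itself, which is what lets a user jump to the relevant slide.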

The automatic identification of distinct slides within the video stream presents several challenges. For example, picture-in-picture compositing of a speaker and a presentation slide, switching cameras, and slide builds confuse basic algorithms for extracting keyframe slide images. Enhanced algorithms are described that improve slide identification.
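To make the difficulty concrete, here is a sketch of the kind of basic frame-differencing detector that these conditions can confuse. It is an assumption-laden illustration, not the enhanced algorithm the paper describes: frames are represented as small grayscale grids (lists of lists of integers), and `mean_abs_diff`, `naive_keyframes`, and the `change_threshold` default are hypothetical names and values chosen for this example.

```python
def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += abs(pixel_a - pixel_b)
            count += 1
    return total / count if count else 0.0


def naive_keyframes(frames, change_threshold=10.0):
    """Return indices of frames that differ enough from the last keyframe.

    A frame becomes a keyframe when its mean absolute difference from the
    previously selected keyframe exceeds change_threshold.
    """
    keyframes = []
    last = None
    for i, frame in enumerate(frames):
        if last is None or mean_abs_diff(frame, last) > change_threshold:
            keyframes.append(i)
            last = frame
    return keyframes
```

A detector of this kind reports a new keyframe whenever the picture changes substantially, so cutting away to a camera shot and back to the same slide yields a duplicate keyframe, and a moving speaker inset or an incremental slide build can push the difference over the threshold without a real slide change.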

A public system was deployed to test the algorithms and the utility of the search engine. To date, over 17,000 lecture videos have been indexed from a variety of public sources.

John Adcock FX Palo Alto Laboratory, Inc.
Matthew Cooper FX Palo Alto Laboratory, Inc.
Laurent Denoue FX Palo Alto Laboratory, Inc.
Hamed Pirsiavash University of California, Irvine
Lawrence A. Rowe FX Palo Alto Laboratory, Inc.
