The program will feature several keynote and invited talks from academia, industry, and government.
Bio-Image Informatics: Advances and Challenges
B. S. Manjunath,
Professor, University of California, Santa Barbara
I will talk about information processing challenges in the context of
microscopy images, and issues/challenges that the
multimedia/information retrieval communities can help address. Recent
advances in microscopy imaging have resulted in large volumes of image
and video data, with most of the analysis still done manually and in a
qualitative manner. Manual analysis is not only time intensive but
often is not reproducible as well. Further, there is little, if
any, database support to manage these image/video collections, to
store, search and retrieve image related information within an
integrated framework. I will illustrate the challenges with some
recent problems that we are trying to tackle at UCSB's Center for
Bio-Image Informatics.
Biography: Professor Manjunath received the B.E. in Electronics (with distinction) from Bangalore University in 1985, the M.E. (with distinction) in Systems Science and Automation from the Indian Institute of Science in 1987, and the Ph.D. degree in Electrical Engineering from the University of Southern California in 1991. He joined the University of California in 1991, where he is currently a Full Professor of Electrical and Computer Engineering and the director of the NSF/IGERT program at UCSB on Interactive Digital Multimedia. In addition, he is the director of the NSF-funded Center for Bio-Image Informatics at UCSB, where the research thrust is in developing new imaging and information processing technologies for large bio-image databases. His research interests include image processing, computer vision, multimedia databases, bio-informatics, learning algorithms and data mining. He has supervised 20 PhD theses and published over 200 papers in refereed journals and conferences. He is a co-inventor of 23 US/International patents and co-edited the first book on the MPEG-7 standard. He was an Associate Editor for the IEEE Transactions on Image Processing, IEEE-Tr. PAMI, IEEE-Tr. Multimedia, and IEEE Signal Processing Magazine. He is a fellow of the IEEE.
In this session, we will invite a few active scientists from the leading industrial players in the area of multimedia information retrieval. They will illustrate their latest research findings. Each talk will take about 15-18 minutes.
Multimedia Processing for Advanced Content Services
Behzad Shahraray,
Executive Director, Video and Multimedia Technologies Research, AT&T Labs
The proliferation of network connected media-enabled devices
has given users access to large volumes of information and
entertainment in video form. Taking advantage of these vast video
resources involves the creation of effective mechanisms for searching,
navigating, personalizing, and repurposing video to support
alternative consumption modes. Automated content analysis algorithms
that utilize media processing techniques are the key to the creation
of such mechanisms. Media processing also serves to facilitate
retrieval and navigation of content by enabling multimodal user
interfaces.
In this talk I will discuss some of the media processing research at
AT&T Labs, and describe several prototype systems aimed at giving
users easy access to video and multimedia information on a wide range
of media-enabled devices.
Biography: Behzad Shahraray is the Executive Director of Video and Multimedia
Technologies Research at AT&T Labs. In this role, he leads an effort
aimed at creating advanced media processing technologies and novel
multimedia communications service concepts. He received the
M.S. degree in Electrical Engineering, M.S. degree in Computer,
Information, and Control Engineering, and Ph.D. degree in Electrical
Engineering from the University of Michigan, Ann Arbor. He joined AT&T
Bell Laboratories in 1985 and AT&T Labs Research in 1996. His research
in multimedia processing has been in the areas of multimedia indexing,
multimedia data mining, content-based sampling of video, content
personalization and automated repurposing and authoring of searchable
and browsable multimedia content.
Behzad is the recipient of the AT&T Medal of Science and Technology
for his leadership and technical contributions in content-based
multimedia searching and browsing. His work has been the subject of
numerous technical publications. Behzad holds sixteen US patents in
the areas of image, video, and multimedia processing. He is a Senior
Member of IEEE, a member of the Association for Computing Machinery
(ACM), and is on the editorial board of the International Journal of
Multimedia Tools and Applications.
Semantic Understanding of Geotagged Pictures
Dhiraj Joshi,
Research Scientist,
Intelligent Systems Group, Eastman Kodak Research Labs
Semantic understanding based only on vision cues has been a
challenging problem. This problem is particularly acute when the
application domain is unconstrained photos available on the Internet
or in personal repositories. In recent years, it has been shown that
metadata captured with pictures can provide valuable contextual cues
complementary to the image content and can be used to improve
classification performance. With the recent geotagging phenomenon, an
important piece of metadata available with many pictures is GPS
information. In this talk, I will describe novel research in the area
of mining geographic information for boosting semantic
understanding. I will discuss the association of image content, tags,
and location meta-data with image semantics within a contextual
inference framework. With integrated GPS-capable cameras on the
horizon and geotagging on the rise, this line of research will
revolutionize event recognition and media annotation.
Biography: Dhiraj Joshi is a research scientist in the
Intelligent Systems Group at the Eastman Kodak Research Labs,
Rochester, NY. At Kodak, Dhiraj's primary focus is associating image
content, tags, and location meta-data with image semantics. He is also
interested in building intelligent systems which use semantics across
multiple modalities of media for enriching user experience. Dhiraj
graduated with an M.Sc in Mathematics and Scientific Computing from
the Indian Institute of Technology, Kanpur. He completed his
Ph.D. in Computer Science from the Pennsylvania State University.
His broad research interests include contextual inference-based
image understanding, large-scale image retrieval, content analysis in
multimedia, aesthetics modeling in images, and statistical learning.
Dhiraj has been a research intern at the I.B.M. T.J. Watson Research
Labs, and the Idiap Research Institute (Switzerland). In 2006, he was
selected as an emerging leader in multimedia research to present at
the Watson Emerging Leaders in Multimedia Workshop. He co-organized a
special session on Image Aesthetics, Moods, and Emotions at the IEEE
International Conference on Image Processing, 2008. Dhiraj has also
participated in the Kodak Visiting Scientist program to promote science
and mathematics education in Rochester area schools. He is a member of
IEEE and currently serves as a Rochester chapter vice-chair of IEEE
Signal Processing Society.
Next Generation Map Making: Automation from Mobile Data Collection
Alwar Narayanan,
Director of Research and Emerging Technologies,
NAVTEQ Corporation.
NAVTEQ is a leading global provider of digital map data. NAVTEQ maps drive most in-vehicle navigation systems, the top routing web sites, and the leading brands of wireless navigation devices. NAVTEQ continues to enhance the technologies used for collecting, analyzing, and delivering new content to a wide range of users and devices.
This presentation will address NAVTEQ's perspective on automatic creation and update of a navigable map through the use of high-end mobile data collection sensors and computer vision techniques. Results of research efforts based on large scale, geo-referenced, ground level video and LIDAR data collection as well as various challenging problems related to automatic feature extraction for mapping and navigation will be presented. Specifically, our approach to automatically reconstruct lane level maps, overpasses and traffic signs to create a virtual 3D map will be presented.
Biography: Alwar Narayanan is currently the Director of the Research & Emerging Technologies group at NAVTEQ. His research focus is to identify technologies that help recreate a rich set of navigable map content more accurately and efficiently. Alwar has been leading research projects at NAVTEQ since 1997. Prior to joining NAVTEQ, Alwar spent 12 years teaching and working on various research projects at the Department of Computer Science and Engineering, Indian Institute of Technology, Chennai, India. Alwar holds an M.S. degree in Computer Science from IIT, Chennai and an MBA from Northern Illinois University, DeKalb, IL. Alwar holds 7 U.S. patents in the area of digital mapping.
Multimedia Semantics -- Opportunities and Challenges
Apostol (Paul) Natsev,
IBM T.J. Watson Research Center.
Digital media production and consumption have skyrocketed in recent
years and is now commonplace in many parts of our lives -- from the
way we entertain and inform ourselves to the way we communicate,
socialize, and learn. With the tremendous growth of multimedia come
great opportunities but even greater expectations and challenges.
Traditional approaches of multimedia description based on manual
tagging, production metadata, and link analysis are typically
coarse-grained and inadequate. Advances in semantic understanding of
multimedia content over recent years are instrumental for unlocking
the full potential of multimedia.
In this talk, I will describe a few case studies of multimedia
semantics applications developed at IBM Research to address real world
business problems, with emphasis on the key opportunities and
challenges for each use case. The goal of this talk will be to raise
questions and bring attention to open problems with practical
implications, rather than to prescribe specific answers.
Biography: Dr. Apostol (Paul) Natsev is a Research Staff Member
and Manager of the Multimedia Research Group at the IBM T. J. Watson
Research Center. He received his M.S. (1997) and Ph.D. (2001) degrees
in Computer Science from Duke University, and joined IBM Research in
2001. At IBM, he leads research efforts on multimedia analysis and
retrieval, with an agenda to advance the science and practice of
systems that enable users to manage and search vast repositories of
unstructured multimedia content.
Dr. Natsev is a founding member and current team lead for IBM's
award-winning IMARS project on multimedia analysis and retrieval, with
primary contributions in the areas of semantic, content-based, and
speech-based multimedia indexing and search, as well as video copy
detection. Dr. Natsev is an avid believer in scientific progress
through benchmarking, and has participated actively in a dozen open
evaluation/showcasing campaigns, including the annual NIST TRECVID
Video Retrieval evaluation, the CIVR VideOlympics showcase, and the
CIVR Video Copy Detection showcase.
Dr. Natsev is an author of more than 60 publications and 15
U.S. patents (granted or pending) in the areas of multimedia analysis,
indexing and search, multimedia databases and query optimization. His
research has been recognized with several awards, including the 2004
Wall Street Journal Innovation Award (for IMARS), a 2005 IBM
Outstanding Technical Accomplishment Award, a 2005 ACM Multimedia
Plenary Paper Award, a 2006 ICME Best Poster Award, and the 2008 CIVR
VideOlympics People's Choice Award (for IMARS). He is a Senior Member
of ACM.
Alberto Del Bimbo, University of Florence, Italy
Biography: Professor Del Bimbo is Full Professor of Computer Engineering and the Director of the Master in Multimedia of the University of Florence, Italy. He was the Director of the Department of Sistemi e Informatica from 1997 to 2000 and the Deputy Rector for Research and Innovation Transfer of the University of Florence from 2000 to 2006. Presently he is the President of the Foundation for Research and Innovation and the Director of the Media Integration and Communication Center of Excellence of the University of Florence.
His scientific interests are Pattern Recognition, Image and Video Analysis, Multimedia Information Retrieval and Natural Human Computer Interaction. He has authored over 250 publications in some of the most distinguished scientific journals and international conferences, and is the author of the monograph "Visual Information Retrieval", on content-based retrieval from image and video databases, published by Morgan Kaufmann in 1999.
From 1996 to 2000, he was the President of the IAPR Italian Chapter, and, from 1998 to 2000, Member at Large of the IEEE Publication Board. He was the General Chair of IAPR ICIAP'97, the International Conference on Image Analysis and Processing; IEEE ICMCS'99, the International Conference on Multimedia Computing and Systems; AVIVDiLib'05, the International Workshop on Audio-Visual Content and Information Visualization; VMDL07, the International Workshop on Visual and Multimedia Digital Libraries; and IEEE ISM2008, the International Symposium on Multimedia; and was Program Co-Chair of ACM Multimedia 2008. He is the General Co-Chair of ACM Multimedia 2010 and of ECCV 2012, the European Conference on Computer Vision. He is an IAPR Fellow and Associate Editor of Multimedia Tools and Applications, Pattern Analysis and Applications, Journal of Visual Languages and Computing and International Journal of Image and Video Processing, and was Associate Editor of Pattern Recognition, IEEE Transactions on Multimedia and IEEE Transactions on Pattern Analysis and Machine Intelligence.
Implementing a Content-Based Public-Oriented Audio and Video News Retrieval System
Gregory Grefenstette, Chief Science Officer, Exalead
Video is poised to largely replace both text and images as the medium
for transmitting information in the coming years. The challenge for the
information processing community is how to index the information found
in this voluminous and dynamic media stream. This talk will describe
our current research in providing an index into the content of video
and audio streams. I will also describe and demonstrate the Voxalead
News system, and other results of the French-German Quaero project,
that integrate results from industry and research, for the next
generation of video-based information searching.
Biography: Gregory Grefenstette is Chief Science Officer at Exalead. He received his B.S. from Stanford University in 1978, and a Ph.D. in Computer Science from the University of Pittsburgh in 1993. He has been Principal Scientist at the Xerox Research Centre (1993-2001), with Clairvoyance (2001-3) and at the French applied research centre, the CEA (2001-8). His research interests range from most subjects in Natural Language Processing to all aspects of Information Retrieval. He serves on the editorial board of the Journal for Natural Language Engineering, and edited the first book on Cross Language Information Retrieval (Kluwer 1998). In recent years, he has been working with Adrian Popescu on Geographical Indexing. He is the co-inventor of 15 patents, including the design of a photocopier for cross language information retrieval (US 6396951), for finding experts in a company by mining Web usage (US 6446035) and for creating documents that enrich themselves (US 6732090). He organized the first OntoImage Workshop 2007 on bridging the gap between text processing and image processing.
Recovering the Past through Computation -- New Techniques for Cultural Heritage
Stephen M. Griffin, Program Director, National Science Foundation
Computation has provided new means for researchers and scholars in the
humanities, fine arts and social sciences to address research
questions long considered to be too difficult for conventional
methodologies. This presentation will discuss
emerging state-of-the-art scientific methodologies applied to
discovery, recovery, restoration, representation, analysis and
ultimately new understanding of a broad range of cultural heritage
artifacts. Critically important remnants of the past are disappearing
through neglect, incidental destruction, deterioration, and looting.
Many ancient artifacts are scattered about the world and
reside in public and private collections, inaccessible to scholars and
far removed from their original location and context of creation.
Digital representation is possible for numerous cultural heritage
resources: script and drawings on a variety of media, manuscripts and
documents, images, objects of all shapes and textures, and historic
sites and events to name a few. Computation can provide means for
recovering to some degree what was lost. Computation using geospatial
and temporal data is central to visualizing and understanding
mechanisms of change over extended periods of time, at once revealing
and elucidating the events, social processes and practices that drive
or accompany change. This task involves, in part, processing massive
amounts of raw data from a wide range of instruments and combining
these with historic records to produce new information. At this point
scholarly work, creative approaches, imaginative thinking and
international interdisciplinary collaboration can be undertaken to
create new knowledge and understanding and bring to light new segments
of the human record.
Biography: Stephen Griffin is a Program Director in the Information Integration and Informatics (III) cluster in the National Science Foundation's Division of Information and Intelligent Systems. For the period 1994-2004, Mr. Griffin managed the Special Projects Program which included the Interagency Digital Libraries Initiatives and the International Digital Libraries Collaborative Research and Applications Testbeds program. Prior to joining the Division of Information and Intelligent Systems, Mr. Griffin served in several research divisions, including the Divisions of Chemistry and Advanced Scientific Computing, the Office of the Assistant Director, Directorate for Computer and Information Science and Engineering, and staff offices of the Director of the NSF. He has been active in working groups for Federal high performance computing and communications programs, and serves on numerous domestic and international advisory committees related to digital libraries and advanced computing and networking infrastructure. In 2004-2005 he was on special assignment to the Library of Congress, Office of Strategic Initiatives, to assist with the National Digital Information Infrastructure and Preservation Program. His educational background includes degrees in Chemical Engineering and Information Systems Technology. He has additional graduate education in organizational behavior and development and the philosophy of science. His research interests are in topics related to interdisciplinary research and scholarly communication. He has been active in promoting cultural heritage informatics and computing and the humanities and arts.