Mining Digital Imagery Data for
Automatic Linguistic Indexing of Pictures
James Z. Wang, Jia Li
The Pennsylvania State University, University Park, PA 16802
Abstract:
In this paper, we present a new research direction, automatic
linguistic indexing of pictures, for data mining and machine learning
researchers. Automatic linguistic indexing of pictures is an
important but highly challenging problem. In our ongoing research,
we introduce a statistical modeling approach to this problem.
Computer algorithms have been developed to mine numerical features
automatically extracted from manually annotated categorized images.
These image categories form a computer-generated dictionary of
hundreds of concepts for computers to use in the linguistic annotation
process. In our experimental implementation, we focus on a particular
group of stochastic processes for describing images. We implemented
and tested our ALIP (Automatic Linguistic Indexing of Pictures) system
on a photographic image database of 600 different semantic categories,
each with about 40 training images. In tests on more than 4,600
images outside the training database, the system demonstrated good
accuracy and high potential for linguistic indexing of photographic
images. Such a system could be used in many areas, such as the
Semantic Web and counterterrorism.
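
The general shape of the approach (train one statistical model per annotated category, then index a new image with the words of the best-matching categories) can be illustrated with a small sketch. This is not the ALIP model: the abstract says only that a particular group of stochastic processes is used, so the sketch substitutes a much simpler stand-in, an independent Gaussian per feature. The function names, the two toy categories, and the two-dimensional feature vectors are all hypothetical and chosen only for illustration.

```python
import math
from statistics import fmean, pvariance


def fit_category(feature_vectors, min_var=1e-6):
    """Fit one independent Gaussian per feature dimension (a stand-in model)."""
    dims = list(zip(*feature_vectors))
    means = [fmean(d) for d in dims]
    # Floor the variance so a constant feature cannot cause division by zero.
    variances = [max(pvariance(d), min_var) for d in dims]
    return means, variances


def log_likelihood(model, x):
    """Log-likelihood of feature vector x under a fitted category model."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )


def annotate(models, dictionary, x, top_k=1):
    """Return the words attached to the top_k most likely categories."""
    ranked = sorted(
        models, key=lambda c: log_likelihood(models[c], x), reverse=True
    )
    words = []
    for category in ranked[:top_k]:
        for word in dictionary[category]:
            if word not in words:  # keep first occurrence, preserve rank order
                words.append(word)
    return words


# Hypothetical toy data: two categories, three training vectors each.
training = {
    "beach": [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15]],
    "forest": [[0.1, 0.9], [0.2, 0.8], [0.15, 0.85]],
}
dictionary = {"beach": ["sand", "ocean"], "forest": ["tree", "green"]}

models = {name: fit_category(vectors) for name, vectors in training.items()}
print(annotate(models, dictionary, [0.82, 0.18]))
```

Ranking by likelihood rather than picking a single winner lets an image inherit words from several plausible categories, which matters when the dictionary holds hundreds of concepts.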
Full Paper in Color (PDF, 0.4MB)
On-line Demo
Copyright 2002. Personal use of this
material is permitted. However, permission to reprint/republish this
material for advertising or promotional purposes or for creating new
collective works for resale or redistribution to servers or lists, or
to reuse any copyrighted component of this work in other works, must
be obtained from the publisher.
Last Modified: October 10, 2002