Research

My primary research interests are data mining and semistructured data. I am also interested in query optimization, data warehousing, and information integration. Naturally, I am interested in using the Web, with its richness and variety of information sources, as a testbed for many research ideas. The Web is a perfect example of how techniques and methods from many areas of database research can come together.

To illustrate the point, consider storing a local copy of (almost) the whole Web. Most current search engines use ad hoc methods, but the future undoubtedly involves industrial-strength data warehouses. To make sense of the plethora of information, we can use information integration techniques and thus get a comprehensive and coherent picture of the contents of the Web. The data warehouse will contain a structured (relational) part, a multimedia part, and a semistructured part. To make the semistructured part easy to browse and query, we can use techniques developed for summarization, querying, and indexing of semistructured data. Finally, to extract interesting patterns and trends, we can use complex data mining queries, with new query optimization algorithms to enable their efficient execution.