Table of Contents
Data Mining Techniques for Structured and Semistructured Data
Abundance of Data
Data Mining
Talk Outline
Query Flocks: Efficient, On-line, Ad-hoc Mining of Structured Data
Mining Structured Data
Query Flock Features
Query Flock Roadmap
Association-Rule Mining
Mythical Association Rule
Market Baskets as Query Flock
Query Flock Result
Formal Definition of Flocks
Association-Rule Challenge
The A-Priori Technique
A-Priori for Query Flocks
Query Flock Roadmap
Medical Example: Side Effects
Side-Effect Query Flock
Some Safe Subqueries
Processing Flocks Efficiently
Query Flock Plans
Auxiliary Relations
Generating Flock Plans
Example Query Flock Plan
Side Effects Directly in SQL
Typical Direct Plan in RDBMS
Why Do Flocks?
Query Flock Roadmap
Query Flock Architecture
Query Flock Compiler
Flock Compiler Architecture
Performance: Medical Data
Structure Discovery: Mining Semistructured Data
Semistructured Data: Example
Semistructured Data: Definition
Semistructured Data: Data Model
Semistructured Data: Challenges
Benefits of Explicit Structure
Research Contributions
Representative Objects (RO)
RO Construction Algorithm
Semistructured Data: Example
RO Example
RO Features
Approximate Schema: Challenges
Approximate Schema: Our Solution
Notation
Notation Example
Typing Program: Definition
Typing Program: Example
Defect: Excess and Deficit
Typing Program: Construction
Stage 1: Perfect Types
Stage 2: Clustering Types
Clustering Types: Example
Semistructured Data: Example
Stage 3: Recasting Objects
Optimal Typing
Approximate Schema: Features
Research Contributions
Future Research Interests
Acknowledgements