| Project Description |
| |
| |
The world today is full of information sources, each with its own way of representing data. A common problem is that data stored in one representation in some data source is needed in a different representation for another purpose. As a simple example, the owner of a data source may want to publish her data using a specific XML DTD, though it is stored in a different (legacy) format. As another example, data warehouses bring data from one or more sources together in a new form that allows for efficient decision support queries. Today, such situations are mostly handled manually, by an expert user who knows both the source and target representations. Converting from one data representation to another is a time-consuming and labor-intensive task, with few tools available to ease it. |
| |
| | The Clio project, begun in 1999, is a joint project between the IBM Almaden Research Center and the University of Toronto. Clio's goal is to radically simplify information integration by providing tools that help automate and manage one challenging piece of that problem: the conversion of data between representations. Clio pioneered the use of schema mappings, which are specifications that describe the relationship between data in two heterogeneous schemas. From this high-level, non-procedural representation, Clio can automatically generate either a view, which reformulates queries against one schema into queries on another for data integration, or code, which transforms data from one representation to the other for data exchange.
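As a rough illustration of the idea of generating transformation code from a declarative mapping, here is a minimal Python sketch. It is not Clio's mapping language or its generation algorithm. The record layout, the `mapping` dictionary, and the `generate_transform` helper are all hypothetical names invented for this example; they only show how a non-procedural description of correspondences can be turned into an executable transformation from a flat source to a nested target.

```python
# Hypothetical flat (legacy) source records; not taken from any Clio test schema.
source_rows = [
    {"emp_name": "Ada", "dept_name": "Research", "salary": 120},
    {"emp_name": "Grace", "dept_name": "Research", "salary": 130},
    {"emp_name": "Alan", "dept_name": "Sales", "salary": 90},
]

# Hypothetical declarative mapping: it states which source fields correspond
# to which target fields and how target records nest, with no procedural logic.
mapping = {
    "group_by": "dept_name",            # source field that identifies a target group
    "target_group_field": "department", # target field holding the group key
    "target_member_list": "employees",  # nested list inside each target group
    "member_fields": {"emp_name": "name", "salary": "pay"},  # source -> target
}


def generate_transform(mapping):
    """Build an executable transformation from the declarative mapping."""
    group_field = mapping["target_group_field"]
    list_field = mapping["target_member_list"]

    def transform(rows):
        groups = {}
        for row in rows:
            key = row[mapping["group_by"]]
            # Create the nested target record the first time a key is seen.
            group = groups.setdefault(key, {group_field: key, list_field: []})
            # Copy the mapped fields into a nested member record.
            member = {tgt: row[src] for src, tgt in mapping["member_fields"].items()}
            group[list_field].append(member)
        return list(groups.values())

    return transform


transform = generate_transform(mapping)
target = transform(source_rows)
```

In this sketch the mapping is ordinary data, so the same `generate_transform` function could be reused with a different mapping without changing any transformation logic, which is the property the paragraph above attributes to high-level, non-procedural mappings.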
| | |
| | Clio is supported by an IBM University Partnership Award, a National Science Foundation CAREER award, a Presidential Early Career Award for Scientists and Engineers (PECASE), and NSERC. |
| |
| People |
| |
|
| |
| Selected Publications |
| |
- Clio: Schema Mapping Creation and Data Exchange
New in 2009: retrospective on the Clio project.
Ron Fagin, Laura Haas, Mauricio Hernández, Renée J. Miller, Lucian Popa, Yannis Velegrakis.
To appear in the book Conceptual Modeling: Foundations and Applications, edited by Alexander Borgida, Vinay Chaudhri, Paolo Giorgini and Eric Yu, Springer, 2009.
- Creating Nested Mappings with Clio (Demonstration)
Mauricio Hernández, Howard Ho, Lucian Popa, Ariel Fuxman, Renée J. Miller, Takeshi Fukuda and Paolo Papotti.
In Proceedings of the International Conference on Data Engineering (ICDE), 2007.
- Nested Mappings: Schema Mapping Reloaded
Ariel Fuxman, Mauricio Hernández, Howard Ho, Renée J. Miller, Paolo Papotti and Lucian Popa.
In Proceedings of the International Conference on Very Large Data Bases (VLDB), 2006.
- Data Exchange: Semantics and Query Answering
Ron Fagin, Phokion Kolaitis, Renée J. Miller and Lucian Popa
In Proceedings of the International Conference on Database Theory (ICDT), 2003.
- Translating Web Data
Lucian Popa, Yannis Velegrakis, Mauricio Hernández, Renée J. Miller and Ron Fagin.
In Proceedings of the 28th International Conference on Very Large Data Bases (VLDB), 2002.
- Translating Web Data
Lucian Popa, Yannis Velegrakis, Renée J. Miller, Mauricio Hernández and Ron Fagin.
Technical Report CSRI 441, University of Toronto, 2002.
- Mapping XML and Relational Schemas with CLIO
Lucian Popa, Mauricio A. Hernández, Yannis Velegrakis and Renée J. Miller
System Demonstration, IEEE Data Engineering Conference, 2002.
- Data-Driven Understanding and Refinement of Schema Mappings
Ling-Ling Yan, Renée J. Miller, Laura M. Haas and Ron Fagin.
In Proceedings of the ACM SIGMOD International Conference, 2001.
- Clio: A Semi-Automatic Tool For Schema Mapping
Mauricio Hernández, Renée J. Miller and Laura M. Haas.
System Demonstration, ACM SIGMOD International Conference, 2001.
- The Clio Project: Managing Heterogeneity
Renée J. Miller, Mauricio Hernández, Laura M. Haas, Ling-Ling Yan, C. T. Howard Ho, Ron Fagin and Lucian Popa.
In SIGMOD Record, 2001, 30(1): 78-83.
- Schema Mapping as Query Discovery
Renée J. Miller, Laura M. Haas and Mauricio Hernández.
In Proceedings of the International Conference on Very Large Data Bases (VLDB), 2000, 77-88.
- Transforming Heterogeneous Data with Database Middleware: Beyond Integration
Laura M. Haas, Renée J. Miller, B. Niswonger, Mary Tork Roth, Peter M. Schwarz and Edward L. Wimmers.
IEEE Data Engineering Bulletin 1999, 22(1):31-36.
- Using Schematically Heterogeneous Structures
Renée J. Miller.
In Proceedings of the ACM SIGMOD International Conference on the Management of Data, 1998, 27(2):189-200.
|
| |
| |
| Test Schemas |
| | Here is a list of some of the schemas that have been used to test Clio. We are making them publicly available to help in comparing schema integration and schema mapping solutions. Nested schemas are presented in XML Schema. Relational schemas are given as DB2 DDL statements and, for convenience, in some cases also as XML Schemas. |
| |
| |
|
| |
| Clio@Almaden |
| | A few screenshots are included on IBM Almaden's site (here).
For a demo or information on code availability, please contact Howard Ho (lastname @ almaden.ibm.com). |
| |