Amit Chandel






Curriculum Vitae

PERSONAL DETAILS

Mailing Address: Department of Computer Science,
Room# 3302, 10 King's College Road
Toronto, ON M5S 3G4, Canada
E-MAIL: amit [AT] cs [DOT] toronto [DOT] edu
WEB: http://queens.db.toronto.edu/~amit/

AREAS OF INTEREST

Data Cleaning, Fuzzy Matching, Database Systems, Information Retrieval and Data Mining

EDUCATION

2005-2007 Master of Science,
Advisor: Prof Nick Koudas
Department of Computer Science,
University of Toronto, Toronto, Canada
2001-2005 Bachelor of Technology,
Advisor: Prof Sunita Sarawagi
Department of Computer Science and Engineering,
Indian Institute of Technology, Bombay, INDIA
2001 Senior Secondary School Certificate,
Emmanuel Mission School, Kota, INDIA

PUBLICATIONS

Conferences
Matei A. Zaharia, Amit Chandel, Stefan Saroiu, and Srinivasan Keshav,
Finding Content in File-Sharing Networks When You Can't Even Spell,
To appear in the Proceedings of the Sixth International Peer-to-Peer Workshop (IPTPS), Bellevue, WA, February 2007.

Amit Chandel, Nick Koudas, Ken Pu and Divesh Srivastava,
Fast Identification of Relational Constraint Violations,
To appear in the 23rd IEEE Int'l Conference on Data Engineering (ICDE), 2007.

Amit Chandel, P.C. Nagesh, and Sunita Sarawagi,
Efficient batch top-k search for dictionary-based entity recognition,
In Proc. of the 22nd IEEE Int'l Conference on Data Engineering (ICDE), 2006.

Technical Reports
Batch Top-k Searches on Text Columns, Advisor: Prof. Sunita Sarawagi
Senior Undergraduate Thesis Report, IIT Bombay, May 2004.

Summarizing Tree Structured XML Data Quantitatively
Amit Chandel, Nilesh Bansal, Laks V. S. Lakshmanan and Raymond T. Ng
Technical Report, University of British Columbia, Vancouver, July 2004

Keyword Search in Databases, Advisor: Prof. S. Sudarshan
Junior Undergraduate Thesis Report, IIT Bombay, April 2004.

RESEARCH EXPERIENCE

SPIDER, July 2006 - Jan 2007
Advisor: Prof. Nick Koudas, University of Toronto

Data quality is a serious concern in every organization that relies on data. The quality of data is commonly poor due to a multitude of reasons including, but not limited to, spelling mistakes, abbreviations, lack of standards and inconsistent notations.
SPIDER is a declarative data cleaning tool. It incorporates a set of algorithms that help improve data quality on any relational data source. SPIDER can be used for flexible querying, approximate joins, schema matching and data exploration.
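
To illustrate what an approximate join does, here is a minimal sketch in Python. It is not SPIDER's code: the q-gram Jaccard similarity, the function names (qgrams, jaccard, approximate_join) and the 0.5 threshold are all assumptions chosen for the example.

```python
def qgrams(s, q=3):
    """Return the set of q-grams of s, padded so that prefixes and suffixes count."""
    padded = "#" * (q - 1) + s.lower() + "#" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def jaccard(a, b):
    """Jaccard similarity of the q-gram sets of two strings (assumed measure)."""
    ga, gb = qgrams(a), qgrams(b)
    return len(ga & gb) / len(ga | gb)


def approximate_join(left, right, threshold=0.5):
    """Pair every left value with every right value that is similar enough.

    This is a naive nested loop for clarity; a real tool would use an index.
    """
    return [(l, r) for l in left for r in right if jaccard(l, r) >= threshold]


# Hypothetical dirty data: a misspelled city name should still join.
matches = approximate_join(["toronto"], ["torronto", "ottawa"])
```

Here "toronto" joins with the misspelled "torronto" but not with "ottawa", which shares no q-grams with it.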


Fast Identification of Relational Constraint Violations, Jan - July 2006
Advisor: Prof. Nick Koudas, University of Toronto

Built and maintained specialized BDD-based logical indices on relational tables and described query rewrite rules that use these indices to quickly identify violations of relational constraints. Implemented this approach in C++ on top of a Postgres database using ODBC and tested it on large collections of real and synthetic data sets.


Efficient Batch Top-k Search for Dictionary-based Entity Recognition, Aug 2004 - Aug 2005
Advisor: Prof. Sunita Sarawagi, IIT Bombay

We consider the problem of speeding up Entity Recognition systems that exploit existing large databases of structured entities to improve extraction accuracy. These systems require the computation of the maximum similarity scores of several overlapping segments of the input text with the entity database. We formulate a Batch-Top-K problem with the goal of sharing computations across overlapping segments. Our proposed algorithm is three times faster than independent Top-K queries and only a factor of two slower than an unachievable lower bound on total cost. We then propose a novel modification of the popular Viterbi algorithm for recognizing entities so that it works with easily computable bounds on match scores, thereby reducing the total inference time by a factor of eight compared to state-of-the-art methods.


Data Integration from Web-Pages, Feb 2005 - Apr 2005
Advisor: Prof. Soumen Chakrabarti, IIT Bombay

Designed a technique to extract publication entries from web pages and store them in a structured database. The structured database is created in two steps: the first step identifies individual publication entries, and the second step performs fine-grained information extraction. For the first step, we implemented a classifier on DOM nodes; for the second step, we implemented an efficient inference algorithm using the A* technique.


Network Intrusion Detection using Stide Methodology, Feb 2005 - Apr 2005
Advisor: Prof. Sunita Sarawagi, IIT Bombay

Designed an intelligent system to automatically detect possible network intrusion events. The system monitored network logs generated by tcpdump (per-packet activity) for anomalies and raised a flag whenever observed behavior deviated significantly from normal. We employed the stide methodology for classification, using sequences of consecutive log records (over a sliding window of fixed size) to represent activity. The basic approach is to construct a dictionary of normal sequences from data collected when there was no intrusion. This dictionary is then used to compute the anomaly count of incoming log data. The stide methodology has previously been shown to be effective in system intrusion detection problems. We proposed a novel encoding scheme for sequences in the network activity log that enabled us to use the same technique in this domain as well. We verified our results through experiments on real-world datasets.
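
The sliding-window idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's code. The event labels below are hypothetical placeholders and do not reproduce the encoding scheme mentioned above. The function names and the window size of 3 are also chosen only for the example.

```python
def windows(events, k):
    """Yield every run of k consecutive events as a tuple."""
    for i in range(len(events) - k + 1):
        yield tuple(events[i:i + k])


def build_normal_dictionary(normal_events, k):
    """Collect all length-k sequences seen in intrusion-free data."""
    return set(windows(normal_events, k))


def anomaly_count(events, normal, k):
    """Count the length-k sequences that never occurred in normal data."""
    return sum(1 for w in windows(events, k) if w not in normal)


# Hypothetical encoded log records (placeholders, not real tcpdump output).
normal_trace = ["SYN", "SYN-ACK", "ACK", "DATA", "DATA", "FIN", "ACK"]
incoming = ["SYN", "SYN-ACK", "ACK", "DATA", "RST", "SYN", "SYN"]

normal = build_normal_dictionary(normal_trace, k=3)
count = anomaly_count(incoming, normal, k=3)
```

A detector along these lines would raise a flag when the anomaly count over recent traffic exceeds a chosen threshold. In this toy trace, three of the five incoming windows are absent from the normal dictionary.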


Summarizing Tree Structured XML Data Quantitatively, May - Nov 2004
Advisor: Prof. Laks Lakshmanan and Prof. Raymond Ng, UBC, Vancouver

We developed an algorithm for constructing a summary of an XML document that captures the structural aspects of its schema and can be used for other tasks such as query result size estimation, structural compression and exploration. The summary preserves various kinds of quantitative information, which can be used to extract knowledge about the number of edges or paths following a certain label pattern.
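
As a rough illustration of a quantitative structural summary, the Python sketch below counts how many nodes are reached by each root-to-node label path. It is an assumption-laden simplification rather than the algorithm from the report: the function name path_counts and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def path_counts(xml_text):
    """Map each root-to-node label path to the number of nodes that match it."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    stack = [(root, "/" + root.tag)]
    while stack:
        node, path = stack.pop()
        counts[path] += 1
        # Extend the current label path with each child's tag.
        for child in node:
            stack.append((child, path + "/" + child.tag))
    return counts


# Hypothetical sample document.
doc = "<lib><book><author/><author/></book><book><author/></book><journal/></lib>"
summary = path_counts(doc)
```

Such counts can answer simple size-estimation questions without touching the document again. For example, summary["/lib/book/author"] gives the result size of that path query on the sample, which is 3.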


Managing Database Snapshots in Mobile Environment, Aug - Nov 2004
Advisor: Prof. Krithi Ramamritham, IIT Bombay

We designed methods and tools to assist in building database applications for mobile devices, keeping in view their frequent communication breakdowns. The key idea is to maintain a partial, weakly consistent view of the central database on the mobile device while it is disconnected and to synchronize the data when the connection is available.
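
The key idea can be sketched as follows in Python. This is only a toy model under several assumptions: the central database is a plain dictionary, the class and method names are hypothetical, and conflicts are resolved by letting queued device writes overwrite the server, which is not necessarily what the project did.

```python
class MobileSnapshot:
    """Partial local copy of a central key-value store (toy model)."""

    def __init__(self, server, keys):
        self.server = server                        # stands in for the central database
        self.local = {k: server[k] for k in keys}   # partial view kept on the device
        self.pending = []                           # writes made while disconnected
        self.connected = True

    def read(self, key):
        """Reads are always served from the local, possibly stale, view."""
        return self.local[key]

    def write(self, key, value):
        """Apply the write locally; forward it now or queue it for later."""
        self.local[key] = value
        if self.connected:
            self.server[key] = value
        else:
            self.pending.append((key, value))

    def reconnect(self):
        """Push queued writes, then refresh the local view from the server."""
        self.connected = True
        for key, value in self.pending:
            self.server[key] = value
        self.pending.clear()
        for key in self.local:
            self.local[key] = self.server[key]


# Hypothetical usage: the device goes offline, writes locally, then syncs.
server = {"a": 1, "b": 2, "c": 3}
snap = MobileSnapshot(server, ["a", "b"])
snap.connected = False
snap.write("a", 10)   # queued, the server still holds 1
server["b"] = 20      # the server changes while the device is offline
snap.reconnect()
```

After reconnect, the server holds the device's write to "a" and the device sees the server's new value of "b".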


TALKS AND SEMINARS

Keyword Search in Databases, April 2004
Dept. of CSE, IIT Bombay

Presented an in-depth study of systems such as DBXplorer, DISCOVER, DTL's DataSpot, BanKS and XRank, which enable keyword search over relational databases, highlighting important features such as relevance ranking and proximity search.


Stock Market Prediction Using Neural Networks, March 2004
Dept. of CSE, IIT Bombay

Presented a talk to demonstrate the use of Back Propagation Neural Networks for stock market prediction with an overview of Back Propagation NN and the design of the prediction model.


SELECTED PROJECTS

IITB Navigator, Aug - Nov 2003
Advisor: Prof. S. Sudarshan, IIT Bombay

Developed a GUI with a web front-end to locate people, places and ongoing events in a region, showing the shortest path to the destination on a map.


CMS: Course Management System, May - Dec 2003
Advisor: Prof. S. Sudarshan, IIT Bombay

Provided a common web-based interface for instructors, students and teaching assistants in an institute to handle routine tasks such as giving and submitting assignments, assigning projects, scheduling demos, and managing course information, notices, grading and messages. Implemented using servlets, JDBC, SQL and Java.


TECHNICAL SKILLS

Languages: Java (JSP, JDBC, Servlets, Swing), C/C++, Scheme, Fortran, Pascal
Scripting: Unix Shell, Perl/CGI, PHP, Python
Databases: Oracle, PostgreSQL, MySQL, XML (Xerces, DOM)
Tools: Eclipse, CVS, SVN, Web (HTML, JavaScript, AJAX)
Platforms: Linux, Windows

ACADEMIC HONORS

* Selected for the University Fellowship Program at University of Toronto, Toronto (Sep. 2005).
* Selected for the summer internship program at the University of British Columbia, Vancouver, BC, Canada (May 2004).
* Awarded the Institute Scholarship for Academic Excellence (2001) by Indian Institute of Technology, Bombay.

REFERENCES

Available upon request.
2006 Amit Chandel