Albert Angel
 
Contact:
Office: #5160 Bahen Centre for Information Technology
40 St. George St., Toronto, ON M5S 2E4, Canada
 
 
About

I am currently a PhD Candidate at the Department of Computer Science at the University of Toronto, working with Prof. Nick Koudas.

My main research interests lie at the intersection of Database Systems and Information Retrieval. I am partial to problems dealing with ranking structured information, in the context of user-generated content, such as:

  • Text mining
  • Keyword search on structured (graph) data
  • Diversified information retrieval
  • Top-k query processing
Research

Motivation

Using services such as blogs, micro-blogs, and online social-networks, people around the globe generate millions of documents (posts, articles, status updates, etc.) every day.

This user-generated content is more than just a massive stream of text. Along many dimensions, it embeds a significant amount of structural information, such as social links, mentions of real-world entities (people, locations, products, etc.), user profile information, etc., giving rise to large graph structures.

Exploiting this structure, as opposed to considering the textual aspect of user-generated content alone, enables valuable applications, as diverse as:

  • Real-time news identification
  • Understanding the zeitgeist (e.g. which current events is a target demographic group most engaging with?)
  • Explaining the output of recommendation systems
  • Entity or product search, based on public opinions thereof (e.g. which hotels are considered ideal for family vacations?)

and so on. At the core of such computations is the need to prioritize and rank information efficiently and effectively.

Research

My research focuses on devising efficient algorithms to enable such computations, i.e. ranking in the context of user-generated content.

Existing algorithmic techniques, such as the Threshold Algorithm, can be adapted to the structured setting of user-generated content, by interleaving richer computations into the algorithm's core.

At the same time, traditional assumptions need to be re-examined, such as how to schedule data accesses for optimal performance.
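
For reference, the classic Threshold Algorithm mentioned above can be sketched as follows. This is a minimal textbook-style illustration for top-k retrieval under sum aggregation, not the adapted algorithms from my publications. The function name `threshold_algorithm`, the dict-based representation of the score lists, and the assumption that all scores are non-negative (with missing scores treated as 0) are illustrative choices.

```python
import heapq


def threshold_algorithm(lists, k):
    """Return the top-k (total_score, object_id) pairs, highest first.

    `lists` is a non-empty list of dicts, one per attribute, each mapping
    an object id to a non-negative score (hypothetical representation).
    The aggregate score of an object is the sum of its per-list scores.
    """
    # Sorted access: view each list in descending score order.
    sorted_lists = [sorted(l.items(), key=lambda kv: -kv[1]) for l in lists]
    seen = set()
    top = []  # min-heap holding the best k (total, object) pairs so far
    max_depth = max(len(sl) for sl in sorted_lists)

    for depth in range(max_depth):
        last_scores = []
        for sl in sorted_lists:
            if depth >= len(sl):
                # List exhausted: any unseen object scores 0 here.
                last_scores.append(0)
                continue
            obj, score = sl[depth]
            last_scores.append(score)
            if obj in seen:
                continue
            seen.add(obj)
            # Random access: fetch this object's score from every list.
            total = sum(l.get(obj, 0) for l in lists)
            if len(top) < k:
                heapq.heappush(top, (total, obj))
            elif total > top[0][0]:
                heapq.heapreplace(top, (total, obj))

        # No unseen object can score higher than the sum of the scores
        # last read under sorted access, so this is a safe stopping bound.
        threshold = sum(last_scores)
        if len(top) == k and top[0][0] >= threshold:
            break

    return sorted(top, reverse=True)


if __name__ == "__main__":
    relevance = {"a": 9, "b": 8, "c": 1}
    popularity = {"a": 2, "b": 7, "c": 9}
    print(threshold_algorithm([relevance, popularity], k=2))
```

The sketch alternates one round of sorted accesses with immediate random accesses for each newly seen object. That fixed schedule is exactly the kind of traditional assumption that may need revisiting when access costs differ.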

In my work, I strive for balance between designing novel algorithms, analytically exploring their properties, and experimentally validating them.

Grapevine
What's on the Grapevine?

Grapevine is a system that conducts large-scale data analysis on the social media collective, extracting information in real time. The goal of Grapevine is to distill information and provide insights, by capturing popular trends as they emerge.

Grapevine facilitates the interactive exploration of content, allowing users to discover interesting or surprising stories, optionally narrowed down to a specific demographic of interest (e.g. "What are Torontonian teens talking about on blogs?"). Stories of interest can be explored in a variety of ways, such as modifying their scope (e.g. "How is Barack Obama related to this story?"), viewing related content (blog posts, news, videos, etc.), and examining their temporal evolution.

Grapevine, formerly live at www.onthegrapevine.ca, was developed in collaboration with fellow graduate student Nikos Sarkas. Supporting this functionality has led us to consider exciting research questions, such as:

  • How to identify high-impact stories, across all demographics, in real time
  • How to present a diverse set thereof to the user
  • How to understand and present the temporal evolution of a story
  • How to provide real-time trends to the user, exploiting named entities, and any hierarchical information they carry
  • How to facilitate the exploration of related stories, by recommending related ones, and explaining these recommendations
Publications

Chronological list of publications (see also DBLP)

Conferences

Dense Subgraph Maintenance under Streaming Edge Weight Updates for Real-time Story Identification [Paper | Tech. Report], in VLDB 2012
Albert Angel, Nick Koudas, Nikos Sarkas, Divesh Srivastava

Efficient Diversity-Aware Search [Paper | Tech. Report], in SIGMOD 2011
Albert Angel, Nick Koudas

Efficient Identification of Coupled Entities in Document Collections, in ICDE 2010
Nikos Sarkas, Albert Angel, Nick Koudas, Divesh Srivastava

Ranking Objects Based on Relationships and Fixed Associations [Paper | Tech. Report], in EDBT 2009
Albert Angel, Surajit Chaudhuri, Gautam Das, Nick Koudas

What's on the Grapevine? [Demo Paper], in SIGMOD 2009
Albert Angel, Nick Koudas, Nikos Sarkas, Divesh Srivastava

Qualitative Geocoding of Persistent Web Pages [Paper], in ACM GIS 2008
Albert Angel, Alexandros Efentakis, Chara Lontou, Dieter Pfoser

More coming soon... (currently under submission)

Theses

MSc Research Paper (in lieu of MSc Thesis), University of Toronto
Supervisor: Prof. Nick Koudas
Available upon request (currently under submission)

Geographic Information Extraction from Text [abstract | full text (in Greek)]
Diploma Thesis, National Technical University of Athens
Supervisor: Prof. Timos Sellis. Co-supervisor: Dieter Pfoser

More
In the context of my graduate studies, I have also taken a number of courses, and assisted in the teaching of others. I have strived to balance my research with my personal life, and have been fortunate to have a number of great colleagues.
 
The views expressed herein are solely those of the author, and do not necessarily reflect the views of the University of Toronto, or the Department of Computer Science thereof.
Last Updated: Apr 2012