NIPS 2002 Workshop

Unreal Data: Principles of Modeling Nonvectorial Data


A large amount of research in machine learning is concerned with classification and regression for real-valued data that can easily be embedded into a Euclidean vector space. This is in stark contrast with many real-world problems, where the data is often a highly structured combination of features (e.g., natural language and speech processing), a sequence of symbols (e.g., bioinformatics), or a mixture of different modalities, and may have missing variables. The items in non-vectorial data sets can be one-dimensional structures (e.g., sequences), two-dimensional (e.g., images), three-dimensional (e.g., molecular descriptions), trees (e.g., XML documents), or other hybrid and not-so-easily classified data structures.

To address the problem of learning from non-vectorial data, various methods have been proposed, such as embedding the structures in Hilbert spaces (e.g., via kernels), extracting and selecting features, proximity-based approaches, parameter constraints in graphical models, inductive logic programming, decision trees, and clever hand-crafted models.

Aims of this workshop: The goal is twofold. First, we hope to make the machine learning community aware of the problems arising in domains where non-vectorial data abounds and to uncover the pitfalls of mapping such data into vector spaces. Second, we will try to find a more uniform structure governing methods for dealing with non-vectorial data and to understand what principles, if any, underlie the modeling of such data.

Date and Location

The workshop will be held in Whistler, British Columbia (Canada) on Friday, December 13, 2002.

Schedule and List of Speakers

Morning Session

  • 07.30-08.15 Thore Graepel, "Getting Real with Unreal Data: Lessons Learned and the Way Ahead"

  • 08.15-08.55 Fernando Pereira, "Undirected graphical models for sequence analysis"

  • 08.55-09.10 Coffee break

  • 09.10-09.50 Koji Tsuda, "Marginalized Kernels for Biological Sequences"

  • 09.50-10.30 Mehryar Mohri, "Algorithmic Challenges for Speech Mining"

Afternoon Session

  • 16.00-16.40 Zoubin Ghahramani, "Graphical Models for Non-vectorial Data"

  • 16.40-17.20 Alan Yuille, "The Structure in Computer Vision Problems"

  • 17.20-17.35 Coffee break

Contributed Talks

  • 17.35-17.55 Erik Miller, "Practical Non-parametric Density Estimation on a Transformation Group for Vision"

  • 17.55-18.15 Thomas Gärtner, "Exponential and Geometric Kernels for Graphs"

  • 18.15-18.35 S.V.N. Vishwanathan, "Kernels on Automata"

  • 18.35-19.00 Paolo Frasconi, "Comparing convolutional kernels and recursive networks"

Getting in Touch

If you are interested in contributing or have comments, please send an e-mail to any one of the organizers, or send a fax to +61-2-612-58650 (Alex Smola or Gunnar Rätsch).