## Workshops
## SIGIR 2010: Feature Generation and Selection for Information Retrieval

Evgeniy Gabrilovich, Yahoo! Research; Alex Smola, Yahoo! Research; Tali Tishby, Hebrew University
Modern information retrieval systems facilitate information access at unprecedented scale and sophistication. However, in many cases the underlying representation of text remains quite simple, often limited to a weighted bag of words. Over the years, several approaches to automatic feature generation have been proposed (such as Latent Semantic Indexing, Hashing, or Latent Dirichlet Allocation), yet their application in large-scale systems remains the exception rather than the rule. On the other hand, numerous studies in NLP and IR resort to manually crafted features, a laborious and often computationally expensive process. Such studies usually focus on one specific problem, and many of the features they define are task- or domain-dependent, so little knowledge transfers to other problem domains. This limits our understanding of how to reliably construct informative features for new tasks.

## NIPS 2009: Large-Scale Machine Learning

Carlos Guestrin, CMU; Alex Gray, Georgia Tech; Alex Smola, Yahoo! Research; Arthur Gretton, CMU; Joseph Gonzalez, CMU
Physical and economic limitations have forced computer architecture towards parallelism and away from exponential frequency scaling. Meanwhile, increased access to ubiquitous sensing and the web has resulted in an explosion in the size of machine learning tasks. In order to benefit from current and future trends in processor technology, we must discover, understand, and exploit the available parallelism in machine learning. This workshop aims to achieve four key goals:

- Bring together people with varying approaches to parallelism in machine learning to identify the tools, techniques, and algorithmic ideas that have led to successful parallel learning.
- Invite researchers from related fields, including parallel algorithms, computer architecture, scientific computing, and distributed systems, who can offer new perspectives to the NIPS community on these problems and may also benefit from future collaborations with the NIPS audience.
- Identify the next key challenges and opportunities in parallel learning.
- Discuss large-scale applications, e.g., those with real-time demands, that might benefit from parallel learning.
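The first of these goals concerns data-parallel training. As a minimal sketch (not from the workshop itself), the snippet below fits a toy one-parameter least-squares model by data-parallel gradient descent: each worker computes the gradient on its own data shard, and the shard gradients are averaged before every update. Threads stand in for the parallel workers here, and all names and data are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def shard_gradient(w, shard):
    # Mean gradient of 0.5 * (w*x - y)^2 over one worker's data shard.
    return sum((w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_descent(shards, steps=200, lr=0.1):
    w = 0.0
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        for _ in range(steps):
            # Each worker handles its own shard; the gradients are averaged
            # before the single shared parameter is updated.
            grads = list(pool.map(lambda shard: shard_gradient(w, shard), shards))
            w -= lr * sum(grads) / len(grads)
    return w

# Data generated from y = 3x, split across two hypothetical workers.
shards = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
w_fit = data_parallel_descent(shards)
```

Averaging gradients across shards keeps each update equivalent to a batch step over the pooled data, which is what makes this the simplest starting point for the parallel-learning questions the workshop raises.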
## NIPS 2007: Representations and Inference on Probability Distributions

Kenji Fukumizu, Institute of Statistical Mathematics; Alex Smola, Yahoo! Research; Arthur Gretton, CMU
When dealing with distributions, it is in general infeasible to estimate them explicitly in high-dimensional settings, since the associated learning rates can be arbitrarily slow. On the other hand, a great variety of applications in machine learning and computer science require distribution estimation and inference.

## NIPS 2004: Graphical Models and Kernels

Alex Smola, Yahoo! Research; Ben Taskar, U Pennsylvania
Graphical models provide a natural way to model variables with structured conditional independence properties. They allow for understandable descriptions of models, making them a popular tool in practice. Kernel methods excel at modeling data that need not be structured at all, by using mappings into high-dimensional spaces (popularly known as the kernel trick). The popularity of kernel methods is primarily due to their strong theoretical foundations and the relatively simple convex optimization problems they give rise to. Recent progress towards a unification of the two areas includes work on Maximum Margin Markov Networks, structured output spaces, and kernelized Conditional Random Fields. Some work has also been done on using fundamental properties of the exponential family of probability distributions to establish links. The aim of this workshop is to bring researchers from both communities together in order to facilitate interactions. More specifically, the issues we want to address include (but are not limited to) the fundamental theory linking these fields. We want to investigate connections via exponential families, conditional random fields, Markov models, etc. We also wish to explore applications of the kernel trick to graphical models and study the optimization problems that arise out of such a marriage. Uniform convergence results for theoretically bounding the performance of such models will also be discussed.

## NIPS 2002: Unreal Data — Principles of Modeling Nonvectorial Data

Zoubin Ghahramani, Cambridge University; Gunnar Raetsch, Friedrich Miescher Laboratory; Alex Smola, Yahoo! Research
A large amount of research in machine learning is concerned with classification and regression for real-valued data that can easily be embedded into a Euclidean vector space. This is in stark contrast with many real-world problems, where the data is often a highly structured combination of features, a sequence of symbols, a mixture of different modalities, may have missing variables, etc. To address the problem of learning from non-vectorial data, various methods have been proposed, such as embedding the structures in metric spaces, feature extraction and selection, proximity-based approaches, parameter constraints in graphical models, Inductive Logic Programming, decision trees, etc. The goal of this workshop is twofold. Firstly, we hope to make the machine learning community aware of the problems arising from domains where non-vectorial data abounds and to uncover the pitfalls of mapping such data into vector spaces. Secondly, we will try to find a more uniform structure governing methods for dealing with non-vectorial data, and to understand what principles, if any, underlie the modeling of non-vectorial data.

## ICANN 1999: Gaussian Processes and Support Vector Machines

Carl Rasmussen, Cambridge University; Roderick Murray-Smith, University of Glasgow; Alex Smola, Yahoo! Research; Chris Williams, University of Edinburgh
This workshop aims to bring together people working with Gaussian Process (GP) and Support Vector Machine (SVM) predictors for regression and classification problems. We will open with tutorial-style introductions to the basics, so that researchers new to the area can gain an impression of the applicability of the approaches, and will follow with contributed presentations. The final part of the workshop will be an open discussion session. We will bring laptops to provide some software demos, and encourage others to do the same.

## EUROCOLT 1999: Kernel Methods

John Shawe-Taylor, University College London; Bernhard Schoelkopf, MPI Tuebingen; Alex Smola, Yahoo! Research; Bob Williamson, NICTA and ANU
We are hosting a one-day informal workshop on Sunday, 28th March, at Nordkirchen Castle, Germany, the day before the EuroCOLT’99 conference. A particular interest of the organisers is the analysis of kernels and regularization, and this will be one of the themes of the workshop. The aim is to provide a meeting venue for those who are attending both the Dagstuhl meeting on unsupervised learning, ending on the 26th, and the EuroCOLT conference, starting on the 29th. Those not attending the Dagstuhl meeting are of course very welcome to participate, too. If you wish to attend, consider arriving on the Saturday evening, when there will be a meeting to arrange the format of the day.

## NIPS 1998: Large Margin Classifiers

Peter Bartlett, UC Berkeley; Dale Schuurmans, U Alberta; Bernhard Schoelkopf, MPI Tuebingen; Alex Smola, Yahoo! Research
Many pattern classifiers are represented as thresholded real-valued functions, e.g., sigmoid neural networks, support vector machines, voting classifiers, and Bayesian schemes. There is currently a great deal of interest in algorithms that produce classifiers of this kind with large margins, where the margin is the amount by which the classifier's prediction is on the correct side of the threshold. Recent theoretical and experimental results show that many learning algorithms (such as back-propagation, SVM methods, AdaBoost, and bagging) frequently produce classifiers with large margins, and that this leads to better generalization performance. Hence there is good reason to believe that large margin classifiers will become a core method in the standard machine learning toolbox.

## NIPS 1997: Support Vector Machines

Leon Bottou, NEC Research; Chris Burges, Microsoft Research; Bernhard Schoelkopf, MPI Tuebingen; Alex Smola, Yahoo! Research
The Support Vector (SV) learning algorithm (Boser, Guyon, Vapnik, 1992; Cortes, Vapnik, 1995; Vapnik, 1995) provides a general method for solving pattern recognition, regression estimation, and operator inversion problems. The method is based on results in the theory of learning with finite sample sizes. The last few years have witnessed increasing interest in SV machines, due largely to excellent results in pattern recognition, regression estimation, and time series prediction experiments. The purpose of this workshop is (1) to provide an overview of recent developments in SV machines, ranging from theoretical results to applications, (2) to explore connections with other methods, and (3) to identify strengths, weaknesses, and directions for future research for SVMs. We invite contributions on SV machines and related approaches, with empirical support wherever possible.
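To make the hinge-loss view behind SV classification concrete, here is a minimal sketch (not part of the original announcement) of a linear SVM trained by stochastic subgradient descent with a decaying step size, in the style of the later Pegasos algorithm. The toy data set, the omission of a bias term, and all parameter choices are illustrative.

```python
import random

def train_linear_svm(data, lam=0.01, epochs=200, seed=0):
    """Minimise lam/2 * ||w||^2 + average hinge loss over (x, y) pairs
    with y in {-1, +1}, by stochastic subgradient descent."""
    w = [0.0, 0.0]
    rng = random.Random(seed)
    t = 0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            t += 1
            eta = 1.0 / (lam * t)            # decaying step size
            violated = y * (w[0] * x[0] + w[1] * x[1]) < 1
            w = [(1.0 - eta * lam) * wi for wi in w]   # shrink: regulariser
            if violated:                     # point inside the margin
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

# A toy, linearly separable data set (hypothetical, for illustration only).
data = [([2.0, 2.0], 1), ([3.0, 1.0], 1), ([-2.0, -1.0], -1), ([-1.0, -3.0], -1)]
w = train_linear_svm(data)
separates = all(y * (w[0] * x[0] + w[1] * x[1]) > 0 for x, y in data)
```

The two-part update mirrors the objective: every step shrinks `w` towards zero for the regulariser, and only margin violations pull `w` towards the offending example, which is why the final solution depends only on points at or inside the margin, i.e., the support vectors.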