ElefantElefant (Efficient Learning, Large-scale Inference, and Optimization Toolkit) is a Python open source library for machine learning licensed under the Mozilla Public License. The aim is to develop an open source machine learning platform which will become the platform of choice for prototyping and deploying machine learning algorithms.
This toolkit is the common platform for software development in the machine learning team in NICTA. Not all the tools are currently released but many can be found in the developers version with SVN access.
Bundle Methods Solver
BMRM is an open source, modular and scalable convex solver for many machine learning problems cast in the form of regularized risk minimization problem. It is modular because the (problem-specific) loss function module is decoupled from the (regularization-specific) optimization module (e.g. quadratic programming or linear programming solvers), thus shorten the time to implement/prototype solutions to new problems. Besides, the decoupling leads to easier parallelization of the loss function computation. At the moment it solves the following problems:
- Binary classification
Soft-margin, Squared Soft-margin, Huber-hinge, Logistic regression, Exponential, ROC Score, F-beta Score
- Univariate regression
epsilon-insensitive, Huber robust, Least Mean Squares, Least Absolute Deviation
- Novelty detection (1-class SVM)
- Quantile regression
- Poisson regression
NDCG (normalized discounded cummulative gain), Ordinal regression
Collaborative FilteringCofirank is our collaborative filtering solver. It is built on top of the bundle method solver. The goal is to predict preferences of users based on past ratings by them and other users. We build upon the approach of Maximum Margin Matrix Factorization, yet extend it in several ways:
- Cofi can make use of state of the art optimization technology, making it feasible to run on the largest data sets available. Cofi can be run on a single machine with modderate memory requirements (2GB) to train on the Netflix dataset with its 100.000.000 entries.
- Cofi is able to do structured predition, e.g. by predicting the relative order with which you like movies instead of the absolute rating you would give them. This allows for models that are better suited to predict what you like than dislike. This property is important for recommender systems.
- Cofi can be parallelized easily to take advantage of multi core machines or clusters of workstations.
- In addition, Cofi has some desirable properties which stem from MMMF: it does not need explicit features of the items or users. However, it can use them whenever available.
SVLabThis is an old Matlab toolbox that I wrote mainly for my own purpose. Some early versions have been released and the code is now completely unmaintained. It may not even run on a recent version of MATLAB any more. I abandoned the work after the Mathworks increased the price for a license by a factor of 10 because they decided to classify NICTA as an industrial research lab.
Please do not ask me for help with bugfixes or instructions. The code is provided as is. It is released under the GNU Public License.