Projects
Introduction to Machine Learning - 10-701/15-781
Basics
The project nets you one third of the course credits. On top of that,
it is a good opportunity to find out whether you like machine
learning. Since this is research, there's no guarantee that the
project will actually succeed. What matters most is that you try
solving the problem using a scientific approach.
The maximum team size is 4, and a typical team should have 3
members. Teams of two are OK, but it is in your interest to join a larger team
since you will be able to accomplish more.
The project is comparable to an academic paper that you might send
to a journal. That is, it isn't sufficient to just run a few MATLAB
scripts and plot a bunch of figures.
In terms of topics, I am happy with implementations that build
systems processing considerable amounts of data, with novel algorithms,
and with proofs and new theory. It is preferable if your designs pass the
scalability test. In other words, it's perfectly OK to prove theorems
if they relate to a new algorithm which is scalable over many
machines. Obviously, a mix of all three things is
best. A perfect score would be work at a level that can get into a
tier 1 or 2 conference.
Use the office hours to get feedback and advice on your project. It is
in your interest to do this. We will give you suggestions on how to
solve problems, feedback on whether a project will succeed, and help
you with ideas.
Proposal
For the project proposal each team needs to prepare a one-page
report stating the problem, who will participate, and what the
expected outcomes will be. It is due on February 11, 2013.
Midterm evaluation
For the midterm project evaluation each team needs to prepare a
one-page report stating the problem and what has been achieved so
far. It is due on March 6, 2013, i.e. two days after the midterm.
Report
A written report using the
ACM Style Guide. At
least 4 pages of two-column documentation and no more than 10
pages.
The report should include pointers to code, data, etc. such that the work is
sufficiently reproducible.
You need an abstract, introduction, a discussion of related work, a
description of the main idea, a description of the data,
experiments, and a summary.
Symbols must be defined before being used (or, at least, within
the same paragraph if it is inevitable). The human mind works
like a compiler in this case. Compiler errors are bad for your grade. Just because it looks pretty doesn't mean it is pretty.
You need to be precise in the main body of the paper. The
introduction can be used to provide the intuition.
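As an illustration of the rule on defining symbols before use, a paragraph in the main body might be written along the following lines. The ridge regression objective and all symbol names here are assumptions chosen only for the example; they are not a required topic or notation.

```latex
% Every symbol is introduced before the equation that uses it.
Let $X \in \mathbb{R}^{n \times d}$ denote the design matrix of $n$
observations with $d$ features, let $y \in \mathbb{R}^{n}$ denote the
vector of targets, and let $\lambda > 0$ be a regularization constant.
We estimate the weight vector $w \in \mathbb{R}^{d}$ by solving
\begin{equation}
  \min_{w} \; \frac{1}{2} \| X w - y \|_2^2
            + \frac{\lambda}{2} \| w \|_2^2 .
\end{equation}
```

A reader can parse the equation in one pass because $X$, $y$, $\lambda$, and $w$ are all defined in the sentence that precedes it.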
Posters
A poster (between A1 and A0 size) for the poster presentation. See
e.g. here for a style file. We will
make a more concise style file available in due course.
If your project is very good, you will get the opportunity to
present it in class. Only the six best posters will get this
chance. There will be spotlights for other posters. Note that just
like in a conference, there is a correlation between your score
and the amount of exposure, but not a direct mapping.
The Stanford ML class with Andrew Ng did a great job. Let's match this!
Heilmeier's criteria
You should be able to address
Heilmeier's criteria,
as adapted for the purpose of this class. This type of reasoning will
help you with choosing your own research agenda, writing grants,
convincing colleagues, securing VC funding, and writing papers. So
it's good practice.
What are you trying to do? Articulate your objectives using absolutely no jargon.
How is it done today, and what are the limits of current practice?
What's new in your approach and why do you think it will be successful?
Who cares? If you're successful, what difference will it make?
What are the risks and the payoffs?
How long will it take and what have you achieved so far?
How will you determine success?