Hi, all! We thought we'd put all our resources together in a central spot. I'll organize the materials by day.


Day One: Ideas of classification, and using Excel to work with data

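There were some questions about why we divide by n-1 in the sample variance but by n in the population variance. This is called Bessel's correction. The short version: in a sample we measure deviations from the sample mean, which sits a little closer to the data than the true mean does, so dividing by n would underestimate the spread on average. The Wikipedia article is really interesting, and I'll look for better explanations too!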
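If you'd like to see the two formulas side by side, here is a small sketch in Python. The numbers are made up for illustration (they are not one of our camp datasets), and the check at the end uses Python's built-in `statistics` module, where `pvariance` divides by n and `variance` divides by n-1.

```python
import math
import statistics

# Made-up sample, just for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean: the shared numerator.
squared_deviations = sum((x - mean) ** 2 for x in data)

# Population variance divides by n.
population_variance = squared_deviations / n

# Sample variance divides by n - 1 (Bessel's correction).
sample_variance = squared_deviations / (n - 1)

# The standard library agrees with the by-hand versions.
assert math.isclose(statistics.pvariance(data), population_variance)
assert math.isclose(statistics.variance(data), sample_variance)

print("population variance:", population_variance)
print("sample variance:", sample_variance)
```

The sample variance always comes out a bit larger, and the gap shrinks as n grows.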

Day Two: Algorithms and intro to Python programming using Jupyter

First, the algorithms. We are focusing on classification, so support vector machines (the easy version) and decision trees are our primary algorithms. If there is time, we can talk about nearest neighbors or naive Bayes. Let's work out the math by hand and with Excel to get really familiar with these ideas.
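To give a flavor of the by-hand decision tree math, here is a rough sketch of how one candidate split could be scored. It assumes Gini impurity as the splitting measure, which is one common choice and not necessarily the one on your worksheet. The tiny dataset and the function names `gini_impurity` and `split_impurity` are made up for this example.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def split_impurity(rows, threshold):
    """Weighted Gini impurity after splitting (value, label) rows at a threshold."""
    left = [label for value, label in rows if value <= threshold]
    right = [label for value, label in rows if value > threshold]
    n = len(rows)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


# Made-up data: one numeric feature and a class label per row.
rows = [(1.0, "a"), (2.0, "a"), (3.0, "b"), (4.0, "b")]

# Try each candidate threshold; a lower score means a cleaner split.
for threshold in (1.0, 2.0, 3.0):
    print(threshold, split_impurity(rows, threshold))
```

A tree-building algorithm repeats this scoring for every feature and threshold, keeps the best split, and then does the same thing again inside each branch.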

Excel is kind of painful here, isn't it? There's got to be a better way! Python to the rescue!

End-of-day evaluation for camp

Day Three: Make Python do all your math for you!

If you really want to do machine learning well, you need to understand the algorithms and their strengths and weaknesses. The math under the hood is really important, and that's what we spent Monday and Tuesday on. But if you want to do machine learning at all, you have to leverage the strengths of modern computing -- that's why it's "machine" learning! We are using scikit-learn here to do all kinds of amazing things, and with this intro you should be able to go to the scikit-learn documentation and do SO MUCH MORE than we could ever cover in a one-week class.
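Here is a minimal sketch of what that looks like for our two main algorithms. It is not one of the camp notebooks: it assumes the iris dataset that ships with scikit-learn, a linear kernel for the "easy version" of the support vector machine, and an arbitrary 70/30 train/test split.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Load a small built-in dataset (an assumption; swap in your own data).
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows so we can test on data the model never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Decision tree classifier.
tree = DecisionTreeClassifier(random_state=0)
tree.fit(X_train, y_train)
tree_accuracy = tree.score(X_test, y_test)

# Support vector machine with a linear kernel.
svm = SVC(kernel="linear")
svm.fit(X_train, y_train)
svm_accuracy = svm.score(X_test, y_test)

print("decision tree accuracy:", tree_accuracy)
print("linear SVM accuracy:", svm_accuracy)
```

Notice that both models use the same `fit` and `score` calls. Most scikit-learn classifiers follow this pattern, so trying a different algorithm is usually a one-line change.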

Below, we have links to all the notebooks on GitHub. However, in the Windows lab, Google Drive is easier to use, so use this great link for today!

Wednesday evaluation

Day Four

Here are project instructions. After you look at that, fill out your preferences on this Google form.

Most of the datasets are available via our Google Drive as well. Check there. For economic data, you will have to decide on some variables and gather your own -- I've got unemployment in the Drive. Also, the "Labeled Faces in the Wild" set is really big, so find it here.

If you want a template for starting your Python code, use "Template notebook -- rename me" from the Google Drive!

Day Four evaluation

Day Five

Drop your presentation here!

Evaluation here.

Links for fun and discussion