Introduction to Machine Learning
What is machine learning, the machine learning process, the machine learning landscape, machine learning in the real world.
K-nearest neighbours for classification, binary and categorical predictors, k-nearest neighbours for regression, distance functions, how should we choose k.
The Fundamental Limits of Machine Learning
Is learning feasible at all, a probabilistic setting, interpreting the bound, when is machine learning feasible.
Motivation, Bayes’ theorem, exact Bayes classifiers, naïve Bayes classifiers, the Laplace estimator.
Evaluating Predictive Performance (I)
Which fit is “right”, test set, validation set, the “training set – validation set – test set” approach.
Classification and Regression Trees
Classification trees, choosing the best split: Part 1, choosing the best split: Part 2, pruning a classification tree, regression trees, bagging, random forests and boosting algorithms.
Evaluating Predictive Performance (II)
Performance measures for regression, performance measures for classification problems (use confusion matrix), lift charts for classification problems, lift charts for regression problems.
Motivation, hierarchical clustering, hierarchical clustering is myopic, k-means clustering, practical concerns of cluster analysis.
Evaluating Predictive Performance (III)
Oversampling, k-fold cross-validation.