MTH6101 - Introduction to Machine Learning - 2023/24
Topic outline
The full lecture notes and the lab material for the module are given here. The material is broken down by week. We may make *minor* amendments to the material after each week.
-
This file contains all the handwritten notes from the lectures and is updated every week. The current version contains all the material up to Week 12, i.e. it is complete.
-
This R file contains all the coding demonstrations performed during the lectures and is updated every week. The current version contains all the material seen in the module lectures, i.e. it is complete.
-
There are numerous opportunities for you to gain feedback on your progress:
- Questions during or after the lessons
- Office hours
- Learning Cafe
- Lab sessions
-
If you are finding any aspect of the module challenging, or have any other difficulty, we are here to help! But note that we cannot help unless we know there is a problem, so make sure you contact Kostas or Hugo.
-
Can machines do what we can do? Machine Learning is a rapidly growing subfield of Artificial Intelligence, at the boundary between Statistics and Computer Science, focusing on teaching machines how to learn by developing their own algorithms without the need for human supervision. The module has theoretical and practical aspects. By the end of it you will have a good understanding of the theoretical basis for machine learning and have gained some hands-on experience in the lab. You will have worked with algorithms such as decision tree learning and classification methods, and statistical methods for analysing big datasets.
-
0 Preliminary topics for Machine Learning
1 Unsupervised learning: Principal Components Analysis
2 Unsupervised learning: Cluster analysis
3 Supervised learning: Classification
4 Supervised learning: Regression and regularization
-
This handwritten assessment is available for a period of 3 hours and 30 minutes, within which you must submit your solutions. You may log out and in again during that time, but the countdown timer will not stop. If your attempt is still in progress at the end of your 3 and a half hours, any file you have uploaded will be automatically submitted.
The assessment is intended to be completed within 3 hours. Please note that the additional 30 minutes are for scanning and submitting your answers. Please ensure that you complete the assessment within 3 hours to avoid any technical issues that may occur if you submit close to the deadline.
In completing this assessment:
• You may use books and notes.
• You may use calculators and computers, but you must show your working for any calculations you do.
• You may use the Internet as a resource, but not to ask for the solution to an exam question or to copy any solution you find.
• You must not seek or obtain help from anyone else.
-
Welcome to week 5!
This week we will be focusing on Agglomerative Clustering, one of the most basic and useful methods to cluster data based on their degree of similarity.
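As a rough illustration of the idea, here is a minimal sketch of agglomerative clustering in base R. The toy data, the choice of complete linkage and the number of clusters are assumptions made for this example, not taken from the lecture notes.

```r
# Toy data (made up for illustration): two well-separated groups in the plane
set.seed(1)
X <- rbind(matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(20, mean = 3, sd = 0.3), ncol = 2))

D  <- dist(X)                          # pairwise Euclidean distances (dissimilarities)
hc <- hclust(D, method = "complete")   # merge clusters bottom-up, complete linkage
groups <- cutree(hc, k = 2)            # cut the dendrogram into 2 clusters

table(groups)                          # cluster sizes
# plot(hc) would draw the dendrogram
```

Other linkages (for example "single" or "average") can be tried by changing the method argument of hclust.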
In addition, our School is promoting the NSS in our Thursday lectures, and they will show up with tasty donuts! Do not miss your chance to have your say about your education.
-
Welcome to week 6!
This week we will examine the clustering method of k-means in more detail. We will also give some general guidelines for the first mid-term assessment.
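For orientation, a minimal k-means sketch in base R follows. The toy data, the choice of k = 2 and the number of random restarts are assumptions for this example only.

```r
# Toy data (made up for illustration): two well-separated groups in the plane
set.seed(1)
X <- rbind(matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(20, mean = 3, sd = 0.3), ncol = 2))

# k-means with k = 2; nstart = 10 reruns the algorithm from 10 random
# starting configurations and keeps the best solution found
km <- kmeans(X, centers = 2, nstart = 10)

km$cluster        # cluster label assigned to each observation
km$centers        # estimated cluster centroids
km$tot.withinss   # total within-cluster sum of squares (the quantity k-means minimizes)
```

Because k-means depends on its random initialization, using several restarts is a common way to reduce the risk of a poor local minimum.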
-
- The mid-term online quiz will take place on Friday of Week 7 at the same time as the labs (i.e. no labs in Week 7!):
Friday March 8, 2024, 16:00-18:00. We will soon make an announcement regarding where you will find the quiz on the module's QMplus page.
- We will have a question session on Wednesday March 6, 15:00-17:00 via this Teams channel (password: akzwxmy).
-
Information about the date, the format, the available help and the essential tasks you are expected to do for the first mid-term assessment.
-
Here are some sample questions that will help you during your revision for the first mid-term quiz.
Update March 5: We have now added an additional clustering question to the sample questions.
Update March 6: The file now contains the solutions.
-
This quiz assessment is available for a period of 2 hours. Upon accessing the assessment, you will have until the end of the quiz (Friday 08/03/24 at 18:00, local London time) in which to complete and submit it. You may log out and in again during that time, but the countdown timer will not stop. If your attempt is still in progress at the end of the 2-hour period, any answers you have filled in will be automatically submitted.
In completing this assessment:
• You may use books and notes.
• You may use calculators and computers, but you must show your working for any calculations you do.
• You may use the Internet as a resource, but not to ask for the solution to an exam question or to copy any solution you find.
• You must not seek or obtain help from anyone else.
-
This week we look at classification, putting emphasis on performance measures to compare classifiers. In particular, we look at the confusion matrix, the ROC graph and the ROC curve.
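To make these performance measures concrete, here is a small base R sketch that builds a confusion matrix at one threshold and then traces ROC points by varying the threshold. The labels, scores and the 0.5 threshold are made up for illustration, and the helper tpr_fpr is a hypothetical name, not a function from the lecture code.

```r
# Made-up true labels (1 = positive, 0 = negative) and classifier scores
y     <- c(1, 1, 1, 1, 0, 1, 0, 0, 0, 0)
score <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.15, 0.1)

# Confusion matrix at threshold 0.5: rows are true classes, columns are predictions
pred <- as.integer(score >= 0.5)
cm <- table(truth     = factor(y,    levels = c(1, 0)),
            predicted = factor(pred, levels = c(1, 0)))
cm

# False positive rate and true positive rate at a given threshold
# (hypothetical helper, defined here only for this sketch)
tpr_fpr <- function(t) {
  p <- score >= t
  c(fpr = sum(p & y == 0) / sum(y == 0),
    tpr = sum(p & y == 1) / sum(y == 1))
}

# One (FPR, TPR) point per threshold; joining them gives an empirical ROC curve
thresholds <- c(Inf, sort(unique(score), decreasing = TRUE))
roc <- t(sapply(thresholds, tpr_fpr))
roc
# plot(roc[, "fpr"], roc[, "tpr"], type = "s") would draw the curve
```

A single classifier with a fixed threshold corresponds to one point in the (FPR, TPR) plane, whereas sweeping the threshold produces the whole curve.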
-
In this week's lectures we survey classifiers, covering the linear model, logistic regression and the k-nearest neighbor classifier, and we start looking at the classification tree.
In the lab, we will analyze the glass data set. Make sure you download the data set (link below) and make one specific change to the code given in the booklet. Use the line
DAT<-data.frame(X,factor(Y)); colnames(DAT)[11]<-"Y"
instead of the line DAT<-data.frame(X,Y) in the booklet. The reason for this is that the tree classifier requires the response to be defined as a factor.
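The effect of this change can be checked on stand-in data. In the sketch below, X and Y are hypothetical placeholders (a 10-column numeric matrix and an integer class label), not the glass data; it only illustrates why the rename of column 11 accompanies the call to factor().

```r
# Hypothetical stand-ins for the lab objects (NOT the glass data):
# X has 10 columns, so the response becomes column 11 of the data frame
set.seed(1)
X <- matrix(rnorm(60), ncol = 10)
Y <- c(1, 2, 3, 1, 2, 3)

DAT_num <- data.frame(X, Y)       # booklet version: Y is stored as a number
is.factor(DAT_num$Y)              # check the type of the response

DAT <- data.frame(X, factor(Y))   # corrected version: response stored as a factor
colnames(DAT)[11]                 # R auto-generates a name for this column ...
colnames(DAT)[11] <- "Y"          # ... so it is renamed back to "Y"
is.factor(DAT$Y)
levels(DAT$Y)                     # the class labels seen by the classifier
```

With a numeric response, tree-fitting functions typically treat the problem as regression rather than classification, which is why the factor conversion matters.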
-
Our survey of classifiers comes to a close with the second part of trees and the linear discriminant classifier. We then start the last topic of this module: penalized regression. Note, though, that all these classifiers were already seen in the lab of Week 9.
Note that there will be no labs on Friday of Week 10 because of the Easter bank holiday.
-
As promised, here is the shiny app. Just run the app in RStudio and play with it to see how the ridge estimates evolve as a function of lambda. You need not edit the code, but you may do so if you wish. The code is not particularly complex: it is just the ridge functions placed inside a wrapper, as required by shiny.
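Outside shiny, the same behaviour can be seen with a few lines of base R. This is a sketch under stated assumptions, not the code of the app: the data are simulated, the predictors are standardized and the response centered so that no intercept is needed, and ridge_coef is a hypothetical helper using the closed-form ridge solution (X'X + lambda I)^(-1) X'y.

```r
# Simulated data (made up for illustration)
set.seed(1)
n <- 50; p <- 3
X <- scale(matrix(rnorm(n * p), n, p))      # standardized predictors
y <- drop(X %*% c(2, -1, 0.5) + rnorm(n))
y <- y - mean(y)                            # centered response, so no intercept

# Closed-form ridge estimate for a given lambda (hypothetical helper)
ridge_coef <- function(lambda) {
  drop(solve(crossprod(X) + lambda * diag(p), crossprod(X, y)))
}

lambdas <- c(0, 1, 10, 100, 1000)
path <- sapply(lambdas, ridge_coef)         # one column of coefficients per lambda
colnames(path) <- lambdas
round(path, 3)

# The overall size of the coefficient vector shrinks as lambda grows
norms <- sqrt(colSums(path^2))
norms
```

At lambda = 0 the estimate coincides with ordinary least squares, and the coefficients are pulled towards zero as lambda increases.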
-
We continue with ridge regression and then turn our attention to the lasso and, finally, elastic nets.
Note that the lab for this week contains analyses and comparisons for all three models: ridge, lasso and elastic net. This lab is already available in the booklet.
-
- The second mid-term online quiz will take place on Friday of Week 12 at the same time as the labs (i.e. no labs in Week 12!):
Friday April 12, 2024, 16:00-18:00. We will soon make an announcement regarding where you will find the quiz on the module's QMplus page.
Concerning module material, this week we continue our study of regularized regression. We finalize the lasso and then survey elastic nets, both for regression and for likelihood-based models. This last week brings the module to a close.
-
This quiz assessment is available for a period of 2 hours. Upon accessing the assessment, you will have until the end of the quiz (Friday 12/04/24 at 18:00, local London time) in which to complete and submit it. You may log out and in again during that time, but the countdown timer will not stop. If your attempt is still in progress at the end of the 2-hour period, any answers you have filled in will be automatically submitted.
In completing this assessment:
• You may use books and notes.
• You may use calculators and computers, but you must show your working for any calculations you do.
• You may use the Internet as a resource, but not to ask for the solution to an exam question or to copy any solution you find.
• You must not seek or obtain help from anyone else.
-
You asked for it ... here is a sample for the midterm test. We'll make solutions available *no earlier* than Wednesday morning/noon.
Update (Wednesday 10 April, afternoon): the file now includes the solutions as well.
-
Here is the material used for the revision session in the last lecture. Note, as I said in the revision lecture, that this is a compilation of the weekly summary boxes of the booklet.
-
The results of the test are now available; here is a report of the results.
-
This file contains the blank paper and the solutions for both the 2020 exam and the 2020 sample exam.