MTH5120 - Statistical Modelling I - 2023/24
Topic outline
v3 (May 2024): p. 35 updated so that the MLE for the normal distribution uses betas instead of mu
Welcome to Statistical Modelling 1
This week there are two on-campus lectures, both in the People's Palace Great Hall: Monday at 2pm and Thursday at 11am. In the first lecture we will explain in full how the module teaching will run. This will be a bit different from the 3 lectures + 1 tutorial you have been used to in other modules. The course is delivered through 3 types of teaching, listed below, and you will need all of them. It is very important to keep up with the material as we go along, as each week builds upon the previous ones.
1. the two hours of in-person lectures
2. watching a number of short videos each week - these will be clearly signposted on QM Plus. Each video will be up to 15 minutes long. Together the videos cover the mathematics and methods of statistical modelling taught in this course. It is very important that you watch these videos as well as come to timetabled activities. This is a new way of sharing the material which should both give you all you need in an accessible, short video format and provide the foundation for a more interactive style of lecture and IT lab.
3. [from week 2] a one-hour IT lab focused on modelling in R
This week we will introduce statistical modelling and cover the Simple Linear Regression Model, which we will spend the first four weeks of the module constructing and analysing.
Typed notes to accompany the week 1 lecture material. At the end of the module we will add a single PDF document with full notes for the module.
principles of statistical modelling 1 of 2
principles of statistical modelling 2 of 2
simple linear regression model 1 of 2
simple linear regression model 2 of 2
least squares estimation 1 of 2
least squares estimation 2 of 2
Lecture Example data for use in R
assessing the model plus details of the task you will need to complete this week
Please use this submission point to upload a CSV or Excel file with the data you have found yourself and would like to use for simple linear regression modelling. You will need this data for the first assessed coursework in week 4. There are no marks for a "good" data set or topic, so please choose something you are interested in that fits a response variable (y) and explanatory variable (x) format, with 10 to 50 pairs of observations, and where x is not years.
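As a rough check before uploading, the format described above can be sketched in R. This is only an illustration under stated assumptions: it uses R's built-in cars data (pairs of speed and stopping distance) as a stand-in for your own file, and the names path, dat and n_pairs are made up for the example.

```r
# Stand-in for your own file: write R's built-in 'cars' data to a temporary CSV.
# (Assumption: with real data you would skip this step and point read.csv()
# at the CSV file you plan to upload.)
path <- tempfile(fileext = ".csv")
write.csv(cars, path, row.names = FALSE)

# Read the CSV back in, as you would with your own data
dat <- read.csv(path)

# Check there are between 10 and 50 (x, y) pairs; stops with an error if not
n_pairs <- nrow(dat)
stopifnot(n_pairs >= 10, n_pairs <= 50)

# Inspect the structure: one explanatory column (x) and one response column (y)
str(dat)
```

With your own data, drop the first two commands and pass the path of your CSV file to read.csv() instead.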
Once you have added a CSV or Excel file to the submission point above, please answer the 4 short questions linked here. These will not be graded, but you will need to answer them before you complete the first assessed coursework, which counts for 15% of the module mark and will be set in week 4.
Our World in Data’s mission is to publish the “research and data to make progress against the world’s largest problems”
properties of the estimators 1 of 2
properties of the parameters 2 of 2
Assessing the model, fitted values and residuals, sums of squares
introducing ANOVA
using the ANOVA table
working with residuals
inference, confidence interval for beta1
inference about the parameters continued
confidence intervals and prediction intervals
A mathematical model is an imitation of a real-world system or process. Models are used in almost every field of business, economics, science and industry where quantitative data are collected, the most fundamental type being linear models. Despite their simplicity, these models are very useful and also form the basis for more advanced statistical techniques covered in Level 6 modules. This module is concerned with both the theory and applications of linear models and covers problems of estimation, inference and interpretation. Graphical methods for model checking will be discussed and various model selection techniques introduced. You will also gain hands-on experience of developing and testing models with the R statistical package in the computer practical sessions.
further model checks based on issues with the observation data
This coursework, to be completed in R programming, counts for 15% of your total module mark. Below you will find the coursework question and instructions (released at the end of week 4) and a submission point where you should upload your answer in a MS Word document. You will need the data set you found in week 2 to complete this coursework. The deadline for submission is 5pm UK time on Thursday 22 February (week 5). Late submissions will not be accepted.
You should submit your answer to the question in a single MS Word document that contains your R code and output copied from RStudio as well as your own typed answers to the question above. The deadline is 5pm UK time on 22 February, and late submissions or email submissions will not be accepted. You must only submit your own work using your own analysis in R.
poorly fitting model diagnostics - pure error and lack of fit, expanded ANOVA
jankaNEW.csv (same file as that used in week 4 copied here for convenience)
training data size and "knowledge" test results for different AI systems up to chatGPT4
influential observations
transforming the response variable
pure error and lack of fit
why do we want to use matrix approaches to regression modelling?
matrix approaches 1 of 2
matrix approaches 2 of 2
maximum likelihood estimation introduction
MLE in the normal distribution
This week there is just one timetabled class - a lecture on Tuesday 5th March at 9:00am in the People's Palace Great Hall. This is because the university will be closed on Monday of week 11 for a bank holiday.
for the IT labs in week 8
Note that this week the Thursday labs are moved to Monday and Friday by central timetabling. Please check your timetable.
introduction to multiple linear regression models and model building using F tests
event this week organised by the QM Careers team; should be helpful for anyone considering a career as a maths teacher
multiple linear regression models; matrix form
multiple regression ANOVA, overall F test
inference about betas; confidence intervals
This coursework, which counts for 15% of the module mark, will be released below at the end of week 8 and requires an understanding of how to analyse linear regression models using R.
This handwritten assessment is available for a period of 3 hours from the time you start the quiz, within which you must submit your solutions. You may log out and in again during that time, but the countdown timer will not stop. If your attempt is still in progress at the end of your 3 hours, any file you have uploaded will be automatically submitted. All submissions must be completed by the deadline for this assessment which is 5pm UK time on Wednesday 20th March (week 9). Late submissions and submissions by email will not be accepted. Please ensure that you leave sufficient time to upload your scanned pdf file to QM Plus within the time allowed.
This assessment is intended to be completed within 1 hour.
slides used by Hugo Maruri-Aguilar in the Thursday lecture, including a link to the QM Plus page with more details on module selection for the third year
The "Swiss" data set needed for this exercise is built into R (there is no separate CSV file). To load the data use the command data(swiss), then check it has loaded with the command head(swiss), which displays the first 6 rows. The data is then ready for the rest of the exercise sheet.
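The two commands named above can be run as follows. This is a minimal sketch; the last two calls are optional extra checks added for illustration and are not required by the exercise sheet.

```r
# Load the built-in swiss data set into the workspace
data(swiss)

# Display the first 6 rows to confirm the data is loaded
head(swiss)

# Optional extra checks (not part of the instructions above)
dim(swiss)    # number of rows and columns
names(swiss)  # variable names
```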
problems fitting models, variance inflation factor plus a re-cap of automated methods of model building
Bridge data for the week 11 and 12 exercise sheets, saved as a .csv file (easier to import into R)
revision lecture 1 of 2
exam information and revision on topics requested last week
In this section you will find the exam papers and solutions from 2023 (both May and LSR). These two papers are a good guide to the type and level of questions that will be asked in the 2024 exams. The syllabus and lecturers are the same as last year and the 2023 exams were on campus (2020 - 22 were online). Whilst in 2023 students were permitted 3 pages of notes and this will not be the case in 2024, the 2023 papers were deliberately written in such a way that notes were not needed to answer questions (assuming of course that the course material had been revised).
This link will take you to all the past papers for this module. However, the 2023 papers above are a better guide for the 2024 exam, as earlier years were either online or set before the module used R programming.