PREDICTIVE ANALYTICS
- Academic year
- 2019/2020
- Official course title
- ANALISI PREDITTIVA
- Course code
- CT0429 (AF:248866 AR:136544)
- Modality
- On campus classes
- ECTS credits
- 6
- Degree level
- Bachelor's Degree Programme
- Educational sector code
- SECS-S/01
- Period
- 1st Semester
- Course year
- 3
- Where
- VENEZIA
Contribution of the course to the overall degree programme goals
This course covers the main concepts of linear models and generalized linear models (together with their shrinkage versions) and, in less depth, the model-free approach based on nonparametric regression. The focus is on giving the main insights into the statistical and mathematical foundations of the models and on showing how the methods are implemented effectively in statistical software. This is achieved through a mixture of theory and reproducible code. Real data examples and case studies are also introduced.
Expected learning outcomes
Identify the most appropriate data analysis techniques for each problem and know how to apply the techniques for the analysis, design and solution of the problems.
Apply data processing techniques to real data of (possibly) large size.
Be able to generate new ideas (creativity) and anticipate new situations in the context of data analysis and decision making.
* Specific competences
Apply advanced linear algebra knowledge to methods for analysing data.
Apply knowledge of programming and databases as a foundation for learning technologies and advanced methods for the treatment of data.
Use classic results of inference and regression as a basis for advanced methods of prediction and classification.
Identify and select the appropriate software tools for the treatment of data.
Correctly identify the type of statistical problem corresponding to certain objectives and data, as well as the most appropriate methodologies to apply to the given objectives and data.
Know how to design specific data processing systems for a type of statistical problem (classification, estimation, prediction, etc.).
Prerequisites
Calculus 1
Calculus 2
Algebra
Probability and Statistics
Data Analysis
Passing the examinations for these courses is not a formal requirement.
Contents
1. Introduction
1.1 Course overview
1.2 What is predictive modeling?
1.3 General notation and background
2. Linear models I: multiple linear model
2.1 Model formulation and least squares
2.2 Assumptions of the model
2.3 Inference for model parameters
2.4 Prediction
2.5 ANOVA
2.6 Model fit
3. Linear models II: model selection, extensions, and diagnostics
3.1 Model selection
3.2 Use of qualitative predictors
3.3 Nonlinear relationships
3.4 Model diagnostics
3.5 Dimension reduction techniques
4. Linear models III: shrinkage and big data
4.1 Shrinkage
4.2 Big data considerations
5. Generalized linear models
5.1 Model formulation and estimation
5.2 Inference for model parameters
5.3 Prediction
5.4 Deviance
5.5 Model selection
5.6 Model diagnostics
5.7 Shrinkage
6. Nonparametric methods
6.1 Density estimation
6.2 Regression estimation
Reference texts
Faraway, Julian J. 2016. Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models. 2nd ed. Chapman and Hall/CRC.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning. Springer.
Assessment methods
The written test consists of four exercises designed to assess:
1. the theoretical knowledge of the course topics,
2. the ability to apply these topics to solve real data problems.
The maximum score for each exercise is 8 points. The final score is the sum of the scores of the four exercises. A total score exceeding 30 corresponds to 30 with honors. During the written test, the use of books, notes, or electronic media is *not* allowed.
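As an informal illustration of the scoring rule, the sketch below adds up four exercise scores and caps the result at 30 with honors. The function name `final_grade` is hypothetical, and the sketch assumes that each score lies between 0 and 8. The syllabus does not state a pass mark, so none is modelled here.

```python
def final_grade(exercise_scores):
    """Return (grade, honors) from four exercise scores, each between 0 and 8.

    Illustrative only: the name and the input checks are assumptions,
    not part of the official assessment rules.
    """
    if len(exercise_scores) != 4:
        raise ValueError("expected exactly four exercise scores")
    if any(score < 0 or score > 8 for score in exercise_scores):
        raise ValueError("each exercise score must lie between 0 and 8")

    total = sum(exercise_scores)
    # A total above 30 is recorded as 30 with honors.
    if total > 30:
        return 30, True
    return total, False


print(final_grade([8, 8, 8, 7]))  # total 31 -> (30, True)
print(final_grade([7, 6, 8, 5]))  # total 26 -> (26, False)
```

Since the four exercises add up to at most 32 points, honors require a total of more than 30; a total of exactly 30 is graded 30 without honors.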
Teaching methods
Students are encouraged to bring their own laptops and to experiment with the code during some parts of the lessons.