FROM DATA TO KNOWLEDGE

Academic year
2023/2024
Official course title
DATI E CONOSCENZA
Course code
NS001D (AF:468124 AR:256191)
Modality
On campus classes
ECTS credits
6
Degree level
Minor
Educational sector code
SECS-S/01
Period
Summer course
Course year
1
Where
VENEZIA
Educational goals
The course is one of the training activities of the Minor in Computer and Data Science. Its aim is to familiarize students with the main statistical tools for analyzing data and extracting knowledge from them.

The course provides knowledge of descriptive statistics, probability and inference, as well as skills in using specific programs for data analysis and reporting.

At the end of the course, students will be able to identify the models and methodologies suited to the context of interest, so as to support appropriate decisions, and to interpret and communicate the results of a statistical analysis.
Expected learning outcomes
1. Knowledge and understanding:
- to know the main tools for graphical representation and summary of a dataset,
- to know the basic concepts of probability calculus and distributions,
- to know the basic methodologies of statistical inference.

2. Ability to apply knowledge and understanding:
- to use specific programs for data analysis and reporting,
- to use the appropriate terminology in all the processes of application and communication of the acquired knowledge.

3. Ability to judge:
- to apply the acquired knowledge in a specific context, identifying the most appropriate models and methods.

4. Communication skills:
- to present the results of a statistical analysis clearly and thoroughly, both in writing and orally,
- to interact with the other students and with the instructor during classes and on the virtual forum.

5. Learning skills:
- to use and integrate information from notes, books, slides and practical lab sessions,
- to engage critically with the textbooks and other introductory material for data analysis.
Prerequisites
High-school-level mathematics and basic computer skills.
Contents
The course provides an introduction to statistics through practical examples and case studies.

The lessons run over three weeks.
1-2. The first and second weeks introduce students to the most commonly used techniques for summarizing and graphically representing a data set, as well as to the R program (https://cloud.r-project.org) and the RStudio interface (https://www.rstudio.com) for data summary, visualization, analysis and final reporting.
3. In the third week, a number of case studies are presented and discussed in detail. The theoretical lectures are always motivated by examples and applications to practical problems of interest in various fields.

Specifically, the statistical part consists of:
- elements of descriptive statistics: population and sample; types of variables; graphical representations and summary indices; relationships between variables;
- an introduction to stochastic uncertainty: statistical error and how it relates to statistical inference;
- an introduction to regression methods.
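
As an illustration of the three topics listed above, the following minimal sketch computes a few summary indices, a standard error of the mean, and a least-squares regression line. It is written in Python with the standard library only, so that it is self-contained; the course itself uses R. The data set and the function names `describe` and `fit_line` are hypothetical and are not taken from the course material.

```python
import math
import statistics

# Hypothetical toy data (not from the course): hours studied and exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 64.0, 68.0]


def describe(values):
    """Return basic summary indices for a sample of numbers."""
    n = len(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    return {
        "n": n,
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": sd,
        # Standard error of the mean: a simple measure of statistical error.
        "se": sd / math.sqrt(n),
    }


def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    print(describe(scores))
    slope, intercept = fit_line(hours, scores)
    print(f"score = {intercept:.2f} + {slope:.2f} * hours")
```

In R, comparable results would typically be obtained with `summary()`, `sd()` and `lm()`; the explicit formulas are spelled out here only to show what such functions compute.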
Reference texts
- Rafael A. Irizarry (2019). "Introduction to Data Science: Data Analysis and Prediction Algorithms with R". https://rafalab.github.io/dsbook
- Mine Cetinkaya-Rundel, Johanna Hardin & OpenIntro (2023). "Introduction to Modern Statistics". https://openintro-ims.netlify.app/index.html
- Rebekah Robinson & Homer White (2016). "Elementary Statistics with R". http://homerhanumat.github.io/elemStats
- Hadley Wickham & Garrett Grolemund (2017). "R for Data Science". https://r4ds.had.co.nz/index.html

Supplementary readings:
- Other material recommended by the lecturer during the course.
Assessment methods
The examination consists of writing and presenting a statistical report, produced in R, on a data set.

The aim of the assessment is to evaluate:
- the knowledge of the theory of the course topics,
- the ability to apply the theory to solve real problems.
Teaching methods
- Lectures, practical lab sessions using R, analysis of case studies.
- Use of e-learning platforms for discussions and learning assessment.
- Open-source programs for data analysis and reporting.
Teaching language
Italian
Type of exam
written
This programme is provisional and its contents may still change.
Last update of the programme: 09/06/2023