IMAGE AND VIDEO UNDERSTANDING

Academic year
2024/2025
Official course title
IMAGE AND VIDEO UNDERSTANDING
Course code
CM0524 (AF:513735 AR:286765)
Modality
On campus classes
ECTS credits
6
Degree level
Master's Degree Programme (DM270)
Educational sector code
INF/01
Period
2nd Semester
Course year
1
Where
VENEZIA
The course introduces students to the principles, algorithms, and main applications of image and video understanding.
1. Knowledge and understanding
1.1. acquire the main models and algorithms of image and video understanding

2. Ability to apply knowledge and understanding
2.1. acquire the ability to apply the studied models to real problems
2.2. acquire the ability to critically assess the performance and the behavior of a model applied to a concrete problem

3. Judgement
3.1. ability to understand which characteristics of the various models of artificial intelligence are best suited to a given problem
3.2. ability to critically evaluate the theoretical characteristics of the proposed models
Students are expected to be familiar with the basic concepts of calculus, linear algebra, and statistics. Knowledge of Python and PyTorch is recommended.
Neural Network Models for Images and Video:
- Artificial Neural Networks (training, tricks, optimizers)
- Convolutional Neural Networks
- Transformer Architectures
- Graph Neural Networks

Image Analysis:
- Classification
- Segmentation
- Object Detection

Video Understanding:
- Video Object Segmentation
- Object Tracking

Human-Centered Computer Vision:
- Person detection
- Face detection
- Pose Estimation
- Person Re-Identification
- Trajectory Forecasting
- Action Recognition
- Group Detection

Generative AI:
- Autoencoders & Variational Autoencoders
- GANs
- Diffusion Models

Advanced Topics (tentative):
- Active Learning
- Anomaly Detection
- Multimodal Deep Learning
- Implicit Representation
- Scene Understanding
- R. Szeliski. Computer Vision: Algorithms and Applications. Springer.

- D. Forsyth and J. Ponce. Computer Vision: A Modern Approach. Pearson.

- I. Goodfellow, Y. Bengio and A. Courville. Deep Learning. MIT Press.
The exam consists of an oral test (70%) and a discussion of a project (30%) agreed upon in advance with the teacher.
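As a rough illustration of the 70/30 weighting, the sketch below combines two marks into one figure. It assumes both parts are marked on the same numeric scale and that the result is a plain weighted average; the syllabus does not state the scale or any rounding rule, and the function name `final_grade` and the sample marks are hypothetical.

```python
def final_grade(oral: float, project: float,
                oral_weight: float = 0.70,
                project_weight: float = 0.30) -> float:
    """Weighted average of the oral and project marks.

    Assumption: both marks use the same scale, and no rounding
    or bonus rule applies (the syllabus does not specify either).
    """
    if abs(oral_weight + project_weight - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return oral_weight * oral + project_weight * project


# Hypothetical marks, rounded only for display.
print(round(final_grade(oral=28, project=30), 2))
```

The weights are kept as parameters so that a different split can be tried without changing the function.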
PowerPoint presentations and chalk talks.
English
To encourage an "active" approach to the topics covered in class, students will be asked to develop a simple project, which will be discussed during the oral examination.
oral
Definitive programme.
Last update of the programme: 04/09/2024