Reinforcement Learning, M2 ISF App, 2021-2026

Instructor: Gabriel TURINICI


1/ Introduction to reinforcement learning
2/ Theoretical formalism: Markov decision processes (MDP), value functions (Bellman and Hamilton-Jacobi-Bellman equations), etc.
3/ Common strategies, built up from the example of the "multi-armed bandit"
4/ Strategies in deep learning: Q-learning and DQN
5/ Strategies in deep learning: SARSA and variants
6/ Strategies in deep learning: Actor-Critic and variants
7/ During the course: various Python and gym/gymnasium implementations
8/ Perspectives.
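The tabular version of SARSA (point 5) can be illustrated without any deep-learning machinery. Below is a minimal sketch in plain Python on a toy 3-state chain invented for illustration (it is not the course's code, and the hyperparameters are arbitrary); the key point is that SARSA is on-policy: the bootstrap term uses the action the epsilon-greedy policy actually takes next.

```python
import random
random.seed(2)

# SARSA on a 3-state chain: states 0, 1, 2; state 2 is the goal (reward 1).
n_states, goal = 3, 2
alpha, gamma, eps = 0.5, 0.9, 0.1

def step(s, a):
    # Deterministic dynamics: a = 0 moves left, a = 1 moves right.
    s2 = min(max(s + (1 if a == 1 else -1), 0), n_states - 1)
    return s2, (1.0 if s2 == goal else 0.0), s2 == goal

def choose(Q, s):
    # Epsilon-greedy action selection.
    return random.randrange(2) if random.random() < eps else Q[s].index(max(Q[s]))

Q = [[0.0, 0.0] for _ in range(n_states)]
for _ in range(300):
    s, done = 0, False
    a = choose(Q, s)
    while not done:
        s2, r, done = step(s, a)
        a2 = choose(Q, s2)
        # SARSA target: bootstrap on the action actually chosen next (a2),
        # not on max over actions as in Q-learning.
        target = r + (0.0 if done else gamma * Q[s2][a2])
        Q[s][a] += alpha * (target - Q[s][a])
        s, a = s2, a2

print(round(Q[1][1], 2))  # ~1.0: going right from state 1 reaches the goal
```

Replacing the `Q[s2][a2]` term by `max(Q[s2])` turns this into the Q-learning update of point 4.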


Principal document for the theoretical presentations (no distribution authorized without WRITTEN consent from the author); see the "Teams" group for the updated version.

Multi-armed bandit (MAB) codes: play MAB, solve MAB, solve MAB v2, policy gradient from ChatGPT (to be corrected), policy gradient corrected.
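For reference, the core epsilon-greedy MAB loop can be sketched in a few lines of plain Python. The arm means, epsilon, and horizon below are invented for illustration; the actual exercises are in the linked codes.

```python
import random
random.seed(0)

# Hypothetical true mean rewards of each (Bernoulli) arm.
true_means = [0.2, 0.5, 0.8]
n_arms = len(true_means)

counts = [0] * n_arms      # number of pulls per arm
values = [0.0] * n_arms    # running mean reward estimate per arm
epsilon = 0.1              # exploration probability

def pull(arm):
    # Bernoulli reward with the arm's true mean.
    return 1.0 if random.random() < true_means[arm] else 0.0

for _ in range(5000):
    if random.random() < epsilon:
        arm = random.randrange(n_arms)    # explore: random arm
    else:
        arm = values.index(max(values))   # exploit: current best estimate
    r = pull(arm)
    counts[arm] += 1
    # Incremental update of the running mean.
    values[arm] += (r - values[arm]) / counts[arm]

print(values.index(max(values)))  # the agent should identify arm 2 as best
```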

Bellman iterations: code to be corrected here, solution code here
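As a minimal sketch of what these iterations compute, here is value iteration on a tiny deterministic chain MDP invented for illustration (four states, rightmost state terminal with reward 1, discount 0.9); the fixed point of the Bellman operator is reached numerically after a few sweeps.

```python
# Value iteration on a 4-state deterministic chain; state 3 is terminal.
n_states = 4
gamma = 0.9
actions = [-1, +1]  # move left or right

def step(s, a):
    # Clipped deterministic transition; reward 1 only on entering the goal.
    s2 = min(max(s + a, 0), n_states - 1)
    r = 1.0 if (s2 == n_states - 1 and s != n_states - 1) else 0.0
    return s2, r

V = [0.0] * n_states
for _ in range(100):
    # Synchronous Bellman backup: V(s) <- max_a [ r(s,a) + gamma * V(s') ]
    V = [0.0 if s == n_states - 1 else
         max(step(s, a)[1] + gamma * V[step(s, a)[0]] for a in actions)
         for s in range(n_states)]

print([round(v, 2) for v in V])  # [0.81, 0.9, 1.0, 0.0]
```

The converged values are gamma^k for a state k steps from the goal, as expected from the Bellman equation.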

Gym: play Frozen Lake (2023 version) (2022 version)

Q-learning with Frozen Lake: version to be corrected here. Full versions: Python version or notebook version
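The tabular Q-learning scheme used in those codes can be sketched without gymnasium; below, a tiny 2x2 deterministic grid (invented here) stands in for FrozenLake, and all hyperparameters are illustrative.

```python
import random
random.seed(1)

# States 0..3 laid out as a 2x2 grid; state 3 is the goal.
goal = 3
def step(s, a):
    # a = 0 moves right, a = 1 moves down (clipped at the borders).
    row, col = divmod(s, 2)
    if a == 0:
        col = min(col + 1, 1)
    else:
        row = min(row + 1, 1)
    s2 = 2 * row + col
    return s2, (1.0 if s2 == goal else 0.0), s2 == goal

Q = [[0.0, 0.0] for _ in range(4)]
alpha, gamma, epsilon = 0.5, 0.9, 0.2

for _ in range(500):
    s, done = 0, False
    while not done:
        # Epsilon-greedy behavior policy.
        a = random.randrange(2) if random.random() < epsilon else Q[s].index(max(Q[s]))
        s2, r, done = step(s, a)
        # Off-policy Q-learning update: bootstrap on the best next action.
        target = r + (0.0 if done else gamma * max(Q[s2]))
        Q[s][a] += alpha * (target - Q[s][a])
        s = s2

print(round(max(Q[0]), 2))  # ~0.9 = gamma * (value of a goal-adjacent state)
```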

Play with gym/Atari Breakout: Python version or notebook version

Value function iterations (Bellman) on FrozenLake: version to be corrected here

Deep Q-Learning (DQN) with gym/Atari Breakout: 2024 notebook, its version with a smaller NN, and play with the result
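One ingredient that distinguishes DQN from tabular Q-learning is the experience-replay buffer. Here is a minimal stand-alone sketch of such a buffer in plain Python (the class name and capacity are illustrative; the course notebooks use their own implementation).

```python
import random
from collections import deque

random.seed(0)

class ReplayBuffer:
    def __init__(self, capacity):
        # deque with maxlen: the oldest transitions are evicted automatically.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation between
        # consecutive transitions, which stabilizes the network updates.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=100)
for t in range(150):
    buf.push(t, 0, 0.0, t + 1, False)  # dummy transitions for illustration

print(len(buf))            # 100: the capacity bound is respected
batch = buf.sample(32)
print(len(batch))          # 32
```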

Policy gradients on Pong, adapted from Karpathy, 2024 version (to be corrected to get it working!): Python or notebook
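A central step of that code is computing discounted returns backwards over an episode. Below is a pure-Python sketch of the idea (the actual Pong code works on numpy arrays, and Karpathy's version additionally resets the running sum at each game boundary, i.e. at each nonzero reward).

```python
def discount_rewards(rewards, gamma=0.99):
    # Walk the episode backwards, accumulating the discounted return
    # G_t = r_t + gamma * G_{t+1}.
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

print(discount_rewards([0.0, 0.0, 1.0], gamma=0.5))  # [0.25, 0.5, 1.0]
```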

You can also load a converged version from HERE (rename as necessary): pg_pong_converged_turinici24

Notebook to use it: here (please send me yours if the mean reward is above 15!).

Some links: parking human, parking AI

Projects: see Teams



Statistical Learning, M1 Math 2024-2026

Instructor: Gabriel TURINICI

Preamble: this course is only an introduction, in a limited amount of time, to statistical and machine learning. It prepares for next year's courses (some of them are on my web page; see "Deep Learning" and "Reinforcement Learning").

 

Course outline

1/ Examples and machine learning framework

2/ Useful theoretical objects: predictors, loss functions, bias, variance

3/ K-nearest neighbors (k-NN) and the "curse of dimensionality"

4/ Linear and logistic models in high dimension, variable selection and model regularization (ridge, lasso)

5/ Stochastic Optimization Algorithms

6/ Naive Bayesian classification

7/ Neural networks: introduction, operators, datasets, training, examples, implementations

8/ K-means clustering
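As a taste of point 3, a k-nearest-neighbors classifier fits in a few lines of plain Python (a deliberately minimal sketch with made-up toy data, not the course's reference implementation).

```python
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    # Squared Euclidean distance from the query x to every training point.
    dists = [(sum((a - b) ** 2 for a, b in zip(p, x)), y)
             for p, y in zip(train_X, train_y)]
    dists.sort(key=lambda t: t[0])
    # Majority vote among the k nearest neighbors.
    votes = [y for _, y in dists[:k]]
    return Counter(votes).most_common(1)[0][0]

# Toy 2D dataset with two well-separated classes.
X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
y = ["A", "A", "B", "B"]
print(knn_predict(X, y, (0.05, 0.1)))  # A
print(knn_predict(X, y, (0.95, 1.0)))  # B
```

In high dimension this distance-based vote degrades: all points become nearly equidistant, which is exactly the "curse of dimensionality" discussed in the course.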


Reference: 

Machine Learning Algorithms: From Classical Methods to Deep Neural Networks: Supervised, Unsupervised, and High-Dimensional


Exercises, implementations, and the current course textbook (no distribution authorized without WRITTEN consent from the author): see the "Teams" group.