diff --git a/notebooks/machine-learning/0.about-datasets.ipynb b/notebooks/machine-learning/0.about-datasets.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..80e312b82fc009db9c603bd4c8d4f0ec464a1910
--- /dev/null
+++ b/notebooks/machine-learning/0.about-datasets.ipynb
@@ -0,0 +1,213 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "41a3ca3d-3503-4fe0-87ff-b6692d90b204",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "# About the datasets"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "bc0377fc-c0fc-439e-afe7-7103b54ca183",
+   "metadata": {},
+   "source": [
+    "## Self-Reports of Height and Weight"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d0f57b32-dc16-4b2d-b7fd-3f6bf227cc1f",
+   "metadata": {},
+   "source": [
+    "**File:** [davis.csv](./files/davis.csv)  \n",
+    "**Citation key:** Davis, 1990\n",
+    "\n",
+    "Men and women enrolled in an exercise program were asked to report their own height and weight. The self-reported values are compared with the measured ones."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6740853c-0ec9-408a-a34d-bfb6d50f06f2",
+   "metadata": {},
+   "source": [
+    "### Contents"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e8213cb4-86c0-4d1c-a9a5-48b90a753754",
+   "metadata": {},
+   "source": [
+    "|Variable|Meaning|\n",
+    "|:-:|-|\n",
+    "|*sex*|Two-level factor: F (female) or M (male)|\n",
+    "|*weight*|Measured weight (in kg)|\n",
+    "|*height*|Measured height (in cm)|\n",
+    "|*repwt*|Self-reported weight (in kg)|\n",
+    "|*repht*|Self-reported height (in cm)|"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1958727a-eba3-4782-a7bc-70bdbe300fa9",
+   "metadata": {},
+   "source": [
+    "### References"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2a1fcec1-d6e3-4998-b331-d3b4a69b6895",
+   "metadata": {},
+   "source": [
+    "- Davis, C. (1990) Body image and weight preoccupation: A comparison between exercising and non-exercising women. *Appetite*, 15, 13–21.\n",
+    "- Fox, J. (2016) *Applied Regression Analysis and Generalized Linear Models*, Third Edition. Sage.\n",
+    "- Fox, J. and Weisberg, S. (2019) *An R Companion to Applied Regression*, Third Edition. Sage."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "673a814e-18f1-4cb7-9b6e-6e7dba4b630a",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "## Size measurements for adult foraging penguins near Palmer Station, Antarctica"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cb914f97-6b95-428b-85f9-100eb3fadb61",
+   "metadata": {},
+   "source": [
+    "**File:** [penguin-census.csv](./files/penguin-census.csv)  \n",
+    "**Citation key:** Gorman, 2014\n",
+    "\n",
+    "The survey covers three penguin species and records some of their physical characteristics. The data were collected by Dr. Kristen Gorman at Palmer Station, Antarctica."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ba56dc73-26b5-49e1-ac59-1c52cb46dea8",
+   "metadata": {},
+   "source": [
+    "### Contents"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "830a223d-07f9-4821-b2d7-c52210672ee0",
+   "metadata": {},
+   "source": [
+    "|Variable|Meaning|\n",
+    "|:-:|-|\n",
+    "|*species*|Penguin species: Adelie, Gentoo or Chinstrap|\n",
+    "|*island*|Island where the observation was made: Torgersen, Biscoe or Dream|\n",
+    "|*bill_length_mm*|Bill length of the individual (in mm)|\n",
+    "|*bill_depth_mm*|Bill depth of the individual (in mm)|\n",
+    "|*flipper_length_mm*|Flipper length of the individual (in mm)|\n",
+    "|*body_mass_g*|Body mass of the individual (in g)|\n",
+    "|*sex*|Sex of the individual, with two possible values: *male* or *female*|\n",
+    "|*year*|Year of the observation (2007 to 2009)|"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "861e02fd-ac31-4abb-a22d-0fb70d1cecb4",
+   "metadata": {},
+   "source": [
+    "### References"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "aedbc4aa-8b45-41d4-b1f7-d708ea041023",
+   "metadata": {},
+   "source": [
+    "- Gorman KB, Williams TD, Fraser WR (2014). Ecological sexual dimorphism and environmental variability within a community of Antarctic penguins (genus Pygoscelis). PLoS ONE 9(3):e90081. https://doi.org/10.1371/journal.pone.0090081"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e89c7a5f-1132-4607-9f20-9cf3e7aa4cbb",
+   "metadata": {},
+   "source": [
+    "## Stellar Objects"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f1584440-6e46-403a-90a4-e104df016614",
+   "metadata": {},
+   "source": [
+    "**File:** [stellar-objects.csv](./files/stellar-objects.csv)  \n",
+    "**Citation key:** Freedman, 2001\n",
+    "\n",
+    "The file lists several stellar objects together with their recession velocity. It was reconstructed from the data published in the article by Freedman et al."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d310c9fb-9f42-44ad-bcc9-bb38431096d8",
+   "metadata": {},
+   "source": [
+    "### Contents"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "08f3bf42-c004-4bcf-9599-94fd1cfa6011",
+   "metadata": {},
+   "source": [
+    "|Variable|Meaning|\n",
+    "|:-:|-|\n",
+    "|*object*|Designation of the stellar object|\n",
+    "|*distance*|Distance in megaparsecs (1 parsec ≈ 3.26 light-years)|\n",
+    "|*v_helio*|Heliocentric radial velocity (in km/s)|\n",
+    "|*v_flow*|Flow velocity (in km/s)|\n",
+    "|*v_cmb*|Velocity relative to the cosmic microwave background (in km/s)|"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5f1622a1-afe5-4a63-b600-acfe0900534a",
+   "metadata": {},
+   "source": [
+    "### References"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "223c87d3-74a8-45fe-b707-9f7556410065",
+   "metadata": {},
+   "source": [
+    "- Freedman, W., Madore, B., Gibson, B., et al. – \"Final Results from the Hubble Space Telescope Key Project to Measure the Hubble Constant\". *The Astrophysical Journal*, vol. 553, pp. 47–72, 2001. <https://doi.org/10.48550/arXiv.astro-ph/0012376>"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/notebooks/machine-learning/1.machine-learning.ipynb b/notebooks/machine-learning/1.machine-learning.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..2c54adce00d4b6b619f39391c69d0e11d6d88444
--- /dev/null
+++ b/notebooks/machine-learning/1.machine-learning.ipynb
@@ -0,0 +1,469 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "b4736480-e9b8-446f-aefd-df4eba3e7c67",
+   "metadata": {},
+   "source": [
+    "# Machine learning"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "de31a2c2-e401-40ad-a0d4-c7d7459ecd1a",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "## Definition"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c2d1c628-ce48-4939-af73-815e661c3259",
+   "metadata": {},
+   "source": [
+    "Machine learning refers to the set of mathematical and statistical methods used to program a computer so that it gets better at solving tasks by learning from data.\n",
+    "\n",
+    "Traditionally, two main kinds of learning are distinguished:\n",
+    "- **supervised** learning, in which the system learns from labelled data;\n",
+    "- **unsupervised** learning, in which the system is trained to detect which features, among all the variables of a dataset, reveal its underlying structure.\n",
+    "\n",
+    "The fundamental aim of machine learning is to provide predictive models or to carry out detection tasks (anomalies, novelties, similarities…). The great strength of a machine learning system lies in its ability to generalize what it has learned to cases it has never encountered before."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e83fe41c-8b47-46d3-b48b-4fc51e549351",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "## Focus on supervised learning"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d084819f-fc71-4885-822c-43eef6e34508",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "### On the importance of data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a1d5e4dc-ed21-47fe-8f95-d7253dbadcf0",
+   "metadata": {
+    "jp-MarkdownHeadingCollapsed": true,
+    "tags": []
+   },
+   "source": [
+    "To build an effective predictive model with supervised learning, you need a labelled dataset that is reliable, balanced and as large as possible. The smaller the dataset, the more any missing value or labelling error weighs on the model’s performance. Likewise, a class that is over-represented in the training set will tend to be predicted more often when the task is carried out. If, for example, you train a tool to classify playing cards as either number cards or face cards and the training set contains only number cards, your model will never detect face cards.\n",
+    "\n",
+    "The proverb to keep in mind: *rubbish in, rubbish out*. If you feed the system absurd data, it will return absurd results. Unlike in human logic, it seems that false premises in a computational argument cannot yet lead to a true conclusion!\n",
+    "\n",
+    "> Birds have no wings.  \n",
+    "> Socrates is a bird.  \n",
+    "> Socrates has no wings."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "414fb80d-44fe-4597-8360-4bfc780d8828",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "### Algorithms at work"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "377f2e24-2568-48f0-867c-1d0342bd2ae5",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "Two broad families of algorithms are used to build such a system, depending on the nature of the task:\n",
+    "- either the prediction is *qualitative* (or *discrete*), and we speak of **classification**;\n",
+    "- or it is *quantitative* (or *continuous*), and we speak of **regression**.\n",
+    "\n",
+    "For example, a classification task would be to determine whether a review is positive or negative, whether a given tree is an oak or a birch, whether a person is rich or poor, and so on. For regression, one would instead try to estimate the salary a student can expect after graduating, the temperatures expected over the next few days, or the price at which a two-room flat with a terrace might sell in the 12th arrondissement of Paris."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "65fb5452-6e40-40cf-b6fc-5f4d4943533a",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "### On the art of tuning a model"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "17c6f5db-eada-4f31-8324-2304e7fa3f5b",
+   "metadata": {},
+   "source": [
+    "Thanks to specialized libraries, setting up a learning *workflow* is fairly straightforward. Roughly speaking, it consists of:\n",
+    "1. splitting the dataset into two unequal parts (training data and test data);\n",
+    "2. training the model on the training data and testing it on held-out data whose labels are known;\n",
+    "3. evaluating the model’s performance.\n",
+    "\n",
+    "The real work happens upstream, both in understanding the data and in preparing them. The *pre-processing* phase is crucial in a *machine learning* project and may itself rely on machine learning algorithms (e.g. anomaly detection, dimensionality reduction…).\n",
+    "\n",
+    "The operations involved include cleaning the *dataset*, for example by removing aberrant values (such as negative salaries), by correcting errors (such as a wrongly assigned label) or by normalizing values (date formats, conversion of a categorical variable to a numeric type).\n",
+    "\n",
+    "Once the model has been obtained, its parameters still need to be fine-tuned to improve the performance measure."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dcd3655b-d114-42cc-bc96-e20dd0723739",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "## A matter of penguins"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4001fc66-9891-4be4-8a09-e09cd2fdbdac",
+   "metadata": {},
+   "source": [
+    "Given a description of some physical characteristics of three Antarctic penguin species (Adélie, Gentoo and Chinstrap), let us set ourselves the goal of providing a program that determines which species a new individual belongs to.\n",
+    "\n",
+    "|Bill length (mm)|Bill depth (mm)|Flipper length (mm)|Body mass (g)|Species|\n",
+    "|-:|-:|-:|-:|:-:|\n",
+    "|39.1|18.7|181|3750|Adélie|\n",
+    "|37.8|18.3|174|3400|Adélie|\n",
+    "|49.6|16|225|5700|Gentoo|\n",
+    "|42.7|13.7|208|3950|Gentoo|\n",
+    "|49.3|19.9|203|4050|Chinstrap|\n",
+    "|43.5|18.1|202|3400|Chinstrap|"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "05323c1f-6b9f-4773-8670-2d51e2cec06b",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "### A few observations"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "db6b6d6b-e5a2-4ee8-85fa-486370180693",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "Based solely on the characteristics given in the table above, and setting aside the fact that the number of individuals is too small, we can see that:\n",
+    "- Gentoo penguins tend to be heavier than the other two species;\n",
+    "- body mass alone does not separate Adélie from Chinstrap penguins;\n",
+    "- bill length can only isolate the Adélie penguins;\n",
+    "- … and flipper length is likewise not enough to discriminate between the three species."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e26a0356-ab69-44c5-abed-29ee5cc372ae",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "### Visualizing the data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fc513033-4929-4aec-a14e-214249ac38e5",
+   "metadata": {
+    "jp-MarkdownHeadingCollapsed": true,
+    "tags": []
+   },
+   "source": [
+    "The idea is then to compare two characteristics in order to bring out clear associations and, for that purpose, nothing beats a chart for making quick observations.\n",
+    "\n",
+    "Let us try this on [the full dataset](./0.about-datasets.ipynb#Size-measurements-for-adult-foraging-penguins-near-Palmer-Station,-Antarctica) (Gorman, 2014) with the first two characteristics, the bill length and bill depth of the different species:\n",
+    "\n",
+    "![Distribution of penguin species by bill dimensions](./images/bill-dimensions.png)\n",
+    "\n",
+    "And now with every pair of characteristics:\n",
+    "\n",
+    "![Distribution of penguin species by physical characteristics](./images/penguin-dimensions.png)\n",
+    "\n",
+    "Looking for the pairings where overlap is most limited, bill length appears to be the most discriminating feature, especially when combined with bill depth. Moreover, there is nothing surprising in the idea that two dimensions of the same organ are correlated. Had the dataset been much larger, it would have been worthwhile to combine the two variables into one. This is known as *feature extraction*, a form of dimensionality reduction."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3cc1f05b-82a3-488d-8f0b-f195fe2fc6f0",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "### Distribution of the data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "be94eaf6-4b3f-45c5-9998-46442f1ec8a3",
+   "metadata": {},
+   "source": [
+    "In the dataset, counting the individuals per species gives the following distribution:\n",
+    "- Adélie: 152\n",
+    "- Gentoo: 124\n",
+    "- Chinstrap: 68\n",
+    "\n",
+    "The imbalance between these counts raises the question of representativeness: are Chinstrap penguins really about half as common in Antarctica as the other two species?"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7857c464-6d0b-4504-9b50-2556d11aa55c",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "### Splitting into training and test sets"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b9fd19af-35b6-41ac-a02c-666d0094a610",
+   "metadata": {},
+   "source": [
+    "The usual advice is an 80/20 or 75/25 split between the subset used to train a program and the subset used to test it. With an 80/20 split of the 344 individuals in the dataset, 275 are selected for the training set and 69 for the test set.\n",
+    "\n",
+    "Care must also be taken that no species is over- or under-represented in either subset. In our case the observations are sorted by species: first the Adélie, then the Gentoo and finally the Chinstrap. Since there are only 68 Chinstrap penguins and the test set contains 69 individuals, taking the last 69 rows as the test set would leave no Chinstrap at all in the training set! The system would then be unable to predict that species properly. To avoid this problem, the observations must be shuffled before the training and test sets are built."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6fdeba0b-dc3b-41f9-8d4e-2940a5e47d1c",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "### Evaluating the model’s performance"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "66f74792-7856-4a87-8540-dc5f2337ac39",
+   "metadata": {},
+   "source": [
+    "Once the model has been trained, the last step before exposing it to new data is to run it on the test set and compare its predictions with the labels. The first result to consider is the success rate, obtained by dividing the number of correct predictions by the total number of observations in the test set (69).\n",
+    "\n",
+    "Take the fictitious case where the first five observations and predictions are:\n",
+    "\n",
+    "|n|observation|prediction|match|\n",
+    "|-:|-|-|-|\n",
+    "|0|Adelie|Adelie|true|\n",
+    "|1|Gentoo|Adelie|false|\n",
+    "|2|Gentoo|Gentoo|true|\n",
+    "|3|Chinstrap|Chinstrap|true|\n",
+    "|4|Gentoo|Gentoo|true|\n",
+    "\n",
+    "Four correct predictions out of five give a success rate of 80%. This measure is called *accuracy*.\n",
+    "\n",
+    "For a regression task, a different performance measure would have been chosen, such as the *mean squared error* or the *mean absolute error*."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "59e0b891-4199-4bce-a405-ac7e0be91ee6",
+   "metadata": {},
+   "source": [
+    "## The pitfalls of machine learning"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "40f687ed-f7f8-46fb-a206-29a18efbcfda",
+   "metadata": {},
+   "source": [
+    "This overview of the key concepts of *machine learning* would not be complete without mentioning some of the biases inherent in statistical models."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5233df69-8275-4c68-983b-9f1334712163",
+   "metadata": {},
+   "source": [
+    "### Poor-quality data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5e21753f-ce7c-432e-be83-4ca41573d7e7",
+   "metadata": {},
+   "source": [
+    "There is no need to dwell on the obvious: tell a child that an apple is an orange and the child will call every apple an orange; a computer is no smarter and, faced with an apple, will conclude that it is looking at an orange. Only Humpty Dumpty, the egg philosopher of *Through the Looking-Glass, and What Alice Found There*, can decide that an apple may be called an orange and still understand that it is an apple:\n",
+    "\n",
+    "> ‘When I use a word,’ Humpty Dumpty said in rather a scornful tone, ‘it means just what I choose it to mean – neither more nor less.’\n",
+    "> \n",
+    "> ‘The question is,’ said Alice, ‘whether you can make words mean so many different things.’\n",
+    "> \n",
+    "> ‘The question is,’ said Humpty Dumpty, ‘which is to be master – that’s all.’\n",
+    "\n",
+    "Rest assured, the shamanic power of naming things still belongs to humans! It is up to us to challenge Humpty Dumpty’s role as *master of meanings* (Castoriadis); in the meantime, time spent cleaning data is never wasted, whether the aim is to complete, correct, normalize or even delete them."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "03c83199-51e4-45a4-b412-b36163e1836e",
+   "metadata": {},
+   "source": [
+    "### Unrepresentative data"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2fb917ff-41e6-4017-a9d4-97c17b059126",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "#### Sampling noise"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2c442d1e-2ef5-4939-943b-d4130c0d10da",
+   "metadata": {},
+   "source": [
+    "When the sample is too small, it fails to reflect reality. If you train a model on it, you may well obtain an encouraging evaluation confirmed by several statistical methods, yet the predictions on new data will hardly be convincing.\n",
+    "\n",
+    "The linear model below, built from [a study on body image and weight preoccupation](./0.about-datasets.ipynb#Self-Reports-of-Height-and-Weight) (Davis, 1990), shows the relation between a person’s body weight and height for a sample of 20 individuals:\n",
+    "\n",
+    "![Relation between weight and height](./images/davis-wh20.png)\n",
+    "\n",
+    "The next model uses a sample of 40 individuals:\n",
+    "\n",
+    "![Relation between weight and height](./images/davis-wh40.png)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "01ebb898-0551-4ff6-a9c8-19aec3fed9ba",
+   "metadata": {},
+   "source": [
+    "Doubling the sample size not only gives the regression line a steeper slope, it also narrows the 95% confidence interval."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "dec97353-28e7-446f-bbfc-a3c125ff7750",
+   "metadata": {},
+   "source": [
+    "#### Sampling bias"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "916970bf-9317-43c3-a4e3-fbf59b5ea80e",
+   "metadata": {},
+   "source": [
+    "Having thousands or even millions of observations does not guarantee a robust model. Everything may depend on how the survey was designed in the first place. Asking members of the PSG supporters’ club whether they like football will not be representative of the opinion of the general population. Nor will searching Deezer for current trends, given that its algorithms have already been trained on your listening history and have compared it with the tastes of other subscribers with a similar profile."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b4314a3b-ed5d-4a12-a634-38dd18633edd",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "### Irrelevant explanatory variables"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "81c4f822-13d1-4377-89e1-965fe265b118",
+   "metadata": {},
+   "source": [
+    "The human mind has a natural tendency to see a cause-and-effect relationship between two events. Observing that 55% of J.-L. Mélenchon’s voters eat cheese and drink beer, against only 23% of Macron supporters, does not allow us to conclude that a cheese lover will probably vote for the former rather than the latter, still less to hypothesize that some bacterium in Camembert influences decisions at the ballot box.\n",
+    "\n",
+    "While the previous example is made up, the next one shows a correlation between the change in the number of autism cases in schools in the United States and, on the one hand, the change in the share of farmland planted with GMOs and, on the other hand, the change in the sales volume of the organic food industry:\n",
+    "\n",
+    "![Change in the number of autism cases](./images/evolution-autism.png)\n",
+    "\n",
+    "Since, on top of that, Pearson’s correlation coefficient shows a stronger relation between organic food and autism cases (0.99, against 0.97 for GMOs), a hasty interpretation would conclude that organic farming is more conducive to autism spectrum disorders than GMOs.\n",
+    "\n",
+    "Data sources:\n",
+    "\n",
+    ">- [Students With Disabilities](http://nces.ed.gov/programs/coe/indicator_cgg.asp)\n",
+    ">- [Evolution of planted agricultural areas](http://usda.mann.library.cornell.edu/MannUsda/viewDocumentInfo.do?documentID=1000)\n",
+    ">- [Organic Industry Survey](http://ota.com/resources/organic-industry-survey)\n",
+    "\n",
+    "And to end on a humorous note, the *Le Monde* website offers a [random generator of absurd comparisons](https://www.lemonde.fr/les-decodeurs/article/2019/01/02/correlation-ou-causalite-brillez-en-societe-avec-notre-generateur-aleatoire-de-comparaisons-absurdes_5404286_4355770.html).\n",
+    "\n",
+    "In conclusion, *cum hoc sed non propter hoc* (correlation does not imply causation)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "926bc758-f060-44fe-a5ac-bd11bcf67ab4",
+   "metadata": {},
+   "source": [
+    "### Fitting problems"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "615d9158-4103-4bf4-8087-8972c838e217",
+   "metadata": {},
+   "source": [
+    "Models are also subject to the problems of *overfitting* and *underfitting*. An algorithm that is too simple will fail to bring out the structure of the data, whereas an algorithm that is too complex, because it fits the training data too closely, will make large generalization errors.\n",
+    "\n",
+    "For example, a linear model will underfit data whose underlying relation is not linear while, conversely, a polynomial model of very high degree will follow the training points so closely that its predictions on new data become unreliable."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "379d864d-0d7e-43c2-b268-c3e674083691",
+   "metadata": {},
+   "source": [
+    "## Further reading"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "587d9328-3564-478f-82c3-02359ca6cea3",
+   "metadata": {},
+   "source": [
+    "* Géron, Aurélien. – [*Hands-on Machine Learning With Scikit-learn, Keras, and Tensorflow: Concepts, Tools, and Techniques to Build Intelligent Systems*](https://www.oreilly.com/library/view/hands-on-machine-learning/9781098125967/). 3rd edition. – Farnham: O'Reilly UK Limited, 2022. – 850 p. – ISBN: 978-1098125974.\n",
+    "* Géron, Aurélien. – [*Machine Learning Notebooks, 3rd edition*](https://github.com/ageron/handson-ml3) (GitHub)."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/notebooks/machine-learning/files/davis.csv b/notebooks/machine-learning/files/davis.csv
new file mode 100644
index 0000000000000000000000000000000000000000..92effcd007b3a3c5e7c501e72d3a87d4c9476b93
--- /dev/null
+++ b/notebooks/machine-learning/files/davis.csv
@@ -0,0 +1,201 @@
+"","sex","weight","height","repwt","repht"
+"1","M",77,182,77,180
+"2","F",58,161,51,159
+"3","F",53,161,54,158
+"4","M",68,177,70,175
+"5","F",59,157,59,155
+"6","M",76,170,76,165
+"7","M",76,167,77,165
+"8","M",69,186,73,180
+"9","M",71,178,71,175
+"10","M",65,171,64,170
+"11","M",70,175,75,174
+"12","F",166,57,56,163
+"13","F",51,161,52,158
+"14","F",64,168,64,165
+"15","F",52,163,57,160
+"16","F",65,166,66,165
+"17","M",92,187,101,185
+"18","F",62,168,62,165
+"19","M",76,197,75,200
+"20","F",61,175,61,171
+"21","M",119,180,124,178
+"22","F",61,170,61,170
+"23","M",65,175,66,173
+"24","M",66,173,70,170
+"25","F",54,171,59,168
+"26","F",50,166,50,165
+"27","F",63,169,61,168
+"28","F",58,166,60,160
+"29","F",39,157,41,153
+"30","M",101,183,100,180
+"31","F",71,166,71,165
+"32","M",75,178,73,175
+"33","M",79,173,76,173
+"34","F",52,164,52,161
+"35","F",68,169,63,170
+"36","M",64,176,65,175
+"37","F",56,166,54,165
+"38","M",69,174,69,171
+"39","M",88,178,86,175
+"40","M",65,187,67,188
+"41","F",54,164,53,160
+"42","M",80,178,80,178
+"43","F",63,163,59,159
+"44","M",78,183,80,180
+"45","M",85,179,82,175
+"46","F",54,160,55,158
+"47","M",73,180,NA,NA
+"48","F",49,161,NA,NA
+"49","F",54,174,56,173
+"50","F",75,162,75,158
+"51","M",82,182,85,183
+"52","F",56,165,57,163
+"53","M",74,169,73,170
+"54","M",102,185,107,185
+"55","M",64,177,NA,NA
+"56","M",65,176,64,172
+"57","F",66,170,65,NA
+"58","M",73,183,74,180
+"59","M",75,172,70,169
+"60","M",57,173,58,170
+"61","M",68,165,69,165
+"62","M",71,177,71,170
+"63","M",71,180,76,175
+"64","F",78,173,75,169
+"65","M",97,189,98,185
+"66","F",60,162,59,160
+"67","F",64,165,63,163
+"68","F",64,164,62,161
+"69","F",52,158,51,155
+"70","M",80,178,76,175
+"71","F",62,175,61,171
+"72","M",66,173,66,175
+"73","F",55,165,54,163
+"74","F",56,163,57,159
+"75","F",50,166,50,161
+"76","F",50,171,NA,NA
+"77","F",50,160,55,150
+"78","F",63,160,64,158
+"79","M",69,182,70,180
+"80","M",69,183,70,183
+"81","F",61,165,60,163
+"82","M",55,168,56,170
+"83","F",53,169,52,175
+"84","F",60,167,55,163
+"85","F",56,170,56,170
+"86","M",59,182,61,183
+"87","M",62,178,66,175
+"88","F",53,165,53,165
+"89","F",57,163,59,160
+"90","F",57,162,56,160
+"91","M",70,173,68,170
+"92","F",56,161,56,161
+"93","M",84,184,86,183
+"94","M",69,180,71,180
+"95","M",88,189,87,185
+"96","F",56,165,57,160
+"97","M",103,185,101,182
+"98","F",50,169,50,165
+"99","F",52,159,52,153
+"100","F",55,155,NA,154
+"101","F",55,164,55,163
+"102","M",63,178,63,175
+"103","F",47,163,47,160
+"104","F",45,163,45,160
+"105","F",62,175,63,173
+"106","F",53,164,51,160
+"107","F",52,152,51,150
+"108","F",57,167,55,164
+"109","F",64,166,64,165
+"110","F",59,166,55,163
+"111","M",84,183,90,183
+"112","M",79,179,79,171
+"113","F",55,174,57,171
+"114","M",67,179,67,179
+"115","F",76,167,77,165
+"116","F",62,168,62,163
+"117","M",83,184,83,181
+"118","M",96,184,94,183
+"119","M",75,169,76,165
+"120","M",65,178,66,178
+"121","M",78,178,77,175
+"122","M",69,167,73,165
+"123","F",68,178,68,175
+"124","F",55,165,55,163
+"125","M",67,179,NA,NA
+"126","F",52,169,56,NA
+"127","F",47,153,NA,154
+"128","F",45,157,45,153
+"129","F",68,171,68,169
+"130","F",44,157,44,155
+"131","F",62,166,61,163
+"132","M",87,185,89,185
+"133","F",56,160,53,158
+"134","F",50,148,47,148
+"135","M",83,177,84,175
+"136","F",53,162,53,160
+"137","F",64,172,62,168
+"138","F",62,167,NA,NA
+"139","M",90,188,91,185
+"140","M",85,191,83,188
+"141","M",66,175,68,175
+"142","F",52,163,53,160
+"143","F",53,165,55,163
+"144","F",54,176,55,176
+"145","F",64,171,66,171
+"146","F",55,160,55,155
+"147","F",55,165,55,165
+"148","F",59,157,55,158
+"149","F",70,173,67,170
+"150","M",88,184,86,183
+"151","F",57,168,58,165
+"152","F",47,162,47,160
+"153","F",47,150,45,152
+"154","F",55,162,NA,NA
+"155","F",48,163,44,160
+"156","M",54,169,58,165
+"157","M",69,172,68,174
+"158","F",59,170,NA,NA
+"159","F",58,169,NA,NA
+"160","F",57,167,56,165
+"161","F",51,163,50,160
+"162","F",54,161,54,160
+"163","F",53,162,52,158
+"164","F",59,172,58,171
+"165","M",56,163,58,161
+"166","F",59,159,59,155
+"167","F",63,170,62,168
+"168","F",66,166,66,165
+"169","M",96,191,95,188
+"170","F",53,158,50,155
+"171","M",76,169,75,165
+"172","F",54,163,NA,NA
+"173","M",61,170,61,170
+"174","M",82,176,NA,NA
+"175","M",62,168,64,168
+"176","M",71,178,68,178
+"177","F",60,174,NA,NA
+"178","M",66,170,67,165
+"179","M",81,178,82,175
+"180","M",68,174,68,173
+"181","M",80,176,78,175
+"182","F",43,154,NA,NA
+"183","M",82,181,NA,NA
+"184","F",63,165,59,160
+"185","M",70,173,70,173
+"186","F",56,162,56,160
+"187","F",60,172,55,168
+"188","F",58,169,54,166
+"189","M",76,183,75,180
+"190","F",50,158,49,155
+"191","M",88,185,93,188
+"192","M",89,173,86,173
+"193","F",59,164,59,165
+"194","F",51,156,51,158
+"195","F",62,164,61,161
+"196","M",74,175,71,175
+"197","M",83,180,80,180
+"198","M",81,175,NA,NA
+"199","M",90,181,91,178
+"200","M",79,177,81,178
diff --git a/notebooks/machine-learning/files/penguin-census.csv b/notebooks/machine-learning/files/penguin-census.csv
new file mode 100644
index 0000000000000000000000000000000000000000..7ab1ec151040e81e393749d43bcd662acc158cbe
--- /dev/null
+++ b/notebooks/machine-learning/files/penguin-census.csv
@@ -0,0 +1,345 @@
+"species","island","bill_length_mm","bill_depth_mm","flipper_length_mm","body_mass_g","sex","year"
+"Adelie","Torgersen",39.1,18.7,181,3750,"male",2007
+"Adelie","Torgersen",39.5,17.4,186,3800,"female",2007
+"Adelie","Torgersen",40.3,18,195,3250,"female",2007
+"Adelie","Torgersen",NA,NA,NA,NA,NA,2007
+"Adelie","Torgersen",36.7,19.3,193,3450,"female",2007
+"Adelie","Torgersen",39.3,20.6,190,3650,"male",2007
+"Adelie","Torgersen",38.9,17.8,181,3625,"female",2007
+"Adelie","Torgersen",39.2,19.6,195,4675,"male",2007
+"Adelie","Torgersen",34.1,18.1,193,3475,NA,2007
+"Adelie","Torgersen",42,20.2,190,4250,NA,2007
+"Adelie","Torgersen",37.8,17.1,186,3300,NA,2007
+"Adelie","Torgersen",37.8,17.3,180,3700,NA,2007
+"Adelie","Torgersen",41.1,17.6,182,3200,"female",2007
+"Adelie","Torgersen",38.6,21.2,191,3800,"male",2007
+"Adelie","Torgersen",34.6,21.1,198,4400,"male",2007
+"Adelie","Torgersen",36.6,17.8,185,3700,"female",2007
+"Adelie","Torgersen",38.7,19,195,3450,"female",2007
+"Adelie","Torgersen",42.5,20.7,197,4500,"male",2007
+"Adelie","Torgersen",34.4,18.4,184,3325,"female",2007
+"Adelie","Torgersen",46,21.5,194,4200,"male",2007
+"Adelie","Biscoe",37.8,18.3,174,3400,"female",2007
+"Adelie","Biscoe",37.7,18.7,180,3600,"male",2007
+"Adelie","Biscoe",35.9,19.2,189,3800,"female",2007
+"Adelie","Biscoe",38.2,18.1,185,3950,"male",2007
+"Adelie","Biscoe",38.8,17.2,180,3800,"male",2007
+"Adelie","Biscoe",35.3,18.9,187,3800,"female",2007
+"Adelie","Biscoe",40.6,18.6,183,3550,"male",2007
+"Adelie","Biscoe",40.5,17.9,187,3200,"female",2007
+"Adelie","Biscoe",37.9,18.6,172,3150,"female",2007
+"Adelie","Biscoe",40.5,18.9,180,3950,"male",2007
+"Adelie","Dream",39.5,16.7,178,3250,"female",2007
+"Adelie","Dream",37.2,18.1,178,3900,"male",2007
+"Adelie","Dream",39.5,17.8,188,3300,"female",2007
+"Adelie","Dream",40.9,18.9,184,3900,"male",2007
+"Adelie","Dream",36.4,17,195,3325,"female",2007
+"Adelie","Dream",39.2,21.1,196,4150,"male",2007
+"Adelie","Dream",38.8,20,190,3950,"male",2007
+"Adelie","Dream",42.2,18.5,180,3550,"female",2007
+"Adelie","Dream",37.6,19.3,181,3300,"female",2007
+"Adelie","Dream",39.8,19.1,184,4650,"male",2007
+"Adelie","Dream",36.5,18,182,3150,"female",2007
+"Adelie","Dream",40.8,18.4,195,3900,"male",2007
+"Adelie","Dream",36,18.5,186,3100,"female",2007
+"Adelie","Dream",44.1,19.7,196,4400,"male",2007
+"Adelie","Dream",37,16.9,185,3000,"female",2007
+"Adelie","Dream",39.6,18.8,190,4600,"male",2007
+"Adelie","Dream",41.1,19,182,3425,"male",2007
+"Adelie","Dream",37.5,18.9,179,2975,NA,2007
+"Adelie","Dream",36,17.9,190,3450,"female",2007
+"Adelie","Dream",42.3,21.2,191,4150,"male",2007
+"Adelie","Biscoe",39.6,17.7,186,3500,"female",2008
+"Adelie","Biscoe",40.1,18.9,188,4300,"male",2008
+"Adelie","Biscoe",35,17.9,190,3450,"female",2008
+"Adelie","Biscoe",42,19.5,200,4050,"male",2008
+"Adelie","Biscoe",34.5,18.1,187,2900,"female",2008
+"Adelie","Biscoe",41.4,18.6,191,3700,"male",2008
+"Adelie","Biscoe",39,17.5,186,3550,"female",2008
+"Adelie","Biscoe",40.6,18.8,193,3800,"male",2008
+"Adelie","Biscoe",36.5,16.6,181,2850,"female",2008
+"Adelie","Biscoe",37.6,19.1,194,3750,"male",2008
+"Adelie","Biscoe",35.7,16.9,185,3150,"female",2008
+"Adelie","Biscoe",41.3,21.1,195,4400,"male",2008
+"Adelie","Biscoe",37.6,17,185,3600,"female",2008
+"Adelie","Biscoe",41.1,18.2,192,4050,"male",2008
+"Adelie","Biscoe",36.4,17.1,184,2850,"female",2008
+"Adelie","Biscoe",41.6,18,192,3950,"male",2008
+"Adelie","Biscoe",35.5,16.2,195,3350,"female",2008
+"Adelie","Biscoe",41.1,19.1,188,4100,"male",2008
+"Adelie","Torgersen",35.9,16.6,190,3050,"female",2008
+"Adelie","Torgersen",41.8,19.4,198,4450,"male",2008
+"Adelie","Torgersen",33.5,19,190,3600,"female",2008
+"Adelie","Torgersen",39.7,18.4,190,3900,"male",2008
+"Adelie","Torgersen",39.6,17.2,196,3550,"female",2008
+"Adelie","Torgersen",45.8,18.9,197,4150,"male",2008
+"Adelie","Torgersen",35.5,17.5,190,3700,"female",2008
+"Adelie","Torgersen",42.8,18.5,195,4250,"male",2008
+"Adelie","Torgersen",40.9,16.8,191,3700,"female",2008
+"Adelie","Torgersen",37.2,19.4,184,3900,"male",2008
+"Adelie","Torgersen",36.2,16.1,187,3550,"female",2008
+"Adelie","Torgersen",42.1,19.1,195,4000,"male",2008
+"Adelie","Torgersen",34.6,17.2,189,3200,"female",2008
+"Adelie","Torgersen",42.9,17.6,196,4700,"male",2008
+"Adelie","Torgersen",36.7,18.8,187,3800,"female",2008
+"Adelie","Torgersen",35.1,19.4,193,4200,"male",2008
+"Adelie","Dream",37.3,17.8,191,3350,"female",2008
+"Adelie","Dream",41.3,20.3,194,3550,"male",2008
+"Adelie","Dream",36.3,19.5,190,3800,"male",2008
+"Adelie","Dream",36.9,18.6,189,3500,"female",2008
+"Adelie","Dream",38.3,19.2,189,3950,"male",2008
+"Adelie","Dream",38.9,18.8,190,3600,"female",2008
+"Adelie","Dream",35.7,18,202,3550,"female",2008
+"Adelie","Dream",41.1,18.1,205,4300,"male",2008
+"Adelie","Dream",34,17.1,185,3400,"female",2008
+"Adelie","Dream",39.6,18.1,186,4450,"male",2008
+"Adelie","Dream",36.2,17.3,187,3300,"female",2008
+"Adelie","Dream",40.8,18.9,208,4300,"male",2008
+"Adelie","Dream",38.1,18.6,190,3700,"female",2008
+"Adelie","Dream",40.3,18.5,196,4350,"male",2008
+"Adelie","Dream",33.1,16.1,178,2900,"female",2008
+"Adelie","Dream",43.2,18.5,192,4100,"male",2008
+"Adelie","Biscoe",35,17.9,192,3725,"female",2009
+"Adelie","Biscoe",41,20,203,4725,"male",2009
+"Adelie","Biscoe",37.7,16,183,3075,"female",2009
+"Adelie","Biscoe",37.8,20,190,4250,"male",2009
+"Adelie","Biscoe",37.9,18.6,193,2925,"female",2009
+"Adelie","Biscoe",39.7,18.9,184,3550,"male",2009
+"Adelie","Biscoe",38.6,17.2,199,3750,"female",2009
+"Adelie","Biscoe",38.2,20,190,3900,"male",2009
+"Adelie","Biscoe",38.1,17,181,3175,"female",2009
+"Adelie","Biscoe",43.2,19,197,4775,"male",2009
+"Adelie","Biscoe",38.1,16.5,198,3825,"female",2009
+"Adelie","Biscoe",45.6,20.3,191,4600,"male",2009
+"Adelie","Biscoe",39.7,17.7,193,3200,"female",2009
+"Adelie","Biscoe",42.2,19.5,197,4275,"male",2009
+"Adelie","Biscoe",39.6,20.7,191,3900,"female",2009
+"Adelie","Biscoe",42.7,18.3,196,4075,"male",2009
+"Adelie","Torgersen",38.6,17,188,2900,"female",2009
+"Adelie","Torgersen",37.3,20.5,199,3775,"male",2009
+"Adelie","Torgersen",35.7,17,189,3350,"female",2009
+"Adelie","Torgersen",41.1,18.6,189,3325,"male",2009
+"Adelie","Torgersen",36.2,17.2,187,3150,"female",2009
+"Adelie","Torgersen",37.7,19.8,198,3500,"male",2009
+"Adelie","Torgersen",40.2,17,176,3450,"female",2009
+"Adelie","Torgersen",41.4,18.5,202,3875,"male",2009
+"Adelie","Torgersen",35.2,15.9,186,3050,"female",2009
+"Adelie","Torgersen",40.6,19,199,4000,"male",2009
+"Adelie","Torgersen",38.8,17.6,191,3275,"female",2009
+"Adelie","Torgersen",41.5,18.3,195,4300,"male",2009
+"Adelie","Torgersen",39,17.1,191,3050,"female",2009
+"Adelie","Torgersen",44.1,18,210,4000,"male",2009
+"Adelie","Torgersen",38.5,17.9,190,3325,"female",2009
+"Adelie","Torgersen",43.1,19.2,197,3500,"male",2009
+"Adelie","Dream",36.8,18.5,193,3500,"female",2009
+"Adelie","Dream",37.5,18.5,199,4475,"male",2009
+"Adelie","Dream",38.1,17.6,187,3425,"female",2009
+"Adelie","Dream",41.1,17.5,190,3900,"male",2009
+"Adelie","Dream",35.6,17.5,191,3175,"female",2009
+"Adelie","Dream",40.2,20.1,200,3975,"male",2009
+"Adelie","Dream",37,16.5,185,3400,"female",2009
+"Adelie","Dream",39.7,17.9,193,4250,"male",2009
+"Adelie","Dream",40.2,17.1,193,3400,"female",2009
+"Adelie","Dream",40.6,17.2,187,3475,"male",2009
+"Adelie","Dream",32.1,15.5,188,3050,"female",2009
+"Adelie","Dream",40.7,17,190,3725,"male",2009
+"Adelie","Dream",37.3,16.8,192,3000,"female",2009
+"Adelie","Dream",39,18.7,185,3650,"male",2009
+"Adelie","Dream",39.2,18.6,190,4250,"male",2009
+"Adelie","Dream",36.6,18.4,184,3475,"female",2009
+"Adelie","Dream",36,17.8,195,3450,"female",2009
+"Adelie","Dream",37.8,18.1,193,3750,"male",2009
+"Adelie","Dream",36,17.1,187,3700,"female",2009
+"Adelie","Dream",41.5,18.5,201,4000,"male",2009
+"Gentoo","Biscoe",46.1,13.2,211,4500,"female",2007
+"Gentoo","Biscoe",50,16.3,230,5700,"male",2007
+"Gentoo","Biscoe",48.7,14.1,210,4450,"female",2007
+"Gentoo","Biscoe",50,15.2,218,5700,"male",2007
+"Gentoo","Biscoe",47.6,14.5,215,5400,"male",2007
+"Gentoo","Biscoe",46.5,13.5,210,4550,"female",2007
+"Gentoo","Biscoe",45.4,14.6,211,4800,"female",2007
+"Gentoo","Biscoe",46.7,15.3,219,5200,"male",2007
+"Gentoo","Biscoe",43.3,13.4,209,4400,"female",2007
+"Gentoo","Biscoe",46.8,15.4,215,5150,"male",2007
+"Gentoo","Biscoe",40.9,13.7,214,4650,"female",2007
+"Gentoo","Biscoe",49,16.1,216,5550,"male",2007
+"Gentoo","Biscoe",45.5,13.7,214,4650,"female",2007
+"Gentoo","Biscoe",48.4,14.6,213,5850,"male",2007
+"Gentoo","Biscoe",45.8,14.6,210,4200,"female",2007
+"Gentoo","Biscoe",49.3,15.7,217,5850,"male",2007
+"Gentoo","Biscoe",42,13.5,210,4150,"female",2007
+"Gentoo","Biscoe",49.2,15.2,221,6300,"male",2007
+"Gentoo","Biscoe",46.2,14.5,209,4800,"female",2007
+"Gentoo","Biscoe",48.7,15.1,222,5350,"male",2007
+"Gentoo","Biscoe",50.2,14.3,218,5700,"male",2007
+"Gentoo","Biscoe",45.1,14.5,215,5000,"female",2007
+"Gentoo","Biscoe",46.5,14.5,213,4400,"female",2007
+"Gentoo","Biscoe",46.3,15.8,215,5050,"male",2007
+"Gentoo","Biscoe",42.9,13.1,215,5000,"female",2007
+"Gentoo","Biscoe",46.1,15.1,215,5100,"male",2007
+"Gentoo","Biscoe",44.5,14.3,216,4100,NA,2007
+"Gentoo","Biscoe",47.8,15,215,5650,"male",2007
+"Gentoo","Biscoe",48.2,14.3,210,4600,"female",2007
+"Gentoo","Biscoe",50,15.3,220,5550,"male",2007
+"Gentoo","Biscoe",47.3,15.3,222,5250,"male",2007
+"Gentoo","Biscoe",42.8,14.2,209,4700,"female",2007
+"Gentoo","Biscoe",45.1,14.5,207,5050,"female",2007
+"Gentoo","Biscoe",59.6,17,230,6050,"male",2007
+"Gentoo","Biscoe",49.1,14.8,220,5150,"female",2008
+"Gentoo","Biscoe",48.4,16.3,220,5400,"male",2008
+"Gentoo","Biscoe",42.6,13.7,213,4950,"female",2008
+"Gentoo","Biscoe",44.4,17.3,219,5250,"male",2008
+"Gentoo","Biscoe",44,13.6,208,4350,"female",2008
+"Gentoo","Biscoe",48.7,15.7,208,5350,"male",2008
+"Gentoo","Biscoe",42.7,13.7,208,3950,"female",2008
+"Gentoo","Biscoe",49.6,16,225,5700,"male",2008
+"Gentoo","Biscoe",45.3,13.7,210,4300,"female",2008
+"Gentoo","Biscoe",49.6,15,216,4750,"male",2008
+"Gentoo","Biscoe",50.5,15.9,222,5550,"male",2008
+"Gentoo","Biscoe",43.6,13.9,217,4900,"female",2008
+"Gentoo","Biscoe",45.5,13.9,210,4200,"female",2008
+"Gentoo","Biscoe",50.5,15.9,225,5400,"male",2008
+"Gentoo","Biscoe",44.9,13.3,213,5100,"female",2008
+"Gentoo","Biscoe",45.2,15.8,215,5300,"male",2008
+"Gentoo","Biscoe",46.6,14.2,210,4850,"female",2008
+"Gentoo","Biscoe",48.5,14.1,220,5300,"male",2008
+"Gentoo","Biscoe",45.1,14.4,210,4400,"female",2008
+"Gentoo","Biscoe",50.1,15,225,5000,"male",2008
+"Gentoo","Biscoe",46.5,14.4,217,4900,"female",2008
+"Gentoo","Biscoe",45,15.4,220,5050,"male",2008
+"Gentoo","Biscoe",43.8,13.9,208,4300,"female",2008
+"Gentoo","Biscoe",45.5,15,220,5000,"male",2008
+"Gentoo","Biscoe",43.2,14.5,208,4450,"female",2008
+"Gentoo","Biscoe",50.4,15.3,224,5550,"male",2008
+"Gentoo","Biscoe",45.3,13.8,208,4200,"female",2008
+"Gentoo","Biscoe",46.2,14.9,221,5300,"male",2008
+"Gentoo","Biscoe",45.7,13.9,214,4400,"female",2008
+"Gentoo","Biscoe",54.3,15.7,231,5650,"male",2008
+"Gentoo","Biscoe",45.8,14.2,219,4700,"female",2008
+"Gentoo","Biscoe",49.8,16.8,230,5700,"male",2008
+"Gentoo","Biscoe",46.2,14.4,214,4650,NA,2008
+"Gentoo","Biscoe",49.5,16.2,229,5800,"male",2008
+"Gentoo","Biscoe",43.5,14.2,220,4700,"female",2008
+"Gentoo","Biscoe",50.7,15,223,5550,"male",2008
+"Gentoo","Biscoe",47.7,15,216,4750,"female",2008
+"Gentoo","Biscoe",46.4,15.6,221,5000,"male",2008
+"Gentoo","Biscoe",48.2,15.6,221,5100,"male",2008
+"Gentoo","Biscoe",46.5,14.8,217,5200,"female",2008
+"Gentoo","Biscoe",46.4,15,216,4700,"female",2008
+"Gentoo","Biscoe",48.6,16,230,5800,"male",2008
+"Gentoo","Biscoe",47.5,14.2,209,4600,"female",2008
+"Gentoo","Biscoe",51.1,16.3,220,6000,"male",2008
+"Gentoo","Biscoe",45.2,13.8,215,4750,"female",2008
+"Gentoo","Biscoe",45.2,16.4,223,5950,"male",2008
+"Gentoo","Biscoe",49.1,14.5,212,4625,"female",2009
+"Gentoo","Biscoe",52.5,15.6,221,5450,"male",2009
+"Gentoo","Biscoe",47.4,14.6,212,4725,"female",2009
+"Gentoo","Biscoe",50,15.9,224,5350,"male",2009
+"Gentoo","Biscoe",44.9,13.8,212,4750,"female",2009
+"Gentoo","Biscoe",50.8,17.3,228,5600,"male",2009
+"Gentoo","Biscoe",43.4,14.4,218,4600,"female",2009
+"Gentoo","Biscoe",51.3,14.2,218,5300,"male",2009
+"Gentoo","Biscoe",47.5,14,212,4875,"female",2009
+"Gentoo","Biscoe",52.1,17,230,5550,"male",2009
+"Gentoo","Biscoe",47.5,15,218,4950,"female",2009
+"Gentoo","Biscoe",52.2,17.1,228,5400,"male",2009
+"Gentoo","Biscoe",45.5,14.5,212,4750,"female",2009
+"Gentoo","Biscoe",49.5,16.1,224,5650,"male",2009
+"Gentoo","Biscoe",44.5,14.7,214,4850,"female",2009
+"Gentoo","Biscoe",50.8,15.7,226,5200,"male",2009
+"Gentoo","Biscoe",49.4,15.8,216,4925,"male",2009
+"Gentoo","Biscoe",46.9,14.6,222,4875,"female",2009
+"Gentoo","Biscoe",48.4,14.4,203,4625,"female",2009
+"Gentoo","Biscoe",51.1,16.5,225,5250,"male",2009
+"Gentoo","Biscoe",48.5,15,219,4850,"female",2009
+"Gentoo","Biscoe",55.9,17,228,5600,"male",2009
+"Gentoo","Biscoe",47.2,15.5,215,4975,"female",2009
+"Gentoo","Biscoe",49.1,15,228,5500,"male",2009
+"Gentoo","Biscoe",47.3,13.8,216,4725,NA,2009
+"Gentoo","Biscoe",46.8,16.1,215,5500,"male",2009
+"Gentoo","Biscoe",41.7,14.7,210,4700,"female",2009
+"Gentoo","Biscoe",53.4,15.8,219,5500,"male",2009
+"Gentoo","Biscoe",43.3,14,208,4575,"female",2009
+"Gentoo","Biscoe",48.1,15.1,209,5500,"male",2009
+"Gentoo","Biscoe",50.5,15.2,216,5000,"female",2009
+"Gentoo","Biscoe",49.8,15.9,229,5950,"male",2009
+"Gentoo","Biscoe",43.5,15.2,213,4650,"female",2009
+"Gentoo","Biscoe",51.5,16.3,230,5500,"male",2009
+"Gentoo","Biscoe",46.2,14.1,217,4375,"female",2009
+"Gentoo","Biscoe",55.1,16,230,5850,"male",2009
+"Gentoo","Biscoe",44.5,15.7,217,4875,NA,2009
+"Gentoo","Biscoe",48.8,16.2,222,6000,"male",2009
+"Gentoo","Biscoe",47.2,13.7,214,4925,"female",2009
+"Gentoo","Biscoe",NA,NA,NA,NA,NA,2009
+"Gentoo","Biscoe",46.8,14.3,215,4850,"female",2009
+"Gentoo","Biscoe",50.4,15.7,222,5750,"male",2009
+"Gentoo","Biscoe",45.2,14.8,212,5200,"female",2009
+"Gentoo","Biscoe",49.9,16.1,213,5400,"male",2009
+"Chinstrap","Dream",46.5,17.9,192,3500,"female",2007
+"Chinstrap","Dream",50,19.5,196,3900,"male",2007
+"Chinstrap","Dream",51.3,19.2,193,3650,"male",2007
+"Chinstrap","Dream",45.4,18.7,188,3525,"female",2007
+"Chinstrap","Dream",52.7,19.8,197,3725,"male",2007
+"Chinstrap","Dream",45.2,17.8,198,3950,"female",2007
+"Chinstrap","Dream",46.1,18.2,178,3250,"female",2007
+"Chinstrap","Dream",51.3,18.2,197,3750,"male",2007
+"Chinstrap","Dream",46,18.9,195,4150,"female",2007
+"Chinstrap","Dream",51.3,19.9,198,3700,"male",2007
+"Chinstrap","Dream",46.6,17.8,193,3800,"female",2007
+"Chinstrap","Dream",51.7,20.3,194,3775,"male",2007
+"Chinstrap","Dream",47,17.3,185,3700,"female",2007
+"Chinstrap","Dream",52,18.1,201,4050,"male",2007
+"Chinstrap","Dream",45.9,17.1,190,3575,"female",2007
+"Chinstrap","Dream",50.5,19.6,201,4050,"male",2007
+"Chinstrap","Dream",50.3,20,197,3300,"male",2007
+"Chinstrap","Dream",58,17.8,181,3700,"female",2007
+"Chinstrap","Dream",46.4,18.6,190,3450,"female",2007
+"Chinstrap","Dream",49.2,18.2,195,4400,"male",2007
+"Chinstrap","Dream",42.4,17.3,181,3600,"female",2007
+"Chinstrap","Dream",48.5,17.5,191,3400,"male",2007
+"Chinstrap","Dream",43.2,16.6,187,2900,"female",2007
+"Chinstrap","Dream",50.6,19.4,193,3800,"male",2007
+"Chinstrap","Dream",46.7,17.9,195,3300,"female",2007
+"Chinstrap","Dream",52,19,197,4150,"male",2007
+"Chinstrap","Dream",50.5,18.4,200,3400,"female",2008
+"Chinstrap","Dream",49.5,19,200,3800,"male",2008
+"Chinstrap","Dream",46.4,17.8,191,3700,"female",2008
+"Chinstrap","Dream",52.8,20,205,4550,"male",2008
+"Chinstrap","Dream",40.9,16.6,187,3200,"female",2008
+"Chinstrap","Dream",54.2,20.8,201,4300,"male",2008
+"Chinstrap","Dream",42.5,16.7,187,3350,"female",2008
+"Chinstrap","Dream",51,18.8,203,4100,"male",2008
+"Chinstrap","Dream",49.7,18.6,195,3600,"male",2008
+"Chinstrap","Dream",47.5,16.8,199,3900,"female",2008
+"Chinstrap","Dream",47.6,18.3,195,3850,"female",2008
+"Chinstrap","Dream",52,20.7,210,4800,"male",2008
+"Chinstrap","Dream",46.9,16.6,192,2700,"female",2008
+"Chinstrap","Dream",53.5,19.9,205,4500,"male",2008
+"Chinstrap","Dream",49,19.5,210,3950,"male",2008
+"Chinstrap","Dream",46.2,17.5,187,3650,"female",2008
+"Chinstrap","Dream",50.9,19.1,196,3550,"male",2008
+"Chinstrap","Dream",45.5,17,196,3500,"female",2008
+"Chinstrap","Dream",50.9,17.9,196,3675,"female",2009
+"Chinstrap","Dream",50.8,18.5,201,4450,"male",2009
+"Chinstrap","Dream",50.1,17.9,190,3400,"female",2009
+"Chinstrap","Dream",49,19.6,212,4300,"male",2009
+"Chinstrap","Dream",51.5,18.7,187,3250,"male",2009
+"Chinstrap","Dream",49.8,17.3,198,3675,"female",2009
+"Chinstrap","Dream",48.1,16.4,199,3325,"female",2009
+"Chinstrap","Dream",51.4,19,201,3950,"male",2009
+"Chinstrap","Dream",45.7,17.3,193,3600,"female",2009
+"Chinstrap","Dream",50.7,19.7,203,4050,"male",2009
+"Chinstrap","Dream",42.5,17.3,187,3350,"female",2009
+"Chinstrap","Dream",52.2,18.8,197,3450,"male",2009
+"Chinstrap","Dream",45.2,16.6,191,3250,"female",2009
+"Chinstrap","Dream",49.3,19.9,203,4050,"male",2009
+"Chinstrap","Dream",50.2,18.8,202,3800,"male",2009
+"Chinstrap","Dream",45.6,19.4,194,3525,"female",2009
+"Chinstrap","Dream",51.9,19.5,206,3950,"male",2009
+"Chinstrap","Dream",46.8,16.5,189,3650,"female",2009
+"Chinstrap","Dream",45.7,17,195,3650,"female",2009
+"Chinstrap","Dream",55.8,19.8,207,4000,"male",2009
+"Chinstrap","Dream",43.5,18.1,202,3400,"female",2009
+"Chinstrap","Dream",49.6,18.2,193,3775,"male",2009
+"Chinstrap","Dream",50.8,19,210,4100,"male",2009
+"Chinstrap","Dream",50.2,18.7,198,3775,"female",2009
\ No newline at end of file
diff --git a/notebooks/machine-learning/files/stellar-objects.csv b/notebooks/machine-learning/files/stellar-objects.csv
new file mode 100644
index 0000000000000000000000000000000000000000..787741ebef15123f0d05020b21c6b6a85b3a0586
--- /dev/null
+++ b/notebooks/machine-learning/files/stellar-objects.csv
@@ -0,0 +1,97 @@
+object	distance	v_helio	v_flow	v_cmb
+Abell 1367	89.2		6845	
+Abell 194	55.9		5208	
+Abell 262	66.7		5091	
+Abell 2634	114.9		9142	
+Abell 33816	129.8		11436	
+Abell 3574	51.6		4617	
+Abell 400	88.4		6983	
+Abell 539	102.0		8648	
+Abell S639	59.6		6577	
+Abell S753	49.7		3973	
+Antlia	45.1		2821	
+Cancer	74.3		4942	
+Cen 30	43.2		4445	
+Cen 45	68.2		4408	
+Coma	85.8		7392	
+DC 2345-28	102.1		8708	
+Dorado	13.8		1064	
+Eridanus	20.7		1627	
+ESO 50	39.5		2896	
+Fornax	15.0		1372	
+GRM 4530	47.4		4848	
+Hydra	58.3		3881	
+Hydra I	49.1		3881	
+IC 0429	55.5		3341	
+IC 4182	4.49	321		
+MDL 59	31.3		2664	
+NGC 0300	2	144		
+NGC 0708	68.2		4831	
+NGC 0925	9.16	553		
+NGC 1326A	16.14	1836		
+NGC 1365	17.95	1636		
+NGC 1425	21.88	1512		
+NGC 2090	11.75	931		
+NGC 2403	3.22	131		
+NGC 2541	11.22	559		
+NGC 3031	3.63	-34		
+NGC 3198	13.8	662		
+NGC 3351	10	778		
+NGC 3368	10.52	897		
+NGC 3557	38.7		2957	
+NGC 3621	6.64	805		
+NGC 383	66.6		5326	
+NGC 4321	15.21	1571		
+NGC 4373	36.3		3118	
+NGC 4414	17.7	716		
+NGC 4496A	14.86	1730		
+NGC 4535	15.78	1961		
+NGC 4536	14.93	1804		
+NGC 4548	16.22	486		
+NGC 4639	21.98	1010		
+NGC 4725	12.36	1206		
+NGC 4881	102.3		7441	
+NGC 507	57.3		5257	
+NGC 5193	51.5		3468	
+NGC 5253	3.15	404		
+NGC 7014	67.3		5061	
+NGC 7731	14.72	816		
+Pavo 2	50.9		4646	
+Pegasus	53.3		3874	
+SN1990af	198.6			15055
+SN1990O	134.7			9065
+SN1990T	158.9			12012
+SN1991ag	56.0			4124
+SN1991S	238.9			16687
+SN1991U	117.1			9801
+SN1992ae	274.6			22426
+SN1992ag	102.1			7765
+SN1992al	58.0			4227
+SN1992aq	467.0			30253
+SN1992au	262.2			18212
+SN1992bc	88.6			5935
+SN1992bg	151.4			10696
+SN1992bh	202.5			13518
+SN1992bk	235.9			17371
+SN1992bl	176.8			12871
+SN1992bo	77.9			5434
+SN1992bp	309.5			23646
+SN1992br	391.5			26318
+SN1992bs	280.1			18997
+SN1992J	183.9			13707
+SN1992P	121.5			7880
+SN1993ac	202.3			14764
+SN1993ae	71.8			5424
+SN1993ag	215.4			15002
+SN1993ah	119.7			8604
+SN1993B	303.4			21190
+SN1993O	236.1			15567
+SN1994M	96.7			7241
+SN1994Q	127.8			8691
+SN1994S	66.8			4847
+SN1994T	149.9			10715
+SN1995ac	185.6			14634
+SN1995ak	82.4			6673
+SN1996bl	132.7			10446
+SN1996C	136.0			9024
+Ursa Major	19.8		1088	
diff --git a/notebooks/machine-learning/images/bill-dimensions.png b/notebooks/machine-learning/images/bill-dimensions.png
new file mode 100644
index 0000000000000000000000000000000000000000..10025a39c6f0cdb3d404701722ded32b594ab82c
Binary files /dev/null and b/notebooks/machine-learning/images/bill-dimensions.png differ
diff --git a/notebooks/machine-learning/images/birth-barplot.png b/notebooks/machine-learning/images/birth-barplot.png
new file mode 100644
index 0000000000000000000000000000000000000000..00e1c3af8a4a6b2d28944a475b6ba5dea84cce33
Binary files /dev/null and b/notebooks/machine-learning/images/birth-barplot.png differ
diff --git a/notebooks/machine-learning/images/birth-histogram.png b/notebooks/machine-learning/images/birth-histogram.png
new file mode 100644
index 0000000000000000000000000000000000000000..d2ec8066bdbce74b973f068cf234ba75e66aa2db
Binary files /dev/null and b/notebooks/machine-learning/images/birth-histogram.png differ
diff --git a/notebooks/machine-learning/images/color-brewer.png b/notebooks/machine-learning/images/color-brewer.png
new file mode 100644
index 0000000000000000000000000000000000000000..754b63b39307e3d37d2e0b9a55c303ed9812a0dc
Binary files /dev/null and b/notebooks/machine-learning/images/color-brewer.png differ
diff --git a/notebooks/machine-learning/images/davis-wh20.png b/notebooks/machine-learning/images/davis-wh20.png
new file mode 100644
index 0000000000000000000000000000000000000000..dd1b0e2476dd35415e1025e6144ab462293c4d6b
Binary files /dev/null and b/notebooks/machine-learning/images/davis-wh20.png differ
diff --git a/notebooks/machine-learning/images/davis-wh40.png b/notebooks/machine-learning/images/davis-wh40.png
new file mode 100644
index 0000000000000000000000000000000000000000..ed755346b0b5c925244846c6ff29a4b51bec07dd
Binary files /dev/null and b/notebooks/machine-learning/images/davis-wh40.png differ
diff --git a/notebooks/machine-learning/images/evolution-autism.png b/notebooks/machine-learning/images/evolution-autism.png
new file mode 100644
index 0000000000000000000000000000000000000000..4b51f690a62237e7899a6bd662efa6126ef9b407
Binary files /dev/null and b/notebooks/machine-learning/images/evolution-autism.png differ
diff --git a/notebooks/machine-learning/images/penguin-dimensions.png b/notebooks/machine-learning/images/penguin-dimensions.png
new file mode 100644
index 0000000000000000000000000000000000000000..787bc63250fa8a561afa043cb39a7009c6e41f44
Binary files /dev/null and b/notebooks/machine-learning/images/penguin-dimensions.png differ
diff --git a/notebooks/machine-learning/images/penguin-tree.png b/notebooks/machine-learning/images/penguin-tree.png
new file mode 100644
index 0000000000000000000000000000000000000000..e33408c8be233cbaf23a0fa6ea72b496ef96b07e
Binary files /dev/null and b/notebooks/machine-learning/images/penguin-tree.png differ
diff --git a/notebooks/machine-learning/images/penguins.png b/notebooks/machine-learning/images/penguins.png
new file mode 100644
index 0000000000000000000000000000000000000000..736ae89b686339afd6af80214a7d7a05b64ef64e
Binary files /dev/null and b/notebooks/machine-learning/images/penguins.png differ
diff --git a/notebooks/machine-learning/stellar-objects.ipynb b/notebooks/machine-learning/stellar-objects.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..816a825d12ff8d25388b1930e8527ccaf7dd0d81
--- /dev/null
+++ b/notebooks/machine-learning/stellar-objects.ipynb
@@ -0,0 +1,86 @@
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4bbfb43b-1feb-4366-b1e7-5536f0f5aacd",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import matplotlib.pyplot as plt\n",
+    "import pandas as pd\n",
+    "import seaborn as sns"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1cf3ab56-418f-46e3-bc3f-36cf0eec0dbf",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# distance: in megaparsecs (Mpc)\n",
+    "# velocity: in km/s\n",
+    "df = pd.read_csv(\"./files/stellar-objects.csv\", sep=\"\\t\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f6e8f95c-1da6-4d6f-b8c0-a6aa5cdddc2d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df[\"velocity\"] = df.v_helio.fillna(df.v_flow).fillna(df.v_cmb)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "8396a4ef-9f1f-425a-886e-8d2bf3d979ee",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "plt.title(\"Relation between distance and velocity of stellar objects\")\n",
+    "plt.xlabel(\"Distance (Mpc)\")\n",
+    "plt.ylabel(\"Velocity (km/s)\")\n",
+    "\n",
+    "# sns.scatterplot(data=df, x=\"distance\", y=\"velocity\", color=\"orange\")  # points only, no fitted line\n",
+    "sns.regplot(data=df, x=\"distance\", y=\"velocity\", color=\"orange\")\n",
+    "\n",
+    "sns.despine()\n",
+    "\n",
+    "plt.show()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6d94f18a-29ec-4da1-b542-b498e3017d2d",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}