Commit e30704ff authored by Céline Meillier

Remove unnecessary cells

parent 3fea3f0f
......@@ -748,181 +748,21 @@
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"**Names of the stored variables** "
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"slideshow": {
"slide_type": "fragment"
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Index(['director_name', 'num_critic_for_reviews', 'duration', 'actor_1_name',\n",
" 'actor_2_name', 'num_voted_users', 'facenumber_in_poster',\n",
" 'num_user_for_reviews', 'language', 'country', 'content_rating',\n",
" 'budget', 'title_year', 'imdb_score'],\n",
" dtype='object')\n"
]
}
],
"source": [
"print(DATA.keys())"
"# 3. The different types of data"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"#### Selecting variables with the `DataFrame` tool \n",
"\n",
"The `pandas` library provides a `DataFrame` class with a large number of functions dedicated to analysing data tables. In the example below, two quantitative variables are extracted from the dataset, the score and the budget, for instance to study the relationship between them. "
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"scrolled": true,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>imdb_score</th>\n",
" <th>budget</th>\n",
" </tr>\n",
" <tr>\n",
" <th>movie_title</th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>The Shawshank Redemption</th>\n",
" <td>9.3</td>\n",
" <td>25000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>The Dark Knight</th>\n",
" <td>9.0</td>\n",
" <td>185000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Inception</th>\n",
" <td>8.8</td>\n",
" <td>160000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Fight Club</th>\n",
" <td>8.8</td>\n",
" <td>63000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Pulp Fiction</th>\n",
" <td>8.9</td>\n",
" <td>8000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>Forrest Gump</th>\n",
" <td>8.8</td>\n",
" <td>55000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>The Lord of the Rings: The Fellowship of the Ring</th>\n",
" <td>8.8</td>\n",
" <td>93000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>The Matrix</th>\n",
" <td>8.7</td>\n",
" <td>63000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>The Lord of the Rings: The Return of the King</th>\n",
" <td>8.9</td>\n",
" <td>94000000</td>\n",
" </tr>\n",
" <tr>\n",
" <th>The Godfather</th>\n",
" <td>9.2</td>\n",
" <td>6000000</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" imdb_score budget\n",
"movie_title \n",
"The Shawshank Redemption 9.3 25000000\n",
"The Dark Knight 9.0 185000000\n",
"Inception 8.8 160000000\n",
"Fight Club 8.8 63000000\n",
"Pulp Fiction 8.9 8000000\n",
"Forrest Gump 8.8 55000000\n",
"The Lord of the Rings: The Fellowship of the Ring 8.8 93000000\n",
"The Matrix 8.7 63000000\n",
"The Lord of the Rings: The Return of the King 8.9 94000000\n",
"The Godfather 9.2 6000000"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.DataFrame(DATA, columns=['imdb_score', 'budget'])\n",
"df.head(n=10)  # displays the first 10 rows of the data table for the selected columns"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# 3. The different types of data\n",
"\n",
"### 3.1. Quantitative data\n",
"\n",
"These are measurable quantities on which calculations and comparisons can be performed: $ f(x), \\div, \\times, =, \\neq, \\leqslant, \\geqslant$. \n",
......@@ -942,7 +782,7 @@
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
"slide_type": "subslide"
}
},
"source": [
......@@ -974,13 +814,6 @@
"\n",
"Book: [Data science : fondamentaux et études de cas - Machine learning avec Python et R](https://www.eyrolles.com/Chapitres/9782212142433/9782212142433.pdf)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
......
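For readers who want to reproduce what the removed column-selection cell did, here is a minimal sketch. It assumes `DATA` is a `pandas` `DataFrame` indexed by `movie_title`, as the removed outputs suggest. The three rows are copied from the removed cell's output; the `title_year` values are not shown in the diff and are added here only for illustration, so that there is a column to leave out.

```python
import pandas as pd

# Stand-in for the notebook's DATA (assumption: a DataFrame indexed by movie_title).
# imdb_score and budget come from the removed cell's output; title_year is
# illustrative only and does not appear in the diff.
DATA = pd.DataFrame(
    {
        "imdb_score": [9.3, 9.0, 8.8],
        "budget": [25000000, 185000000, 160000000],
        "title_year": [1994, 2008, 2010],
    },
    index=pd.Index(
        ["The Shawshank Redemption", "The Dark Knight", "Inception"],
        name="movie_title",
    ),
)

# Same call as in the removed cell: keep only the two quantitative columns.
df = pd.DataFrame(DATA, columns=["imdb_score", "budget"])

# Equivalent selection by indexing with a list of column names.
df_alt = DATA[["imdb_score", "budget"]]

# Show the first rows of the selected columns.
print(df.head(n=10))
```

Indexing with a list of column names (`DATA[[...]]`) is the more common idiom; the constructor form is kept because it is what the removed cell used.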