mirror of
https://github.com/elastic/eland.git
synced 2025-07-11 00:02:14 +08:00
* Resolving inconsistent __repr__ test on python 3.5 * Fixing layout for info_es + adding Series.hist doc
1466 lines
62 KiB
Plaintext
{
|
||
"cells": [
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"import eland as ed\n",
|
||
"import pandas as pd\n",
|
||
"import numpy as np\n",
|
||
"import matplotlib.pyplot as plt\n",
|
||
"\n",
|
||
"# Fix console size for consistent test results\n",
|
||
"from eland.conftest import *"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Online Retail Analysis"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Getting Started\n",
|
||
"\n",
|
||
"To get started, let's create an `eland.DataFrame` by reading a csv file. This creates and populates the \n",
|
||
"`online-retail` index in the local Elasticsearch cluster."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 2,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"df = ed.read_csv(\"data/online-retail.csv.gz\",\n",
|
||
" es_client='localhost', \n",
|
||
" es_dest_index='online-retail', \n",
|
||
" es_if_exists='replace', \n",
|
||
" es_dropna=True,\n",
|
||
" es_refresh=True,\n",
|
||
" compression='gzip')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Here we see that the `\"_id\"` field was used to index our data frame. "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"'_id'"
|
||
]
|
||
},
|
||
"execution_count": 3,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df.index.index_field"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Next, we can check which field from elasticsearch are available to our eland data frame. `columns` is available as a parameter when instantiating the data frame which allows one to choose only a subset of fields from your index to be included in the data frame. Since we didn't set this parameter, we have access to all fields."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Index(['Country', 'CustomerID', 'Description', 'InvoiceDate', 'InvoiceNo', 'Quantity', 'StockCode',\n",
|
||
" 'UnitPrice', 'Unnamed: 0'],\n",
|
||
" dtype='object')"
|
||
]
|
||
},
|
||
"execution_count": 4,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df.columns"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Now, let's see the data types of our fields. Running `df.dtypes`, we can see that elasticsearch field types are mapped to pandas field types."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 5,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Country object\n",
|
||
"CustomerID float64\n",
|
||
"Description object\n",
|
||
"InvoiceDate object\n",
|
||
"InvoiceNo object\n",
|
||
"Quantity int64\n",
|
||
"StockCode object\n",
|
||
"UnitPrice float64\n",
|
||
"Unnamed: 0 int64\n",
|
||
"dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 5,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df.dtypes"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"We also offer a `.info_es()` data frame method that shows all info about the underlying index. It also contains information about operations being passed from data frame methods to elasticsearch. More on this later."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 6,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"index_pattern: online-retail\n",
|
||
"Index:\n",
|
||
" index_field: _id\n",
|
||
" is_source_field: False\n",
|
||
"Mappings:\n",
|
||
" capabilities:\n",
|
||
" _source es_dtype pd_dtype searchable aggregatable\n",
|
||
"Country True keyword object True True\n",
|
||
"CustomerID True double float64 True True\n",
|
||
"Description True keyword object True True\n",
|
||
"InvoiceDate True keyword object True True\n",
|
||
"InvoiceNo True keyword object True True\n",
|
||
"Quantity True long int64 True True\n",
|
||
"StockCode True keyword object True True\n",
|
||
"UnitPrice True double float64 True True\n",
|
||
"Unnamed: 0 True long int64 True True\n",
|
||
" date_fields_format: {}\n",
|
||
"Operations:\n",
|
||
" tasks: []\n",
|
||
" size: None\n",
|
||
" sort_params: None\n",
|
||
" _source: None\n",
|
||
" body: {}\n",
|
||
" post_processing: []\n",
|
||
"'field_to_display_names': {}\n",
|
||
"'display_to_field_names': {}\n",
|
||
"\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(df.info_es())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Selecting and Indexing Data\n",
|
||
"\n",
|
||
"Now that we understand how to create a data frame and get access to it's underlying attributes, let's see how we can select subsets of our data."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"### head and tail\n",
|
||
"\n",
|
||
"much like pandas, eland data frames offer `.head(n)` and `.tail(n)` methods that return the first and last n rows, respectively."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 7,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Country</th>\n",
|
||
" <th>CustomerID</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" <th>Unnamed: 0</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>1000</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14729.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" <td>1000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1001</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14729.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" <td>1001</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>\n",
|
||
"<p>2 rows × 9 columns</p>"
|
||
],
|
||
"text/plain": [
|
||
" Country CustomerID ... UnitPrice Unnamed: 0\n",
|
||
"1000 United Kingdom 14729.0 ... 1.25 1000\n",
|
||
"1001 United Kingdom 14729.0 ... 1.25 1001\n",
|
||
"\n",
|
||
"[2 rows x 9 columns]"
|
||
]
|
||
},
|
||
"execution_count": 7,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df.head(2)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"index_pattern: online-retail\n",
|
||
"Index:\n",
|
||
" index_field: _id\n",
|
||
" is_source_field: False\n",
|
||
"Mappings:\n",
|
||
" capabilities:\n",
|
||
" _source es_dtype pd_dtype searchable aggregatable\n",
|
||
"Country True keyword object True True\n",
|
||
"CustomerID True double float64 True True\n",
|
||
"Description True keyword object True True\n",
|
||
"InvoiceDate True keyword object True True\n",
|
||
"InvoiceNo True keyword object True True\n",
|
||
"Quantity True long int64 True True\n",
|
||
"StockCode True keyword object True True\n",
|
||
"UnitPrice True double float64 True True\n",
|
||
"Unnamed: 0 True long int64 True True\n",
|
||
" date_fields_format: {}\n",
|
||
"Operations:\n",
|
||
" tasks: [('tail': ('sort_field': '_doc', 'count': 2)), ('head': ('sort_field': '_doc', 'count': 2)), ('tail': ('sort_field': '_doc', 'count': 2))]\n",
|
||
" size: 2\n",
|
||
" sort_params: _doc:desc\n",
|
||
" _source: None\n",
|
||
" body: {}\n",
|
||
" post_processing: [('sort_index'), ('head': ('count': 2)), ('tail': ('count': 2))]\n",
|
||
"'field_to_display_names': {}\n",
|
||
"'display_to_field_names': {}\n",
|
||
"\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(df.tail(2).head(2).tail(2).info_es())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 9,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Country</th>\n",
|
||
" <th>CustomerID</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" <th>Unnamed: 0</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>14998</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>17419.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" <td>14998</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14999</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>17419.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>2.10</td>\n",
|
||
" <td>14999</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>\n",
|
||
"<p>2 rows × 9 columns</p>"
|
||
],
|
||
"text/plain": [
|
||
" Country CustomerID ... UnitPrice Unnamed: 0\n",
|
||
"14998 United Kingdom 17419.0 ... 1.25 14998\n",
|
||
"14999 United Kingdom 17419.0 ... 2.10 14999\n",
|
||
"\n",
|
||
"[2 rows x 9 columns]"
|
||
]
|
||
},
|
||
"execution_count": 9,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df.tail(2)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"### selecting columns\n",
|
||
"\n",
|
||
"you can also pass a list of columns to select columns from the data frame in a specified order."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Country</th>\n",
|
||
" <th>InvoiceDate</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>1000</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>2010-12-01 12:43:00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1001</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>2010-12-01 12:43:00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1002</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>2010-12-01 12:43:00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1003</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>2010-12-01 12:43:00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1004</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>2010-12-01 12:43:00</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>\n",
|
||
"<p>5 rows × 2 columns</p>"
|
||
],
|
||
"text/plain": [
|
||
" Country InvoiceDate\n",
|
||
"1000 United Kingdom 2010-12-01 12:43:00\n",
|
||
"1001 United Kingdom 2010-12-01 12:43:00\n",
|
||
"1002 United Kingdom 2010-12-01 12:43:00\n",
|
||
"1003 United Kingdom 2010-12-01 12:43:00\n",
|
||
"1004 United Kingdom 2010-12-01 12:43:00\n",
|
||
"\n",
|
||
"[5 rows x 2 columns]"
|
||
]
|
||
},
|
||
"execution_count": 10,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df[['Country', 'InvoiceDate']].head(5)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"### Boolean Indexing\n",
|
||
"\n",
|
||
"we also allow you to filter the data frame using boolean indexing. Under the hood, a boolean index maps to a `terms` query that is then passed to elasticsearch to filter the index."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 11,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"{'term': {'Country': 'Germany'}}\n"
|
||
]
|
||
},
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Country</th>\n",
|
||
" <th>CustomerID</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" <th>Unnamed: 0</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>1109</th>\n",
|
||
" <td>Germany</td>\n",
|
||
" <td>12662.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>2.95</td>\n",
|
||
" <td>1109</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1110</th>\n",
|
||
" <td>Germany</td>\n",
|
||
" <td>12662.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>2.55</td>\n",
|
||
" <td>1110</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1111</th>\n",
|
||
" <td>Germany</td>\n",
|
||
" <td>12662.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0.85</td>\n",
|
||
" <td>1111</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1112</th>\n",
|
||
" <td>Germany</td>\n",
|
||
" <td>12662.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>1.65</td>\n",
|
||
" <td>1112</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1113</th>\n",
|
||
" <td>Germany</td>\n",
|
||
" <td>12662.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>1.95</td>\n",
|
||
" <td>1113</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>\n",
|
||
"<p>5 rows × 9 columns</p>"
|
||
],
|
||
"text/plain": [
|
||
" Country CustomerID ... UnitPrice Unnamed: 0\n",
|
||
"1109 Germany 12662.0 ... 2.95 1109\n",
|
||
"1110 Germany 12662.0 ... 2.55 1110\n",
|
||
"1111 Germany 12662.0 ... 0.85 1111\n",
|
||
"1112 Germany 12662.0 ... 1.65 1112\n",
|
||
"1113 Germany 12662.0 ... 1.95 1113\n",
|
||
"\n",
|
||
"[5 rows x 9 columns]"
|
||
]
|
||
},
|
||
"execution_count": 11,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# the construction of a boolean vector maps directly to an elasticsearch query\n",
|
||
"print(df['Country']=='Germany')\n",
|
||
"df[(df['Country']=='Germany')].head(5)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"we can also filter the data frame using a list of values."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 12,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"{'terms': {'Country': ['Germany', 'United States']}}\n"
|
||
]
|
||
},
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Country</th>\n",
|
||
" <th>CustomerID</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" <th>Unnamed: 0</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>1000</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14729.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" <td>1000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1001</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14729.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" <td>1001</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1002</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14729.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" <td>1002</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1003</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14729.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" <td>1003</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1004</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14729.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0.29</td>\n",
|
||
" <td>1004</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>\n",
|
||
"<p>5 rows × 9 columns</p>"
|
||
],
|
||
"text/plain": [
|
||
" Country CustomerID ... UnitPrice Unnamed: 0\n",
|
||
"1000 United Kingdom 14729.0 ... 1.25 1000\n",
|
||
"1001 United Kingdom 14729.0 ... 1.25 1001\n",
|
||
"1002 United Kingdom 14729.0 ... 1.25 1002\n",
|
||
"1003 United Kingdom 14729.0 ... 1.25 1003\n",
|
||
"1004 United Kingdom 14729.0 ... 0.29 1004\n",
|
||
"\n",
|
||
"[5 rows x 9 columns]"
|
||
]
|
||
},
|
||
"execution_count": 12,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"print(df['Country'].isin(['Germany', 'United States']))\n",
|
||
"df[df['Country'].isin(['Germany', 'United Kingdom'])].head(5)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"We can also combine boolean vectors to further filter the data frame."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 13,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Country</th>\n",
|
||
" <th>CustomerID</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" <th>Unnamed: 0</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>\n",
|
||
"<p>0 rows × 9 columns</p>"
|
||
],
|
||
"text/plain": [
|
||
"Empty DataFrame\n",
|
||
"Columns: [Country, CustomerID, Description, InvoiceDate, InvoiceNo, Quantity, StockCode, UnitPrice, Unnamed: 0]\n",
|
||
"Index: []\n",
|
||
"\n",
|
||
"[0 rows x 9 columns]"
|
||
]
|
||
},
|
||
"execution_count": 13,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df[(df['Country']=='Germany') & (df['Quantity']>90)]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Using this example, let see how eland translates this boolean filter to an elasticsearch `bool` query."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 14,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"index_pattern: online-retail\n",
|
||
"Index:\n",
|
||
" index_field: _id\n",
|
||
" is_source_field: False\n",
|
||
"Mappings:\n",
|
||
" capabilities:\n",
|
||
" _source es_dtype pd_dtype searchable aggregatable\n",
|
||
"Country True keyword object True True\n",
|
||
"CustomerID True double float64 True True\n",
|
||
"Description True keyword object True True\n",
|
||
"InvoiceDate True keyword object True True\n",
|
||
"InvoiceNo True keyword object True True\n",
|
||
"Quantity True long int64 True True\n",
|
||
"StockCode True keyword object True True\n",
|
||
"UnitPrice True double float64 True True\n",
|
||
"Unnamed: 0 True long int64 True True\n",
|
||
" date_fields_format: {}\n",
|
||
"Operations:\n",
|
||
" tasks: [('boolean_filter': ('boolean_filter': {'bool': {'must': [{'term': {'Country': 'Germany'}}, {'range': {'Quantity': {'gt': 90}}}]}}))]\n",
|
||
" size: None\n",
|
||
" sort_params: None\n",
|
||
" _source: None\n",
|
||
" body: {'query': {'bool': {'must': [{'term': {'Country': 'Germany'}}, {'range': {'Quantity': {'gt': 90}}}]}}}\n",
|
||
" post_processing: []\n",
|
||
"'field_to_display_names': {}\n",
|
||
"'display_to_field_names': {}\n",
|
||
"\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(df[(df['Country']=='Germany') & (df['Quantity']>90)].info_es())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Aggregation and Descriptive Statistics\n",
|
||
"\n",
|
||
"Let's begin to ask some questions of our data and use eland to get the answers."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"**How many different countries are there?**"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 15,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"16"
|
||
]
|
||
},
|
||
"execution_count": 15,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df['Country'].nunique()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"**What is the total sum of products ordered?**"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 16,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"111960.0"
|
||
]
|
||
},
|
||
"execution_count": 16,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df['Quantity'].sum()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"**Show me the sum, mean, min, and max of the qunatity and unit_price fields**"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 17,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Quantity</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>sum</th>\n",
|
||
" <td>111960.000</td>\n",
|
||
" <td>61548.490000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>mean</th>\n",
|
||
" <td>7.464</td>\n",
|
||
" <td>4.103233</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>max</th>\n",
|
||
" <td>2880.000</td>\n",
|
||
" <td>950.990000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>min</th>\n",
|
||
" <td>-9360.000</td>\n",
|
||
" <td>0.000000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Quantity UnitPrice\n",
|
||
"sum 111960.000 61548.490000\n",
|
||
"mean 7.464 4.103233\n",
|
||
"max 2880.000 950.990000\n",
|
||
"min -9360.000 0.000000"
|
||
]
|
||
},
|
||
"execution_count": 17,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df[['Quantity','UnitPrice']].agg(['sum', 'mean', 'max', 'min'])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"**Give me descriptive statistics for the entire data frame**"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 18,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>CustomerID</th>\n",
|
||
" <th>Quantity</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" <th>Unnamed: 0</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>count</th>\n",
|
||
" <td>10729.000000</td>\n",
|
||
" <td>15000.000000</td>\n",
|
||
" <td>15000.000000</td>\n",
|
||
" <td>15000.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>mean</th>\n",
|
||
" <td>15590.776680</td>\n",
|
||
" <td>7.464000</td>\n",
|
||
" <td>4.103233</td>\n",
|
||
" <td>7499.500000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>std</th>\n",
|
||
" <td>1764.025160</td>\n",
|
||
" <td>85.924387</td>\n",
|
||
" <td>20.104873</td>\n",
|
||
" <td>4330.127009</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>min</th>\n",
|
||
" <td>12347.000000</td>\n",
|
||
" <td>-9360.000000</td>\n",
|
||
" <td>0.000000</td>\n",
|
||
" <td>0.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>25%</th>\n",
|
||
" <td>14230.447617</td>\n",
|
||
" <td>1.000000</td>\n",
|
||
" <td>1.250445</td>\n",
|
||
" <td>3756.500000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>50%</th>\n",
|
||
" <td>15662.447503</td>\n",
|
||
" <td>2.000000</td>\n",
|
||
" <td>2.510000</td>\n",
|
||
" <td>7495.114387</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>75%</th>\n",
|
||
" <td>17213.464194</td>\n",
|
||
" <td>6.608187</td>\n",
|
||
" <td>4.210000</td>\n",
|
||
" <td>11249.500000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>max</th>\n",
|
||
" <td>18239.000000</td>\n",
|
||
" <td>2880.000000</td>\n",
|
||
" <td>950.990000</td>\n",
|
||
" <td>14999.000000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" CustomerID Quantity UnitPrice Unnamed: 0\n",
|
||
"count 10729.000000 15000.000000 15000.000000 15000.000000\n",
|
||
"mean 15590.776680 7.464000 4.103233 7499.500000\n",
|
||
"std 1764.025160 85.924387 20.104873 4330.127009\n",
|
||
"min 12347.000000 -9360.000000 0.000000 0.000000\n",
|
||
"25% 14230.447617 1.000000 1.250445 3756.500000\n",
|
||
"50% 15662.447503 2.000000 2.510000 7495.114387\n",
|
||
"75% 17213.464194 6.608187 4.210000 11249.500000\n",
|
||
"max 18239.000000 2880.000000 950.990000 14999.000000"
|
||
]
|
||
},
|
||
"execution_count": 18,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# NBVAL_IGNORE_OUTPUT\n",
|
||
"df.describe()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"**Show me a histogram of numeric columns**"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 19,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAtUAAAEICAYAAACQ+wgHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8li6FKAAAgAElEQVR4nO3df7RfdX3n++er4A9ES/jRHjEwDV1k7EIZLZMF9Dp37hmxENRpmLuUoUPH4KUrt2thqy13WrBzL63KWtiRoUirnUyhjTYjUKoNt1o1Rc/qdGZARa0I6CWFIEkDKAnUSKXGed8/vp+jX2JOzo/9zTnnu/N8rHXW2fuzP3t/P+/s73efd/b3sz+fVBWSJEmSFu6HlroBkiRJ0rgzqZYkSZI6MqmWJEmSOjKpliRJkjoyqZYkSZI6MqmWJEmSOjKploYk2Zvkx5e6HZIkSPJ7Sf7vER7v4iSfHNXxpGEm1VpySS5Jck+Sp5M8muR9SY5ZhNedSvLzw2VV9cKqerBt/8Mk7zrU7ZCkvkpSSU7dr+w3kvzRXPavql+oqne2/SaT7DjAsb7Tbog8meS/J/mpgxxvc1Wdu5BYpNmYVGtJJbkceDfw74BjgLOBVcAnkzxnCZsmSRoPt1TVC4EfAf4K+HCS7F8pyZGL3jIdVkyqtWSS/DDwm8AvVtXHq+o7VbUduBD4ceDf7H+3eP87FUmuSPI3Sb6Z5L4k/2po2yVJ/irJe5LsSfJQkvPbtquB/xX4nXaH43daeSU5NckG4GLgV9v2/zfJv0vyJ/vF8N4k1x+qfyNJ6rPpa3qSy5M8nmRXkjcPbf/DJO9KcjTw58BL2jV5b5KXDB+rqr4DbAJeDBzf/gb8tyTXJXkC+I3pvwtDx39Zkq1Jdid5LMnbW/kPDf19eSLJrUmOW4x/E40vk2otpf8FeD7w4eHCqtoLfAyYy1d0f8MgOT6GQYL+R0lOHNp+FvBV4ATgt4Abk6Sqfh34r8BbWpePt+zXho3AZuC32vZ/CfwRsDbJCvjeXY+LgA/ML2xJ0pAXM7iGrwQuBX43ybHDFarqW8D5wN+2a/ILq+pvh+skeR5wCfBIVX2jFZ8FPAhMAFfvV/9FwF8AHwdeApwK3NE2/yJwAfC/tW17gN8dRbDqL5NqLaUTgG9U1b4DbNvF4Ku8g6qqP66qv62q/1lVtwAPAGcOVXm4qv5zVX2XwR2MExlcXOetqnYBfwm8sRWtbe2/eyHHkyQB8B3gHe3byo8Be4GXzmP/C5M8CTwC/FPgXw1t+9uquqGq9lXV3++33+uBR6vq2qr6dlV9s6ruatt+Afj1qtpRVc8AvwG8wS4kOhiTai2lbwAnzHCROrFtP6gkb0ryxfaAypPAyxkk69MenV6oqqfb4gs7tHkT8HNt+eeAD3Y4liT13XeB/Z+PeQ6DRHraE/vdXHma+V2nb62qFVX1o1X16v1udDxykP1OZvBt54H8GPCRob8t9zOIZUE3ZXR4MKnWUvofwDPA/z5cmOSFDL7mmwK+BbxgaPOLh+r9GPCfgbcAx1fVCuDLwA88oDKDWsD2PwX+SZKXM7jLsXmOryVJh6OvMXj4fNgpwMMLONZs1+z57vMIg+d3Ztp2fkvWp3+eX1U7F9AGHSZMqrVkquopBv2gb0iyNslzkqwCbmVwl3oz8EXgtUmOS/Ji4G1DhziawQXz6wDt4ZaXz6MJjzHzBfWA26vq28BtwH8BPlNVX5vH60nS4eYW4N8nOak9/Pca4F8yuI7O12MMHkAc1ZCrfwacmORtSZ6X5EVJzmrbfg+4ut28IcmPJFk3otdVT5lUa0lV1W8BbwfeA3wTeIjBnenXtAdTPgj8NbAd+CSDC/T0vvcB1zK44/0YcDrw3+bx8tcz6CO3J8l7D7D9RuC09vXfnw6Vb2qvZdcPSTq4dwD/ncFQd3sYPDB+cVV9eb4HqqqvAB8CHmzX5ZfMts8sx/
sm8NMMkvxHGTyT8y/a5uuB2xkM7/pN4E4GDz1KM0rVQr5NkQ6Ndrf5HcCrlutd4CT/CPgK8OKq+rulbo8kSVp6PsWqZaWq/iDJPgbD7S27pDrJDwG/AtxsQi1JkqZ5p1qaozb5wGMMHrBZW1UHe6pckiQdRkyqJUmSpI58UFGSJEnqaFn3qT7hhBNq1apVS92MZ/nWt77F0UcfvdTNGLm+xgX9ja2vccHyi+3uu+/+RlXNOsOnFmYh1/rl9h4ZtT7H1+fYoN/x9Tk26H6tX9ZJ9apVq/jc5z631M14lqmpKSYnJ5e6GSPX17igv7H1NS5YfrElWchEFZqjhVzrl9t7ZNT6HF+fY4N+x9fn2KD7td7uH5IkSVJHJtWSJElSRybVkiRJUkcm1ZIkSVJHJtWSJElSRybVkiRJUkcm1ZIkSVJHJtWSJElSRybVkiRJUkfLekZFaRTu2fkUl1zx0YPW2X7N6xapNZJGwc+1pOXGO9WSJElSRybVkiRJUkcm1ZIkSVJHJtWSJElSRybVkiRJUkcm1ZIkSVJHc0qqk/xyknuTfDnJh5I8P8kpSe5Ksi3JLUme2+o+r61va9tXDR3nylb+1STnHZqQJEmSpMU1a1KdZCXwS8Caqno5cARwEfBu4LqqOhXYA1zadrkU2NPKr2v1SHJa2+9lwFrgfUmOGG04kiRJ0uKba/ePI4GjkhwJvADYBbwauK1t3wRc0JbXtXXa9nOSpJXfXFXPVNVDwDbgzO4hSJIkSUtr1hkVq2pnkvcAXwP+HvgkcDfwZFXta9V2ACvb8krgkbbvviRPAce38juHDj28z/ck2QBsAJiYmGBqamr+UR1Ce/fuXXZtGoW+xgUwcRRcfvq+g9YZx9j7fM76HJskqZ9mTaqTHMvgLvMpwJPAHzPovnFIVNVGYCPAmjVranJy8lC91IJMTU2x3No0Cn2NC+CGzVu49p6Dv9W3Xzy5OI0ZoT6fsz7HJknqp7l0/3gN8FBVfb2qvgN8GHgVsKJ1BwE4CdjZlncCJwO07ccATwyXH2AfSZIkaWzNJan+GnB2khe0vtHnAPcBnwbe0OqsB7a05dvbOm37p6qqWvlFbXSQU4DVwGdGE4YkSZK0dObSp/quJLcBnwf2AV9g0D3jo8DNSd7Vym5su9wIfDDJNmA3gxE/qKp7k9zKICHfB1xWVd8dcTySJEnSops1qQaoqquAq/YrfpADjN5RVd8G3jjDca4Grp5nGyVJkqRlzRkVJUmSpI5MqiVJkqSOTKolSZKkjkyqJUmSpI5MqiVJJPnlJPcm+XKSDyV5fpJTktyVZFuSW5I8t9V9Xlvf1ravGjrOla38q0nOW6p4JGmxmVRL0mEuyUrgl4A1VfVy4AgGw6G+G7iuqk4F9gCXtl0uBfa08utaPZKc1vZ7GYOZd9+X5IjFjEWSlopJtSQJBkOsHtVmwn0BsAt4NXBb274JuKAtr2vrtO3ntMnB1gE3V9UzVfUQsI0DDL0qSX00p3GqJUn9VVU7k7yHwQy6fw98ErgbeLKq9rVqO4CVbXkl8Ejbd1+Sp4DjW/mdQ4ce3udZkmwANgBMTEwwNTU1rzZPHAWXn77voHXme8zlZO/evWPd/oPpc2zQ7/j6HNsomFRL0mEuybEM7jKfAjwJ/DGD7huHTFVtZDA7L2vWrKnJycl57X/D5i1ce8/B/4Rtv3h+x1xOpqammO+/ybjoc2zQ7/j6HNso2P1DkvQa4KGq+npVfQf4MPAqYEXrDgJwErCzLe8ETgZo248BnhguP8A+ktRrJtWSpK8BZyd5QesbfQ5wH/Bp4A2tznpgS1u+va3Ttn+qqqqVX9RGBzkFWA18ZpFikKQlZfcPSTrMVdVdSW4DPg/sA77AoGvGR4Gbk7yrld3YdrkR+GCSbcBuBiN+UFX3JrmVQUK+D7isqr67qMFI0hIxqZYkUVVXAVftV/wgBxi9o6q+DbxxhuNcDVw98gZK0jJn9w9JkiSpI5NqSZIkqa
NZk+okL03yxaGfv0vytiTHJdma5IH2+9hWP0ne26ap/VKSM4aOtb7VfyDJ+plfVZIkSRofsybVVfXVqnplVb0S+KfA08BHgCuAO6pqNXBHWwc4n8ET36sZDOz/foAkxzHor3cWgz56V00n4pIkSdI4m2/3j3OAv6mqh3n2NLX7T1/7gRq4k8E4pycC5wFbq2p3Ve0BtnKIJxeQJEmSFsN8k+qLgA+15Ymq2tWWHwUm2vL3pq9tpqepnalckiRJGmtzHlIvyXOBnwGu3H9bVVWSGkWDkmxg0G2EiYmJZTfHfF/nve9rXAATR8Hlp+87aJ1xjL3P56zPsUmS+mk+41SfD3y+qh5r648lObGqdrXuHY+38pmmqd0JTO5XPrX/i1TVRgaTDrBmzZpabnPM93Xe+77GBXDD5i1ce8/B3+rbL55cnMaMUJ/PWZ9jkyT103y6f/ws3+/6Ac+epnb/6Wvf1EYBORt4qnUT+QRwbpJj2wOK57YySZIkaazN6U51kqOBnwb+z6Hia4Bbk1wKPAxc2Mo/BrwW2MZgpJA3A1TV7iTvBD7b6r2jqnZ3jkCSJElaYnNKqqvqW8Dx+5U9wWA0kP3rFnDZDMe5Cbhp/s2UJEmSli9nVJQkSZI6MqmWJEmSOjKpliRJkjoyqZYkSZI6MqmWJEmSOjKpliRJkjoyqZYkSZI6MqmWJEmSOjKpliRJkjoyqZYkSZI6MqmWJEmSOjKpliRJkjoyqZYkSZI6MqmWJEmSOjKpliRJkjoyqZYkSZI6mlNSnWRFktuSfCXJ/Ul+KslxSbYmeaD9PrbVTZL3JtmW5EtJzhg6zvpW/4Ek6w9VUJIkSdJimuud6uuBj1fVTwCvAO4HrgDuqKrVwB1tHeB8YHX72QC8HyDJccBVwFnAmcBV04m4JEmSNM5mTaqTHAP8c+BGgKr6h6p6ElgHbGrVNgEXtOV1wAdq4E5gRZITgfOArVW1u6r2AFuBtSONRpIkSVoCR86hzinA14E/SPIK4G7grcBEVe1qdR4FJtrySuCRof13tLKZyp8lyQYGd7iZmJhgampqrrEsir179y67No1CX+MCmDgKLj9930HrjGPsfT5nfY5NktRPc0mqjwTOAH6xqu5Kcj3f7+oBQFVVkhpFg6pqI7ARYM2aNTU5OTmKw47M1NQUy61No9DXuABu2LyFa+85+Ft9+8WTi9OYEerzOetzbJKkfppLn+odwI6ququt38YgyX6sdeug/X68bd8JnDy0/0mtbKZySZIkaazNmlRX1aPAI0le2orOAe4DbgemR/BYD2xpy7cDb2qjgJwNPNW6iXwCODfJse0BxXNbmSRJkjTW5tL9A+AXgc1Jngs8CLyZQUJ+a5JLgYeBC1vdjwGvBbYBT7e6VNXuJO8EPtvqvaOqdo8kCkmSJGkJzSmprqovAmsOsOmcA9Qt4LIZjnMTcNN8GihJkiQtd86oKEmSJHVkUi1JcuZcSerIpFqSBM6cK0mdmFRL0mHOmXMlqbu5jv4hSeqvRZ05F7rPntvXmVKn9XlW0T7HBv2Or8+xjYJJtSRpUWfObcfrNHtuX2dKndbnWUX7HBv0O74+xzYKdv+QJDlzriR1ZFItSYc5Z86VpO7s/iFJAmfOlaROTKolSc6cK0kd2f1DkiRJ6sikWpIkSerIpFqSJEnqyKRakiRJ6sikWpIkSerIpFqSJEnqaE5JdZLtSe5J8sUkn2tlxyXZmuSB9vvYVp4k702yLcmXkpwxdJz1rf4DSdbP9HqSJEnSOJnPnep/UVWvrKrpcUyvAO6oqtXAHW0d4HxgdfvZALwfBkk4cBVwFnAmcNV0Ii5JkiSNsy7dP9YBm9ryJuCCofIP1MCdwIokJwLnAVurandV7QG2Ams7vL4kSZK0LMx1RsUCPpmkgP9UVRuBiara1bY/Cky05ZXAI0P77mhlM5U/S5INDO5wMzExwdTU1BybuDj27t277No0Cn2NC2DiKLj89H0HrTOOsff5nP
U5NklSP801qf5nVbUzyY8CW5N8ZXhjVVVLuDtrCftGgDVr1tTk5OQoDjsyU1NTLLc2jUJf4wK4YfMWrr3n4G/17RdPLk5jRqjP56zPsUmS+mlO3T+qamf7/TjwEQZ9oh9r3Tpovx9v1XcCJw/tflIrm6lckiRJGmuzJtVJjk7youll4Fzgy8DtwPQIHuuBLW35duBNbRSQs4GnWjeRTwDnJjm2PaB4biuTJEmSxtpcun9MAB9JMl3/v1TVx5N8Frg1yaXAw8CFrf7HgNcC24CngTcDVNXuJO8EPtvqvaOqdo8sEkmSJGmJzJpUV9WDwCsOUP4EcM4Bygu4bIZj3QTcNP9mSpIkScuXMypKkiRJHZlUS5IkSR2ZVEuSJEkdmVRLkiRJHZlUS5IkSR2ZVEuSJEkdmVRLkiRJHZlUS5IkSR2ZVEuSJEkdmVRLkiRJHZlUS5IkSR2ZVEuSJEkdmVRLkiRJHZlUS5IkSR2ZVEuSJEkdzTmpTnJEki8k+bO2fkqSu5JsS3JLkue28ue19W1t+6qhY1zZyr+a5LxRByNJkiQthfncqX4rcP/Q+ruB66rqVGAPcGkrvxTY08qva/VIchpwEfAyYC3wviRHdGu+JEmStPTmlFQnOQl4HfD7bT3Aq4HbWpVNwAVteV1bp20/p9VfB9xcVc9U1UPANuDMUQQhSZIkLaUj51jvt4FfBV7U1o8HnqyqfW19B7CyLa8EHgGoqn1Jnmr1VwJ3Dh1zeJ/vSbIB2AAwMTHB1NTUXGNZFHv37l12bRqFvsYFMHEUXH76voPWGcfY+3zO+hybJKmfZk2qk7weeLyq7k4yeagbVFUbgY0Aa9asqcnJQ/6S8zI1NcVya9Mo9DUugBs2b+Haew7+Vt9+8eTiNGaE+nzO+hybJKmf5nKn+lXAzyR5LfB84IeB64EVSY5sd6tPAna2+juBk4EdSY4EjgGeGCqfNryPJEmSNLZm7VNdVVdW1UlVtYrBg4afqqqLgU8Db2jV1gNb2vLtbZ22/VNVVa38ojY6yCnAauAzI4tEkiRJWiJdxqn+NeBXkmxj0Gf6xlZ+I3B8K/8V4AqAqroXuBW4D/g4cFlVfbfD60uSRsihUyVp4eb6oCIAVTUFTLXlBznA6B1V9W3gjTPsfzVw9XwbKUlaFNNDp/5wW58eOvXmJL/HYMjU9zM0dGqSi1q9f73f0KkvAf4iyT/2Boqkw4EzKkqSHDpVkjqa151qSVJvLdrQqdB9+NS+DpU5rc/DSvY5Nuh3fH2ObRRMqiXpMLfYQ6dC9+FT+zpU5rQ+DyvZ59ig3/H1ObZRMKmWJDl0qiR1ZJ9qSTrMOXSqJHXnnWpJ0kx+Dbg5ybuAL/DsoVM/2IZO3c0gEaeq7k0yPXTqPhw6VdJhxKRakvQ9Dp0qSQtj9w9JkiSpI5NqSZIkqSOTakmSJKkj+1RLknpp1RUfnbXO9mtetwgtkXQ48E61JEmS1JFJtSRJktSRSbUkSZLUkUm1JEmS1JFJtSRJktTRrEl1kucn+UySv05yb5LfbOWnJLkrybYktyR5bit/Xlvf1ravGjrWla38q0nOO1RBSZIkSYtpLneqnwFeXVWvAF4JrE1yNvBu4LqqOhXYA1za6l8K7Gnl17V6JDkNuAh4GbAWeF+SI0YZjCRJkrQUZk2qa2BvW31O+yng1cBtrXwTcEFbXtfWadvPSZJWfnNVPVNVDwHbgDNHEoUkSZK0hOY0+Uu7o3w3cCrwu8DfAE9W1b5WZQewsi2vBB4BqKp9SZ4Cjm/ldw4ddnif4dfaAGwAmJiYYGpqan4RHWJ79+5ddm0ahb7GBTBxFFx++r6D1hnH2Pt8zvocmySpn+aUVFfVd4FXJlkBfAT4iUPVoKraCGwEWLNmTU1OTh6ql1qQqakpllubRqGvcQHcsHkL195z8Lf69osnF6cxI9Tnc9bn2CRJ/TSv0T+q6kng08BPASuSTGcqJwE72/JO4GSAtv0Y4Inh8gPsI0mSJI
2tuYz+8SPtDjVJjgJ+GrifQXL9hlZtPbClLd/e1mnbP1VV1covaqODnAKsBj4zqkAkSZKkpTKX7h8nAptav+ofAm6tqj9Lch9wc5J3AV8Abmz1bwQ+mGQbsJvBiB9U1b1JbgXuA/YBl7VuJZIkSdJYmzWprqovAT95gPIHOcDoHVX1beCNMxzrauDq+TdTkiRJWr6cUVGSJEnqyKRakiRJ6sikWpIkSerIpFqSJEnqyKRakiRJ6sikWpIkSepoTtOUS3236oqPzlpn+zWvW4SWSJKkceSdakmSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpo1mT6iQnJ/l0kvuS3Jvkra38uCRbkzzQfh/bypPkvUm2JflSkjOGjrW+1X8gyfpDF5YkSZK0eOZyp3ofcHlVnQacDVyW5DTgCuCOqloN3NHWAc4HVrefDcD7YZCEA1cBZwFnAldNJ+KSJEnSOJs1qa6qXVX1+bb8TeB+YCWwDtjUqm0CLmjL64AP1MCdwIokJwLnAVurandV7QG2AmtHGo0kSZK0BObVpzrJKuAngbuAiara1TY9Cky05ZXAI0O77WhlM5VLkiRJY+3IuVZM8kLgT4C3VdXfJfnetqqqJDWKBiXZwKDbCBMTE0xNTY3isCOzd+/eZdemUehrXAATR8Hlp+/rfJzl9u/T53PW59iWoyQnAx9gcHOkgI1VdX3rtncLsArYDlxYVXsy+ANwPfBa4GngkulvNNvzMv++HfpdVbUJSToMzCmpTvIcBgn15qr6cCt+LMmJVbWrde94vJXvBE4e2v2kVrYTmNyvfGr/16qqjcBGgDVr1tTk5OT+VZbU1NQUy61No9DXuABu2LyFa++Z8/8fZ7T94snujRmhPp+zPse2TE0/O/P5JC8C7k6yFbiEwbMz1yS5gsGzM7/Gs5+dOYvBszNnDT07s4ZBcn53kttblz9J6rW5jP4R4Ebg/qr6j0ObbgemR/BYD2wZKn9TGwXkbOCp1k3kE8C5SY5tDyie28okSUvIZ2ckqbu53L57FfBvgXuSfLGVvR24Brg1yaXAw8CFbdvHGHwluI3B14JvBqiq3UneCXy21XtHVe0eSRSSpJFYrGdnunb162u3rml97gLV59ig3/H1ObZRmDWprqq/AjLD5nMOUL+Ay2Y41k3ATfNpoCRpcSzWszPteJ26+vW1W9e0PneB6nNs0O/4+hzbKDijoiTpoM/OtO1zfXbmQOWS1Hsm1ZJ0mPPZGUnqrvt3Z5IAWHXFR2ets/2a1y1CS6R589kZSerIpFqSDnM+OyNJ3dn9Q5IkSerIpFqSJEnqyKRakiRJ6sikWpIkSerIpFqSJEnqyKRakiRJ6sikWpIkSerIpFqSJEnqyKRakiRJ6sikWpIkSerIpFqSJEnqyKRakiRJ6mjWpDrJTUkeT/LlobLjkmxN8kD7fWwrT5L3JtmW5EtJzhjaZ32r/0CS9YcmHEmSJGnxzeVO9R8Ca/cruwK4o6pWA3e0dYDzgdXtZwPwfhgk4cBVwFnAmcBV04m4JEmSNO5mTaqr6i+B3fsVrwM2teVNwAVD5R+ogTuBFUlOBM4DtlbV7qraA2zlBxN1SZIkaSwducD9JqpqV1t+FJhoyyuBR4bq7WhlM5X/gCQbGNzlZmJigqmpqQU28dDYu3fvsmvTKPQ1LoCJo+Dy0/d1Ps5s/z5zeY1R/hv3+Zz1OTYtL6uu+OhBt2+/5nWL1BJJ426hSfX3VFUlqVE0ph1vI7ARYM2aNTU5OTmqQ4/E1NQUy61No9DXuABu2LyFa+/p/FZn+8WTB91+ySx/nOdyjPno8znrc2ySpH5a6Ogfj7VuHbTfj7fyncDJQ/VOamUzlUuSJEljb6FJ9e3A9Age64EtQ+VvaqOAnA081bqJfAI4N8mx7QHFc1uZJEmSNPZm/U48yYeASeCEJDsYjOJxDXBrkkuBh4ELW/WPAa8Ftg
FPA28GqKrdSd4JfLbVe0dV7f/woyRJkjSWZk2qq+pnZ9h0zgHqFnDZDMe5CbhpXq2TJEmSxoAzKkqSJEkdmVRLkiRJHZlUS5IkSR2ZVEuSJEkddZ8RQ9Ky5ExxkiQtHu9US5IkSR15p1qao9nu/EqSpMOXd6olSZKkjkyqJUmSpI5MqiVJkqSO7FMtSdIM5vIshSPpSALvVEuSJEmdmVRLkiRJHdn9Q5KkDuwiIgm8Uy1JkiR15p1qSb3hHUNJ0lJZ9KQ6yVrgeuAI4Per6prFboP6Yy5J1OWnL0JD5uhwTfoO17gPV17nf5CfAan/FjWpTnIE8LvATwM7gM8mub2q7lvMdkjL2VynQ7/89H1ccoinTjcR0Hx5nZd0uFrsO9VnAtuq6kGAJDcD6wAvtvoBc00utTDL7d93uD0z/YfBBH4seJ1foFF9Bkbx2fazJs1fqmrxXix5A7C2qn6+rf9b4KyqestQnQ3Ahrb6UuCri9bAuTkB+MZSN+IQ6Gtc0N/Y+hoXLL/YfqyqfmSpGzEO5nKdb+Vdr/XL7T0yan2Or8+xQb/j63NsAC+tqhctdOdl96BiVW0ENi51O2aS5HNVtWap2zFqfY0L+htbX+OCfsemga7X+r6/R/ocX59jg37H1+fYYBBfl/0Xe0i9ncDJQ+sntTJJUj94nZd0WFrspPqzwOokpyR5LnARcPsit0GSdOh4nZd0WFrU7h9VtS/JW4BPMBhq6aaquncx2zACy7ZrSkd9jQv6G1tf44J+x9Zri3id7/t7pM/x9Tk26Hd8fY4NOsa3qA8qSpIkSX3kNOWSJElSRybVkiRJUkcm1fOQ5PIkleSEtp4k702yLcmXkpyx1G2cryT/IclXWvs/kmTF0LYrW2xfTXLeUrZzIZKsbW3fluSKpW5PF0lOTvLpJPcluTfJW1v5cUm2Jnmg/T52qdu6EEmOSPKFJH/W1k9Jclc7d7e0B94kwM/2uOnz5zvJiiS3tb+j9yf5qZ6du19u78svJ/lQkueP8/lLclOSx5N8eajsgOdrITmeSfUcJTkZOBf42lDx+cDq9rMBeP8SNK2rrcDLq+qfAP8fcCVAktMYPLX/MmAt8L4Mph8eC/n+VMnnA6cBP9tiGlf7gLxZ1MMAAAQZSURBVMur6jTgbOCyFs8VwB1VtRq4o62Po7cC9w+tvxu4rqpOBfYAly5Jq7Ts+NkeS33+fF8PfLyqfgJ4BYM4e3HukqwEfglYU1UvZ/Dg8UWM9/n7QwY5zbCZzte8czyT6rm7DvhVYPjJznXAB2rgTmBFkhOXpHULVFWfrKp9bfVOBmPKwiC2m6vqmap6CNjGYPrhcfG9qZKr6h+A6amSx1JV7aqqz7flbzK4cK9kENOmVm0TcMHStHDhkpwEvA74/bYe4NXAba3KWMalQ8bP9hjp8+c7yTHAPwduBKiqf6iqJ+nJuWuOBI5KciTwAmAXY3z+quovgd37Fc90vuad45lUz0GSdcDOqvrr/TatBB4ZWt/RysbV/wH8eVse99jGvf0zSrIK+EngLmCiqna1TY8CE0vUrC5+m8F/WP9nWz8eeHLoP3u9OXcaCT/b46XPn+9TgK8Df9C6t/x+kqPpybmrqp3Aexh8Q78LeAq4m/6cv2kzna95X2tMqpskf9H6DO3/sw54O/D/LHUbF2qW2Kbr/DqDryE3L11LNZskLwT+BHhbVf3d8LYajI85VmNkJnk98HhV3b3UbZGWUt8+23BYfL6PBM4A3l9VPwl8i/26eozruQNofYvXMfjPw0uAo/nBrhO90vV8LerkL8tZVb3mQOVJTmfwhvrrwbdWnAR8PsmZjMl0vDPFNi3JJcDrgXPq+wOXj0VsBzHu7f8BSZ7D4I/u5qr6cCt+LMmJVbWrfS31+NK1cEFeBfxMktcCzwd+mEEfxRVJjmx3Q8b+3Gmk/GyPj75/vncAO6rqrrZ+G4Okug/nDuA1wENV9XWAJB9mcE77cv6mzXS+5n2t8U71LK
rqnqr60apaVVWrGHyIzqiqRxlMvfum9oTo2cBTQ18hjIUkaxl8NfczVfX00KbbgYuSPC/JKQw66n9mKdq4QL2aKrn1Q7wRuL+q/uPQptuB9W15PbBlsdvWRVVdWVUntc/WRcCnqupi4NPAG1q1sYtLh5Sf7THR9893ywMeSfLSVnQOcB89OHfN14Czk7ygvU+n4+vF+Rsy0/mad47njIrzlGQ7gydhv9HeZL/D4OuQp4E3V9XnlrJ985VkG/A84IlWdGdV/ULb9usM+lnvY/CV5J8f+CjLU7s78tt8f6rkq5e4SQuW5J8B/xW4h+/3TXw7g76XtwL/CHgYuLCq9n8IYywkmQT+r6p6fZIfZ/AA2nHAF4Cfq6pnlrJ9Wj78bI+fvn6+k7ySwUOYzwUeBN7M4IZlL85dkt8E/jWDPOALwM8z6Fc8lucvyYeASeAE4DHgKuBPOcD5WkiOZ1ItSZIkdWT3D0mSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKmj/x/zpFkhZWjSNQAAAABJRU5ErkJggg==\n",
|
||
"text/plain": [
|
||
"<Figure size 864x288 with 2 Axes>"
|
||
]
|
||
},
|
||
"metadata": {
|
||
"needs_background": "light"
|
||
},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"df[(df['Quantity']>-50) & \n",
|
||
" (df['Quantity']<50) & \n",
|
||
" (df['UnitPrice']>0) & \n",
|
||
" (df['UnitPrice']<100)][['Quantity', 'UnitPrice']].hist(figsize=[12,4], bins=30)\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 20,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAs8AAAEICAYAAACgdxkmAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8li6FKAAAeEElEQVR4nO3dfbRdd13n8feHPgA2mgLVUJpqiqmdqY0zyl0tjPOQKmhqG6ouhmmto2Vqs+qaOjqT0QngjOjYNdWB0ZYWWJHWgNaGTkVIaBDU8S58ACz1YVpaGWMJNqU0lIdICgLR7/xx9sXjJffefe85555z9n2/1upqzt777PP93X32737vd/9+e6eqkCRJkrS0p4w7AEmSJGlamDxLkiRJLZk8S5IkSS2ZPEuSJEktmTxLkiRJLZk8S5IkSS2ZPGtNSnIsyXPHHYckCZK8Icl/HeL+rkry7mHtT+pn8qxVk+TqJPcn+WySjyV5XZL1q/C5s0l+qH9ZVa2rqoeb9XuS/Oyo45CkrkpSSTbPW/aqJL/a5v1VdV1V/ffmfVuTHD7Bvr7YFD4+neQPk7xgkf3dUVXfsZK2SEsxedaqSLIT+Dngx4H1wPOBTcC7k5wyxtAkSdPhLVW1Dvhq4PeBtybJ/I2SnLzqkWlNMXnWyCX5KuCngR+pqt+sqi9W1SHgpcBzge+bX/2dX3lIsivJXyb5TJIHk3xP37qrk/x+klcn+VSSDye5pFl3A/AvgFuaisUtzfJKsjnJDuAq4Cea9fuT/HiSX5/XhpuT3DSqn5Ekddlcn55kZ5IjSR5L8rK+9XuS/GyS04B3As9p+uRjSZ7Tv6+q+iLwJuDZwLOa3wF/kOQXknwCeNXc74W+/X9jkt9K8skkjyd5RbP8KX2/Xz6R5K4kz1yNn4mml8mzVsM/A54GvLV/YVUdAw4AbS6t/SW9JHg9vUT8V5Oc2bf+IuBDwBnAzwO3JUlVvRL4PeD6ZqjG9fNi2A3cAfx8s3478KvAtiSnw5eqGFcAb15esyVJfZ5Nrw8/C7gGuDXJM/o3qKongUuAjzZ98rqq+mj/NkmeClwNPFJVTzSLLwIeBjYAN8zb/iuB3wZ+E3gOsBn4nWb1jwDfDfyrZt2ngFuH0Vh1l8mzVsMZwBNVdfwE6x6jdwluUVX1v6vqo1X1d1X1FuAvgAv7NvlIVf1SVf0tvYrEmfQ60WWrqseA9wD/ulm0rYn/vpXsT5IEwBeBn2muPh4AjgHnLeP9L03yaeAR4HnA9/St+2hVvbaqjlfV5+a97zLgY1X1mqr6m6r6TFW9v1l3HfDKqjpcVZ8HXgW8xKEfWozJs1bDE8AZC3RGZzbrF5XkB5L8aTNR5NPABfSS8jkfm/tHVX22+ee6AWJ+E/D9zb+/H/iVAfYlSV33t8D8+Sun0EuY53xiXhHlsyyvn76rqk6vqq+pqm+bV9B4ZJH3nU3v6uWJfB3wG32/Wx6i15YVFV+0Npg8azW8F/g88L39C5Oso3d5bhZ4EviKvtXP7tvu64BfAq4HnlVVpwMPAF82UWQBtYL1bwO+KckF9KoWd7T8LElai/6K3iTwfucAH1nBvpbqs5f7nkfoza9ZaN0lTVI+99/TqurRFcSgNcLkWSNXVUfpjVN+bZJtSU5Jsgm4i17V+Q7gT4HvSvLMJM8GfqxvF6fR6xg/DtBMMrlgGSE8zsId5wnXV9XfAHcDvwb8UVX91TI+T5LWmrcAP5lkYzMJ74XAdnr96HI9Tm8i4LBuZfoO4MwkP5bkqUm+MslFzbo3ADc0RRqSfHWSy4f0ueook2etiqr6eeAVwKuBzwAfpldpfmEzQeRXgD8DDgHvptcRz733QeA19CrYjwNbgD9YxsffRG8M26eS3HyC9bcB5zeX7d7Wt/xNzWc5ZEOSFvczwB/Su4Xcp+hN3L6qqh5Y7o6q6s+BO4GHm375OUu9Z4n9fQZ4Eb1k/mP05s
xc3Ky+CdhH77apnwHeR2/yobSgVK3k6og0mKZ6/DPAt05qVTfJ1wJ/Djy7qv563PFIkqTxczapxqKqfjnJcXq3sZu45DnJU4D/BOw1cZYkSXOsPEvzNDfpf5zeRJdtVbXYLG5JkrSGmDxLkiRJLTlhUJIkSWppIsY8n3HGGbVp06Zxh/ElTz75JKeddtq4wxgJ2zZ9utoumLy23XfffU9U1ZJPvNTKrKSvn7TvyDB1uW3Q7fZ1uW3Q/fYN2tdPRPK8adMmPvCBD4w7jC+ZnZ1l69at4w5jJGzb9Olqu2Dy2pZkJQ90UEsr6esn7TsyTF1uG3S7fV1uG3S/fYP29Q7bkCRJkloaSfKc5LQkH0hy2Sj2L0mSJI1Dq+Q5ye1JjiR5YN7ybUk+lORgkl19q/4LvUcvS5IkSZ3RtvK8B9jWvyDJScCtwCXA+cCVSc5P8iLgQeDIEOOUJE0YrzJKWotaTRisqvck2TRv8YXAwap6GCDJXuByYB1wGr2E+nNJDlTV383fZ5IdwA6ADRs2MDs7u8ImDN+xY8cmKp5hsm3Tp6vtgm63bRoluR24DDhSVRf0Ld8G3AScBLyxqm5sVnmVUdKaM8jdNs4C+p+8dhi4qKquB0hyNfDEiRJngKraDewGmJmZqUma1dnlWaa2bfp0tV3Q7bZNqT3ALcCb5xb0XWV8Eb1+/t4k++j9DngQeNrqhylJ4zOyW9VV1Z5R7VuSNHyTeJWxy1cnutw26Hb7utw26H77BjVI8vwocHbf643NstaSbAe2b968eYAwJEkjNNarjF2+OtHltkG329fltkH32zeoQZLne4Fzk5xDL2m+Avi+5eygqvYD+2dmZq4dIA4JgPsfPcrVu+5ZcP2hGy9dxWiktaHNVcZBCiVLndfguS1pdbW9Vd2dwHuB85IcTnJNVR0HrgfeBTwE3FVVH1zOhyfZnmT30aNHlxu3JGl1DHyVsar2V9WO9evXDzUwSRqHtnfbuHKB5QeAAyv9cCvPkjTxBr7KKEld4uO5JUmAVxklqY2R3W2jDScMStLk8CqjJC1trJVnx8FJkiRpmjhsQ5I0Ug7bkNQlJs+SpJHyKqOkLhlr8mw1QpIkSdPEMc+SpJGyUCKpSxy2IUkaKQslkrrE5FmSJElqyTHPkiRJUkuOeZYkjZSFEkld4rANSdJIWSiR1CUmz5IkSVJLJs+SJElSS04YlCRJklpywqAkaaQslEjqEodtSJJGykKJpC45edwBSJI0iE277ll0/aEbL12lSCStBVaeJUmSpJZMniVJkqSWvNuGJEmS1JJ325AkjZSFEkld4rANSdJIWSiR1CUmz5IkSVJL3qpOa8ZSt7MCb2klSZIWZ+VZkiRJasnkWZIkSWrJ5FmSJElqyeRZkiRJasmHpEiSJEkt+ZAUSdJIWSiR1CUO25AkjZSFEkld4n2epWXyftGSJK1dVp4lSZKklkyeJUmSpJZMniVJkqSWTJ4lSZKklkyeJUmSpJZMniVJkqSWTJ4lSZKklkyeJUmSpJaGnjwn+cdJ3pDk7iQ/POz9S5IkSePSKnlOcnuSI0kemLd8W5IPJTmYZBdAVT1UVdcBLwW+dfghS5LGzUKJpLWq7eO59wC3AG+eW5DkJOBW4EXAYeDeJPuq6sEkLwZ+GPiV4YYrSRqVJLcDlwFHquqCvuXbgJuAk4A3VtWNVfUQcF2Sp9D73fD6ccTcxqZd9yy5zaEbL12FSCR1Qavkuarek2TTvMUXAger6mGAJHuBy4EHq2ofsC/JPcCvnWifSXYAOwA2bNjA7OzsSuIfiWPHjk1UPMPU5bZteDrs3HJ8oH20+dm0+Yxh/oy7fMy63LYptQcLJZK0qLaV5xM5C3ik7/Vh4KIkW4HvBZ4KHFjozVW1G9gNMDMzU1u3bh0glOGanZ1lkuIZpi637bV3vJ3X3D/IVxoOXbV1yW2ublPFarGftrp8zLrctmk0iYWSYfxR3MY4/ojr+h+PXW5fl9sG3W/foA
bLNE6gqmaB2TbbJtkObN+8efOww5AkDcdYCyXD+KO4jWH+wdtW1/947HL7utw26H77BjVIj/QocHbf643Nstaqaj+wf2Zm5toB4pAkrTILJZLWqkFuVXcvcG6Sc5KcClwB7BtOWJKkCTGUQklV7Vi/fv1QA5OkcWh7q7o7gfcC5yU5nOSaqjoOXA+8C3gIuKuqPricD0+yPcnuo0ePLjduSdLqsFAiSX1aJc9VdWVVnVlVp1TVxqq6rVl+oKq+oaq+vqpuWO6HW42QpMlhoUSSljb6WRiSpKlQVVcusPwAi0wKbLFf57dI6oyhP557OaxGSJIkaZqMNXl22IYkdZ+FEkldMtbkWZLUfRZKJHXJWMc8e+9PaTCb2jzt8MZLVyESSZLWBodtSJJGymEbkrrEYRuSpJGyUCKpS7xVndSnzTAISZK0dnmrOkmSJKklxzxLkkbKQomkLnHMsyRppCyUSOoSk2dJkiSpJZNnSZIkqSUnDEqSJEktOWFQkjRSFkokdYn3eZYkjVRV7Qf2z8zMXDvuWBbio+4lteWYZ0mSJKklk2dJkiSpJZNnSZIkqSXvtiFJkiS15N02JEkjZaFEUpc4bEOSNFIWSiR1icmzJEmS1JLJsyRJktSSD0mRJKkFH6QiCaw8S5IkSa2ZPEuSJEktmTxLkiRJLfmQFEmSJKklH5IiSRopCyWSusRhG5KkkbJQIqlLTJ4lSZKklkyeJUmSpJZMniVJkqSWfMKgpKnjk94kSeNi5VmSJElqyeRZkiRJasnkWZIkSWrJMc+SJA2J4/Gl7rPyLEmSJLU0kspzku8GLgW+Critqt49is+RNFmWqrpZcZMkTbvWyXOS24HLgCNVdUHf8m3ATcBJwBur6saqehvwtiTPAF4NmDxrTfHSrbrOIomktWo5lec9wC3Am+cWJDkJuBV4EXAYuDfJvqp6sNnkJ5v10kDaJKM7t6xCIFKHWSSRpKW1Tp6r6j1JNs1bfCFwsKoeBkiyF7g8yUPAjcA7q+qPT7S/JDuAHQAbNmxgdnZ22cGPyrFjxyYqnmGa1rbt3HJ8yW02PL3ddpOi7XFY7Ji1ae9qHu+l4pkfy0q/j5PW7g7Zg0USSVrUoGOezwIe6Xt9GLgI+BHghcD6JJur6g3z31hVu4HdADMzM7V169YBQxme2dlZJimeYZrWtl3dqvJ8nNfcPz03kDl01dZW2y12zNr8XNp+zjAsFc/8WFb6fZy0dnfFsIskzfYDFUqm7Y/iNuZ+BtNazGiry+3rctug++0b1Egyjaq6Gbh5qe2SbAe2b968eRRhSFNv06572LnleKtkURqRFRdJYPBCyWvvePtU/VHcxtwfdtNazGiry+3rctug++0b1KC3qnsUOLvv9cZmWStVtb+qdqxfv37AMCRJq6mqbq6q51XVdQslznOSbE+y++jRo6sVniSNzKDJ873AuUnOSXIqcAWwb/CwJEkTYqAiCVgokdQtrZPnJHcC7wXOS3I4yTVVdRy4HngX8BBwV1V9cBn7tBohSZPNIokk9WmdPFfVlVV1ZlWdUlUbq+q2ZvmBqvqGqvr6qrphOR9uNUKSJscoiiTNfi2USOqMbs3CkCStWFVducDyA8CBAfa7H9g/MzNz7Ur3IUmTYtAxzwOxGiFJkqRpMtbk2WEbktR9FkokdYnDNqQxafPIcakLHLYhqUvGmjz7kBRpMrRJ5A/deOkqRCJJ0mRz2IYkaaQctiGpS8aaPEuSus9CiaQuMXmWJEmSWvJWdZIkSVJLjnmWJI2UhRJJXeKwDUnSSFkokdQl3udZkqRVNHdryJ1bjnP1AreJ9NaQ0uSy8ixJkiS15IRBSZIkqSUnDEqSRspCiaQucdiGJGmkLJRI6hKTZ0mSJKkl77YhadVsmndngRPdbcC7DEhLm38unYjnkjQaVp4lSZKklqw8a+zaVFAkSZImgbeqkySNlH29pC7xVnWSpJGyr5fUJY55liRJkloyeZYkSZJaMnmWJE
mSWjJ5liRJkloyeZYkSZJaMnmWJEmSWvI+z5IkSVJL3udZkjRSFkokdYnDNiRJI2WhRFKXmDxLkiRJLZk8S5IkSS2ZPEuSJEktmTxLkiRJLZ087gAkSdLwbdp1z5LbHLrx0lWIROoWK8+SJElSSybPkiRJUksO25A6rs2lW0mS1I6VZ0mSJKmloVeekzwXeCWwvqpeMuz9S5Kk4XBSobR8rSrPSW5PciTJA/OWb0vyoSQHk+wCqKqHq+qaUQQrSZoMSZ6b5LYkd487FklaTW2HbewBtvUvSHIScCtwCXA+cGWS84canSRp1VgokaSltUqeq+o9wCfnLb4QONh0oF8A9gKXDzk+SdLq2YOFEklaVKqq3YbJJuAdVXVB8/olwLaq+qHm9b8FLgJ+CrgBeBHwxqr6HwvsbwewA2DDhg3P27t370ANGaZjx46xbt26cYcxEpPYtvsfPTqU/Wx4Ojz+uaHsaqJMSru2nLV+yW2WeyxP1LZhfU6b/cx38cUX31dVM8t+Y4ecoK9/AfCqqvrO5vXLAeb69iR3Lza/ZdC+/sgnj07E938UFju3l/r+DqvfbGMl5xJM5u+bYely26D77Ru0rx/6hMGq+gRwXYvtdgO7AWZmZmrr1q3DDmXFZmdnmaR4hmkS23b1kG6ltnPLcV5zf/fuvjgp7Tp01dYlt1nusTxR24b1OW32o1bOAh7pe30YuCjJs+gVSr45ycsXKpQM2te/9o63T8T3fxQWO7eX+v4Oq99sY6Xn0iT+vhmWLrcNut++QQ3SIz0KnN33emOzrLUk24HtmzdvHiAMjYuztNcW7xetfm0LJWBfP+3s66V/aJD7PN8LnJvknCSnAlcA+5azg6raX1U71q9f2SUhSdLIDVwosa+X1CWtKs9J7gS2AmckOQz8VFXdluR64F3AScDtVfXB5Xy41Yjus1opTb0vFUroJc1XAN833pAkaXza3m3jyqo6s6pOqaqNVXVbs/xAVX1DVX19Vd2w3A+3GiFJk6MplLwXOC/J4STXVNVxYK5Q8hBw10oKJUl2Hz26epPcJGlUujkLQ5K0bFV15QLLDwAHBtjvfmD/zMzMtSvdhyRNikHGPA/MaoQkSZKmyViTZ4dtSFL3WSiR1CVjTZ4lSd1noURSl5g8S5IkSS055lmSNFL29ZK6xDHPkqSRsq+X1CUO25AkSZJaMnmWJEmSWhrrQ1J8PPfytXnc9aEbL12FSCSpHft6SV3imGdJ0kjZ10vqEodtSJIkSS2ZPEuSJEktmTxLkiRJLTlhUJI0Uvb1GhYnzWsSOGFQkjRS9vWSusRhG5IkSVJLJs+SJElSSybPkiRJUksmz5IkSVJLU3u3jdWccevsXklaOe+2IalLvNuGJGmk7OsldYnDNiRJkqSWTJ4lSZKklkyeJUmSpJZMniVJkqSWTJ4lSZKklkyeJUmSpJZMniVJkqSWpvYhKZKk6WBf330nepjYzi3HubpZvpoPEvPBZho1H5IiSRop+3pJXeKwDUmSJKklk2dJkiSpJZNnSZIkqSWTZ0mSJKklk2dJkiSpJZNnSZIkqSWTZ0mSJKklk2dJkiSpJZNnSZIkqSWTZ0mSJKmlk4e9wySnAa8DvgDMVtUdw/4MSdJ42ddLWqtaVZ6T3J7kSJIH5i3fluRDSQ4m2dUs/l7g7qq6FnjxkOOVJI2Ifb0kLa3tsI09wLb+BUlOAm4FLgHOB65Mcj6wEXik2exvhxOmJGkV7MG+XpIWlapqt2GyCXhHVV3QvH4B8Kqq+s7m9cubTQ8Dn6qqdyTZW1VXLLC/HcAOgA0bNjxv7969ywr8/kePLrnNlrPWL2ufc44dO8a6detW5bOWq00si9nwdHj8c8OJd9BYhm2ubV3T1XbBidvW5rs5qnPy4osvvq+qZpb9xg6ZtL7+yCePrqnvf5cst33DOvdX47M2PB2+5pmr83t/WJbzs1vs2K1WvgNLx7zSWAbt6wcZ83wWf191gF5HehFwM3BLkk
uB/Qu9uap2A7sBZmZmauvWrcv68Kt33bPkNoeuWt4+58zOztIfzyg/a7naxLKYnVuO85r7Tx5KvIPGMmxzbeuarrYLTty2Nt/NSTon14Cx9vWvvePta+r73yXLbd+wzv3V+KydW47z0mV+l8dtOT+7xY7davatS8U8rn5+6GdtVT0JvKzNtkm2A9s3b9487DAkSSNkXy9prRrkVnWPAmf3vd7YLGutqvZX1Y7166fr0ockrSH29ZLUZ5Dk+V7g3CTnJDkVuALYN5ywJEkTwr5ekvq0vVXdncB7gfOSHE5yTVUdB64H3gU8BNxVVR9czocn2Z5k99GjkzXxTJLWIvt6SVpaqzHPVXXlAssPAAdW+uFVtR/YPzMzc+1K9yFJGg77eklamo/nliRJkloaa/LspTxJ6j77ekldMtbk2RnYktR99vWSuqT1EwZHGkTyceAj446jzxnAE+MOYkRs2/Tpartg8tr2dVX11eMOoqtW2NdP2ndkmLrcNuh2+7rcNuh++86rqq9c6Zsn4tFGk/bLKskHuvqIXts2fbraLuh22/TlVtLXd/k70uW2Qbfb1+W2wdpo3yDvd8KgJEmS1JLJsyRJktSSyfOJ7R53ACNk26ZPV9sF3W6bhqPL35Eutw263b4utw1s36ImYsKgJEmSNA2sPEuSJEktmTxLkiRJLZk8n0CSnUkqyRnN6yS5OcnBJP83ybeMO8blSvI/k/x5E/9vJDm9b93Lm7Z9KMl3jjPOlUiyrYn9YJJd445nEEnOTvK7SR5M8sEkP9osf2aS30ryF83/nzHuWFciyUlJ/iTJO5rX5yR5f3Ps3pLk1HHHqMnQpfMaun9uQ7fP7ySnJ7m7+T36UJIXdOXYJfmPzXfygSR3JnnaNB+7JLcnOZLkgb5lJzxWK83vTJ7nSXI28B3AX/UtvgQ4t/lvB/D6MYQ2qN8CLqiqbwL+H/BygCTnA1cA3whsA16X5KSxRblMTay30jtG5wNXNm2aVseBnVV1PvB84N837dkF/E5VnQv8TvN6Gv0o8FDf658DfqGqNgOfAq4ZS1SaKB08r6H75zZ0+/y+CfjNqvpHwD+h186pP3ZJzgL+AzBTVRcAJ9HLCab52O2hl8/0W+hYrSi/M3n+cr8A/ATQP5PycuDN1fM+4PQkZ44luhWqqndX1fHm5fuAjc2/Lwf2VtXnq+rDwEHgwnHEuEIXAger6uGq+gKwl16bplJVPVZVf9z8+zP0Ouiz6LXpTc1mbwK+ezwRrlySjcClwBub1wG+Dbi72WQq26WR6NR5Dd0+t6Hb53eS9cC/BG4DqKovVNWn6cixo/fAvKcnORn4CuAxpvjYVdV7gE/OW7zQsVpRfmfy3CfJ5cCjVfVn81adBTzS9/pws2xa/Tvgnc2/p71t0x7/gpJsAr4ZeD+woaoea1Z9DNgwprAG8Yv0/jD9u+b1s4BP9/1R15ljp4F19ryGTp7b0O3z+xzg48AvN8NS3pjkNDpw7KrqUeDV9K62PwYcBe6jO8duzkLHakV9zZpLnpP8djOuZ/5/lwOvAP7buGNcqSXaNrfNK+ldPrxjfJFqKUnWAb8O/FhV/XX/uurdX3Kq7jGZ5DLgSFXdN+5YpHHq2rkNa+L8Phn4FuD1VfXNwJPMG6IxxcfuGfSqr+cAzwFO48uHPHTKMI7VyUOKZWpU1QtPtDzJFnpfnj/rXW1iI/DHSS4EHgXO7tt8Y7NsoizUtjlJrgYuA769/v4G31PRtkVMe/xfJskp9H653lFVb20WP57kzKp6rLmkdGR8Ea7ItwIvTvJdwNOAr6I3hvD0JCc3FY6pP3Yams6d19DZcxu6f34fBg5X1fub13fTS567cOxeCHy4qj4OkOSt9I5nV47dnIWO1Yr6mjVXeV5IVd1fVV9TVZuqahO9k+VbqupjwD7gB5pZmc8HjvaV/6dCkm30Lqm9uKo+27dqH3BFkqcmOYfeoPk/GkeMK3QvcG4zM/hUehMd9o05phVrxg
neBjxUVf+rb9U+4Aebf/8g8PbVjm0QVfXyqtrYnFtXAP+nqq4Cfhd4SbPZ1LVLI9Op8xq6e25D98/vJg94JMl5zaJvBx6kA8eO3nCN5yf5iuY7Ote2Thy7PgsdqxXldz5hcAFJDtGbffpE84W6hd6ljM8CL6uqD4wzvuVKchB4KvCJZtH7quq6Zt0r6Y2DPk7vUuI7T7yXydRUO36R3izh26vqhjGHtGJJ/jnwe8D9/P3YwVfQGxt5F/C1wEeAl1bV/AkRUyHJVuA/V9VlSZ5LbzLYM4E/Ab6/qj4/zvg0Gbp0XsPaOLehu+d3kn9KbzLkqcDDwMvoFSCn/tgl+Wng39DLAf4E+CF6436n8tgluRPYCpwBPA78FPA2TnCsVprfmTxLkiRJLTlsQ5IkSWrJ5FmSJElqyeRZkiRJasnkWZIkSWrJ5FmSJElqyeRZkiRJasnkWZIkSWrp/wMSD7b5/v4ZRQAAAABJRU5ErkJggg==\n",
|
||
"text/plain": [
|
||
"<Figure size 864x288 with 2 Axes>"
|
||
]
|
||
},
|
||
"metadata": {
|
||
"needs_background": "light"
|
||
},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"df[(df['Quantity']>-50) & \n",
|
||
" (df['Quantity']<50) & \n",
|
||
" (df['UnitPrice']>0) & \n",
|
||
" (df['UnitPrice']<100)][['Quantity', 'UnitPrice']].hist(figsize=[12,4], bins=30, log=True)\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 21,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Country</th>\n",
|
||
" <th>CustomerID</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" <th>Unnamed: 0</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>1228</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>15485.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>2.55</td>\n",
|
||
" <td>1228</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1237</th>\n",
|
||
" <td>Norway</td>\n",
|
||
" <td>12433.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>1.06</td>\n",
|
||
" <td>1237</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1286</th>\n",
|
||
" <td>Norway</td>\n",
|
||
" <td>12433.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" <td>1286</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1293</th>\n",
|
||
" <td>Norway</td>\n",
|
||
" <td>12433.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0.85</td>\n",
|
||
" <td>1293</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1333</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>18144.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>1.69</td>\n",
|
||
" <td>1333</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14784</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>15061.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>10.95</td>\n",
|
||
" <td>14784</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14785</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>15061.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>1.45</td>\n",
|
||
" <td>14785</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14788</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>15061.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0.07</td>\n",
|
||
" <td>14788</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14974</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14739.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>0.72</td>\n",
|
||
" <td>14974</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14980</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14739.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>1.06</td>\n",
|
||
" <td>14980</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>\n",
|
||
"<p>258 rows × 9 columns</p>"
|
||
],
|
||
"text/plain": [
|
||
" Country CustomerID ... UnitPrice Unnamed: 0\n",
|
||
"1228 United Kingdom 15485.0 ... 2.55 1228\n",
|
||
"1237 Norway 12433.0 ... 1.06 1237\n",
|
||
"1286 Norway 12433.0 ... 1.25 1286\n",
|
||
"1293 Norway 12433.0 ... 0.85 1293\n",
|
||
"1333 United Kingdom 18144.0 ... 1.69 1333\n",
|
||
"... ... ... ... ... ...\n",
|
||
"14784 United Kingdom 15061.0 ... 10.95 14784\n",
|
||
"14785 United Kingdom 15061.0 ... 1.45 14785\n",
|
||
"14788 United Kingdom 15061.0 ... 0.07 14788\n",
|
||
"14974 United Kingdom 14739.0 ... 0.72 14974\n",
|
||
"14980 United Kingdom 14739.0 ... 1.06 14980\n",
|
||
"\n",
|
||
"[258 rows x 9 columns]"
|
||
]
|
||
},
|
||
"execution_count": 21,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df.query('Quantity>50 & UnitPrice<100')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Arithmetic Operations"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Numeric values"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 22,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"1000 1\n",
|
||
"1001 1\n",
|
||
"1002 1\n",
|
||
"1003 1\n",
|
||
"1004 12\n",
|
||
"Name: Quantity, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 22,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df['Quantity'].head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 23,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"1000 1.25\n",
|
||
"1001 1.25\n",
|
||
"1002 1.25\n",
|
||
"1003 1.25\n",
|
||
"1004 0.29\n",
|
||
"Name: UnitPrice, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 23,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df['UnitPrice'].head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 24,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"product = df['Quantity'] * df['UnitPrice']"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 25,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"1000 1.25\n",
|
||
"1001 1.25\n",
|
||
"1002 1.25\n",
|
||
"1003 1.25\n",
|
||
"1004 3.48\n",
|
||
"dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 25,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"product.head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"String concatenation"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 26,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"1000 United Kingdom21123\n",
|
||
"1001 United Kingdom21124\n",
|
||
"1002 United Kingdom21122\n",
|
||
"1003 United Kingdom84378\n",
|
||
"1004 United Kingdom21985\n",
|
||
" ... \n",
|
||
"14995 United Kingdom72349B\n",
|
||
"14996 United Kingdom72741\n",
|
||
"14997 United Kingdom22762\n",
|
||
"14998 United Kingdom21773\n",
|
||
"14999 United Kingdom22149\n",
|
||
"Length: 15000, dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 26,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df['Country'] + df['StockCode']"
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": "Python 3",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.7.5"
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 2
|
||
}
|