mirror of
https://github.com/elastic/eland.git
synced 2025-07-11 00:02:14 +08:00
* Improved read_csv docs + made 'to_eland' params consistent Note, will change API. * Removing additional args from pytest. doctests + nbval tests in the CI are not addressed by this PR.
1454 lines
64 KiB
Plaintext
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import eland as ed\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# Fix console size for consistent test results\n",
    "from eland.conftest import *"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Online Retail Analysis"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Getting Started\n",
    "\n",
    "To get started, let's create an `eland.DataFrame` by reading a CSV file. This creates and populates the \n",
    "`online-retail` index in the local Elasticsearch cluster."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = ed.read_csv(\"data/online-retail.csv.gz\",\n",
    "                 es_client='localhost',\n",
    "                 es_dest_index='online-retail',\n",
    "                 es_if_exists='replace',\n",
    "                 es_dropna=True,\n",
    "                 es_refresh=True,\n",
    "                 compression='gzip',\n",
    "                 index_col=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we see that the `\"_id\"` field was used to index our data frame."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'_id'"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.index.index_field"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we can check which fields from Elasticsearch are available to our eland data frame. `columns` is available as a parameter when instantiating the data frame, which lets you include only a subset of the fields in your index. Since we didn't set this parameter, we have access to all fields."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['Country', 'CustomerID', 'Description', 'InvoiceDate', 'InvoiceNo', 'Quantity', 'StockCode',\n",
       " 'UnitPrice'],\n",
       " dtype='object')"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.columns"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, let's see the data types of our fields. Running `df.dtypes`, we can see that Elasticsearch field types are mapped to pandas dtypes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Country object\n",
       "CustomerID float64\n",
       "Description object\n",
       "InvoiceDate object\n",
       "InvoiceNo object\n",
       "Quantity int64\n",
       "StockCode object\n",
       "UnitPrice float64\n",
       "dtype: object"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.dtypes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We also offer a `.info_es()` data frame method that shows information about the underlying index. It also lists the operations that data frame methods pass to Elasticsearch. More on this later."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "index_pattern: online-retail\n",
      "Index:\n",
      " index_field: _id\n",
      " is_source_field: False\n",
      "Mappings:\n",
      " capabilities:\n",
      " es_field_name is_source es_dtype es_date_format pd_dtype is_searchable is_aggregatable is_scripted aggregatable_es_field_name\n",
      "Country Country True keyword None object True True False Country\n",
      "CustomerID CustomerID True double None float64 True True False CustomerID\n",
      "Description Description True keyword None object True True False Description\n",
      "InvoiceDate InvoiceDate True keyword None object True True False InvoiceDate\n",
      "InvoiceNo InvoiceNo True keyword None object True True False InvoiceNo\n",
      "Quantity Quantity True long None int64 True True False Quantity\n",
      "StockCode StockCode True keyword None object True True False StockCode\n",
      "UnitPrice UnitPrice True double None float64 True True False UnitPrice\n",
      "Operations:\n",
      " tasks: []\n",
      " size: None\n",
      " sort_params: None\n",
      " _source: ['Country', 'CustomerID', 'Description', 'InvoiceDate', 'InvoiceNo', 'Quantity', 'StockCode', 'UnitPrice']\n",
      " body: {}\n",
      " post_processing: []\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(df.info_es())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Selecting and Indexing Data\n",
    "\n",
    "Now that we understand how to create a data frame and access its underlying attributes, let's see how we can select subsets of our data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### head and tail\n",
    "\n",
    "Much like pandas, eland data frames offer `.head(n)` and `.tail(n)` methods that return the first and last n rows, respectively."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       " .dataframe tbody tr th:only-of-type {\n",
       " vertical-align: middle;\n",
       " }\n",
       "\n",
       " .dataframe tbody tr th {\n",
       " vertical-align: top;\n",
       " }\n",
       "\n",
       " .dataframe thead th {\n",
       " text-align: right;\n",
       " }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       " <thead>\n",
       " <tr style=\"text-align: right;\">\n",
       " <th></th>\n",
       " <th>Country</th>\n",
       " <th>CustomerID</th>\n",
       " <th>...</th>\n",
       " <th>StockCode</th>\n",
       " <th>UnitPrice</th>\n",
       " </tr>\n",
       " </thead>\n",
       " <tbody>\n",
       " <tr>\n",
       " <th>1000</th>\n",
       " <td>United Kingdom</td>\n",
       " <td>14729.0</td>\n",
       " <td>...</td>\n",
       " <td>21123</td>\n",
       " <td>1.25</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>1001</th>\n",
       " <td>United Kingdom</td>\n",
       " <td>14729.0</td>\n",
       " <td>...</td>\n",
       " <td>21124</td>\n",
       " <td>1.25</td>\n",
       " </tr>\n",
       " </tbody>\n",
       "</table>\n",
       "</div>\n",
       "<p>2 rows × 8 columns</p>"
      ],
      "text/plain": [
       " Country CustomerID ... StockCode UnitPrice\n",
       "1000 United Kingdom 14729.0 ... 21123 1.25\n",
       "1001 United Kingdom 14729.0 ... 21124 1.25\n",
       "\n",
       "[2 rows x 8 columns]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head(2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "index_pattern: online-retail\n",
      "Index:\n",
      " index_field: _id\n",
      " is_source_field: False\n",
      "Mappings:\n",
      " capabilities:\n",
      " es_field_name is_source es_dtype es_date_format pd_dtype is_searchable is_aggregatable is_scripted aggregatable_es_field_name\n",
      "Country Country True keyword None object True True False Country\n",
      "CustomerID CustomerID True double None float64 True True False CustomerID\n",
      "Description Description True keyword None object True True False Description\n",
      "InvoiceDate InvoiceDate True keyword None object True True False InvoiceDate\n",
      "InvoiceNo InvoiceNo True keyword None object True True False InvoiceNo\n",
      "Quantity Quantity True long None int64 True True False Quantity\n",
      "StockCode StockCode True keyword None object True True False StockCode\n",
      "UnitPrice UnitPrice True double None float64 True True False UnitPrice\n",
      "Operations:\n",
      " tasks: [('tail': ('sort_field': '_doc', 'count': 2)), ('head': ('sort_field': '_doc', 'count': 2)), ('tail': ('sort_field': '_doc', 'count': 2))]\n",
      " size: 2\n",
      " sort_params: _doc:desc\n",
      " _source: ['Country', 'CustomerID', 'Description', 'InvoiceDate', 'InvoiceNo', 'Quantity', 'StockCode', 'UnitPrice']\n",
      " body: {}\n",
      " post_processing: [('sort_index'), ('head': ('count': 2)), ('tail': ('count': 2))]\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(df.tail(2).head(2).tail(2).info_es())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       " .dataframe tbody tr th:only-of-type {\n",
       " vertical-align: middle;\n",
       " }\n",
       "\n",
       " .dataframe tbody tr th {\n",
       " vertical-align: top;\n",
       " }\n",
       "\n",
       " .dataframe thead th {\n",
       " text-align: right;\n",
       " }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       " <thead>\n",
       " <tr style=\"text-align: right;\">\n",
       " <th></th>\n",
       " <th>Country</th>\n",
       " <th>CustomerID</th>\n",
       " <th>...</th>\n",
       " <th>StockCode</th>\n",
       " <th>UnitPrice</th>\n",
       " </tr>\n",
       " </thead>\n",
       " <tbody>\n",
       " <tr>\n",
       " <th>14998</th>\n",
       " <td>United Kingdom</td>\n",
       " <td>17419.0</td>\n",
       " <td>...</td>\n",
       " <td>21773</td>\n",
       " <td>1.25</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>14999</th>\n",
       " <td>United Kingdom</td>\n",
       " <td>17419.0</td>\n",
       " <td>...</td>\n",
       " <td>22149</td>\n",
       " <td>2.10</td>\n",
       " </tr>\n",
       " </tbody>\n",
       "</table>\n",
       "</div>\n",
       "<p>2 rows × 8 columns</p>"
      ],
      "text/plain": [
       " Country CustomerID ... StockCode UnitPrice\n",
       "14998 United Kingdom 17419.0 ... 21773 1.25\n",
       "14999 United Kingdom 17419.0 ... 22149 2.10\n",
       "\n",
       "[2 rows x 8 columns]"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.tail(2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Selecting Columns\n",
    "\n",
    "You can also pass a list of columns to select columns from the data frame in a specified order."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       " .dataframe tbody tr th:only-of-type {\n",
       " vertical-align: middle;\n",
       " }\n",
       "\n",
       " .dataframe tbody tr th {\n",
       " vertical-align: top;\n",
       " }\n",
       "\n",
       " .dataframe thead th {\n",
       " text-align: right;\n",
       " }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       " <thead>\n",
       " <tr style=\"text-align: right;\">\n",
       " <th></th>\n",
       " <th>Country</th>\n",
       " <th>InvoiceDate</th>\n",
       " </tr>\n",
       " </thead>\n",
       " <tbody>\n",
       " <tr>\n",
       " <th>1000</th>\n",
       " <td>United Kingdom</td>\n",
       " <td>2010-12-01 12:43:00</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>1001</th>\n",
       " <td>United Kingdom</td>\n",
       " <td>2010-12-01 12:43:00</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>1002</th>\n",
       " <td>United Kingdom</td>\n",
       " <td>2010-12-01 12:43:00</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>1003</th>\n",
       " <td>United Kingdom</td>\n",
       " <td>2010-12-01 12:43:00</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>1004</th>\n",
       " <td>United Kingdom</td>\n",
       " <td>2010-12-01 12:43:00</td>\n",
       " </tr>\n",
       " </tbody>\n",
       "</table>\n",
       "</div>\n",
       "<p>5 rows × 2 columns</p>"
      ],
      "text/plain": [
       " Country InvoiceDate\n",
       "1000 United Kingdom 2010-12-01 12:43:00\n",
       "1001 United Kingdom 2010-12-01 12:43:00\n",
       "1002 United Kingdom 2010-12-01 12:43:00\n",
       "1003 United Kingdom 2010-12-01 12:43:00\n",
       "1004 United Kingdom 2010-12-01 12:43:00\n",
       "\n",
       "[5 rows x 2 columns]"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[['Country', 'InvoiceDate']].head(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Boolean Indexing\n",
    "\n",
    "We also allow you to filter the data frame using boolean indexing. Under the hood, a boolean index maps to an Elasticsearch query (an equality test becomes a `term` query, for example) that is then passed to Elasticsearch to filter the index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'term': {'Country': 'Germany'}}\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       " .dataframe tbody tr th:only-of-type {\n",
       " vertical-align: middle;\n",
       " }\n",
       "\n",
       " .dataframe tbody tr th {\n",
       " vertical-align: top;\n",
       " }\n",
       "\n",
       " .dataframe thead th {\n",
       " text-align: right;\n",
       " }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       " <thead>\n",
       " <tr style=\"text-align: right;\">\n",
       " <th></th>\n",
       " <th>Country</th>\n",
       " <th>CustomerID</th>\n",
       " <th>...</th>\n",
       " <th>StockCode</th>\n",
       " <th>UnitPrice</th>\n",
       " </tr>\n",
       " </thead>\n",
       " <tbody>\n",
       " <tr>\n",
       " <th>1109</th>\n",
       " <td>Germany</td>\n",
       " <td>12662.0</td>\n",
       " <td>...</td>\n",
       " <td>22809</td>\n",
       " <td>2.95</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>1110</th>\n",
       " <td>Germany</td>\n",
       " <td>12662.0</td>\n",
       " <td>...</td>\n",
       " <td>84347</td>\n",
       " <td>2.55</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>1111</th>\n",
       " <td>Germany</td>\n",
       " <td>12662.0</td>\n",
       " <td>...</td>\n",
       " <td>84945</td>\n",
       " <td>0.85</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>1112</th>\n",
       " <td>Germany</td>\n",
       " <td>12662.0</td>\n",
       " <td>...</td>\n",
       " <td>22242</td>\n",
       " <td>1.65</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>1113</th>\n",
       " <td>Germany</td>\n",
       " <td>12662.0</td>\n",
       " <td>...</td>\n",
       " <td>22244</td>\n",
       " <td>1.95</td>\n",
       " </tr>\n",
       " </tbody>\n",
       "</table>\n",
       "</div>\n",
       "<p>5 rows × 8 columns</p>"
      ],
      "text/plain": [
       " Country CustomerID ... StockCode UnitPrice\n",
       "1109 Germany 12662.0 ... 22809 2.95\n",
       "1110 Germany 12662.0 ... 84347 2.55\n",
       "1111 Germany 12662.0 ... 84945 0.85\n",
       "1112 Germany 12662.0 ... 22242 1.65\n",
       "1113 Germany 12662.0 ... 22244 1.95\n",
       "\n",
       "[5 rows x 8 columns]"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# the construction of a boolean vector maps directly to an elasticsearch query\n",
    "print(df['Country']=='Germany')\n",
    "df[(df['Country']=='Germany')].head(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also filter the data frame using a list of values."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'terms': {'Country': ['Germany', 'United States']}}\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       " .dataframe tbody tr th:only-of-type {\n",
       " vertical-align: middle;\n",
       " }\n",
       "\n",
       " .dataframe tbody tr th {\n",
       " vertical-align: top;\n",
       " }\n",
       "\n",
       " .dataframe thead th {\n",
       " text-align: right;\n",
       " }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       " <thead>\n",
       " <tr style=\"text-align: right;\">\n",
       " <th></th>\n",
       " <th>Country</th>\n",
       " <th>CustomerID</th>\n",
       " <th>...</th>\n",
       " <th>StockCode</th>\n",
       " <th>UnitPrice</th>\n",
       " </tr>\n",
       " </thead>\n",
       " <tbody>\n",
       " <tr>\n",
       " <th>1000</th>\n",
       " <td>United Kingdom</td>\n",
       " <td>14729.0</td>\n",
       " <td>...</td>\n",
       " <td>21123</td>\n",
       " <td>1.25</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>1001</th>\n",
       " <td>United Kingdom</td>\n",
       " <td>14729.0</td>\n",
       " <td>...</td>\n",
       " <td>21124</td>\n",
       " <td>1.25</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>1002</th>\n",
       " <td>United Kingdom</td>\n",
       " <td>14729.0</td>\n",
       " <td>...</td>\n",
       " <td>21122</td>\n",
       " <td>1.25</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>1003</th>\n",
       " <td>United Kingdom</td>\n",
       " <td>14729.0</td>\n",
       " <td>...</td>\n",
       " <td>84378</td>\n",
       " <td>1.25</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>1004</th>\n",
       " <td>United Kingdom</td>\n",
       " <td>14729.0</td>\n",
       " <td>...</td>\n",
       " <td>21985</td>\n",
       " <td>0.29</td>\n",
       " </tr>\n",
       " </tbody>\n",
       "</table>\n",
       "</div>\n",
       "<p>5 rows × 8 columns</p>"
      ],
      "text/plain": [
       " Country CustomerID ... StockCode UnitPrice\n",
       "1000 United Kingdom 14729.0 ... 21123 1.25\n",
       "1001 United Kingdom 14729.0 ... 21124 1.25\n",
       "1002 United Kingdom 14729.0 ... 21122 1.25\n",
       "1003 United Kingdom 14729.0 ... 84378 1.25\n",
       "1004 United Kingdom 14729.0 ... 21985 0.29\n",
       "\n",
       "[5 rows x 8 columns]"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "print(df['Country'].isin(['Germany', 'United States']))\n",
    "df[df['Country'].isin(['Germany', 'United Kingdom'])].head(5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also combine boolean vectors to further filter the data frame."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       " .dataframe tbody tr th:only-of-type {\n",
       " vertical-align: middle;\n",
       " }\n",
       "\n",
       " .dataframe tbody tr th {\n",
       " vertical-align: top;\n",
       " }\n",
       "\n",
       " .dataframe thead th {\n",
       " text-align: right;\n",
       " }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       " <thead>\n",
       " <tr style=\"text-align: right;\">\n",
       " <th></th>\n",
       " <th>Country</th>\n",
       " <th>CustomerID</th>\n",
       " <th>...</th>\n",
       " <th>StockCode</th>\n",
       " <th>UnitPrice</th>\n",
       " </tr>\n",
       " </thead>\n",
       " <tbody>\n",
       " </tbody>\n",
       "</table>\n",
       "</div>\n",
       "<p>0 rows × 8 columns</p>"
      ],
      "text/plain": [
       "Empty DataFrame\n",
       "Columns: [Country, CustomerID, Description, InvoiceDate, InvoiceNo, Quantity, StockCode, UnitPrice]\n",
       "Index: []\n",
       "\n",
       "[0 rows x 8 columns]"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[(df['Country']=='Germany') & (df['Quantity']>90)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using this example, let's see how eland translates this boolean filter to an Elasticsearch `bool` query."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "index_pattern: online-retail\n",
      "Index:\n",
      " index_field: _id\n",
      " is_source_field: False\n",
      "Mappings:\n",
      " capabilities:\n",
      " es_field_name is_source es_dtype es_date_format pd_dtype is_searchable is_aggregatable is_scripted aggregatable_es_field_name\n",
      "Country Country True keyword None object True True False Country\n",
      "CustomerID CustomerID True double None float64 True True False CustomerID\n",
      "Description Description True keyword None object True True False Description\n",
      "InvoiceDate InvoiceDate True keyword None object True True False InvoiceDate\n",
      "InvoiceNo InvoiceNo True keyword None object True True False InvoiceNo\n",
      "Quantity Quantity True long None int64 True True False Quantity\n",
      "StockCode StockCode True keyword None object True True False StockCode\n",
      "UnitPrice UnitPrice True double None float64 True True False UnitPrice\n",
      "Operations:\n",
      " tasks: [('boolean_filter': ('boolean_filter': {'bool': {'must': [{'term': {'Country': 'Germany'}}, {'range': {'Quantity': {'gt': 90}}}]}}))]\n",
      " size: None\n",
      " sort_params: None\n",
      " _source: ['Country', 'CustomerID', 'Description', 'InvoiceDate', 'InvoiceNo', 'Quantity', 'StockCode', 'UnitPrice']\n",
      " body: {'query': {'bool': {'must': [{'term': {'Country': 'Germany'}}, {'range': {'Quantity': {'gt': 90}}}]}}}\n",
      " post_processing: []\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(df[(df['Country']=='Germany') & (df['Quantity']>90)].info_es())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Aggregation and Descriptive Statistics\n",
    "\n",
    "Let's begin to ask some questions of our data and use eland to get the answers."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**How many different countries are there?**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "16"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['Country'].nunique()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**What is the total sum of products ordered?**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "111960.0"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['Quantity'].sum()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Show me the sum, mean, min, and max of the `Quantity` and `UnitPrice` fields**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       " .dataframe tbody tr th:only-of-type {\n",
       " vertical-align: middle;\n",
       " }\n",
       "\n",
       " .dataframe tbody tr th {\n",
       " vertical-align: top;\n",
       " }\n",
       "\n",
       " .dataframe thead th {\n",
       " text-align: right;\n",
       " }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       " <thead>\n",
       " <tr style=\"text-align: right;\">\n",
       " <th></th>\n",
       " <th>Quantity</th>\n",
       " <th>UnitPrice</th>\n",
       " </tr>\n",
       " </thead>\n",
       " <tbody>\n",
       " <tr>\n",
       " <th>sum</th>\n",
       " <td>111960.000</td>\n",
       " <td>61548.490000</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>mean</th>\n",
       " <td>7.464</td>\n",
       " <td>4.103233</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>max</th>\n",
       " <td>2880.000</td>\n",
       " <td>950.990000</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>min</th>\n",
       " <td>-9360.000</td>\n",
       " <td>0.000000</td>\n",
       " </tr>\n",
       " </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       " Quantity UnitPrice\n",
       "sum 111960.000 61548.490000\n",
       "mean 7.464 4.103233\n",
       "max 2880.000 950.990000\n",
       "min -9360.000 0.000000"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df[['Quantity','UnitPrice']].agg(['sum', 'mean', 'max', 'min'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Give me descriptive statistics for the entire data frame**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       " .dataframe tbody tr th:only-of-type {\n",
       " vertical-align: middle;\n",
       " }\n",
       "\n",
       " .dataframe tbody tr th {\n",
       " vertical-align: top;\n",
       " }\n",
       "\n",
       " .dataframe thead th {\n",
       " text-align: right;\n",
       " }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       " <thead>\n",
       " <tr style=\"text-align: right;\">\n",
       " <th></th>\n",
       " <th>CustomerID</th>\n",
       " <th>Quantity</th>\n",
       " <th>UnitPrice</th>\n",
       " </tr>\n",
       " </thead>\n",
       " <tbody>\n",
       " <tr>\n",
       " <th>count</th>\n",
       " <td>10729.000000</td>\n",
       " <td>15000.000000</td>\n",
       " <td>15000.000000</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>mean</th>\n",
       " <td>15590.776680</td>\n",
       " <td>7.464000</td>\n",
       " <td>4.103233</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>std</th>\n",
       " <td>1764.025160</td>\n",
       " <td>85.924387</td>\n",
       " <td>20.104873</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>min</th>\n",
       " <td>12347.000000</td>\n",
       " <td>-9360.000000</td>\n",
       " <td>0.000000</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>25%</th>\n",
       " <td>14215.123301</td>\n",
       " <td>1.000000</td>\n",
       " <td>1.250100</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>50%</th>\n",
       " <td>15654.828552</td>\n",
       " <td>2.000000</td>\n",
       " <td>2.510000</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>75%</th>\n",
       " <td>17218.003301</td>\n",
       " <td>6.570576</td>\n",
       " <td>4.210000</td>\n",
       " </tr>\n",
       " <tr>\n",
       " <th>max</th>\n",
       " <td>18239.000000</td>\n",
       " <td>2880.000000</td>\n",
       " <td>950.990000</td>\n",
       " </tr>\n",
       " </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       " CustomerID Quantity UnitPrice\n",
       "count 10729.000000 15000.000000 15000.000000\n",
       "mean 15590.776680 7.464000 4.103233\n",
       "std 1764.025160 85.924387 20.104873\n",
       "min 12347.000000 -9360.000000 0.000000\n",
       "25% 14215.123301 1.000000 1.250100\n",
       "50% 15654.828552 2.000000 2.510000\n",
       "75% 17218.003301 6.570576 4.210000\n",
       "max 18239.000000 2880.000000 950.990000"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# NBVAL_IGNORE_OUTPUT\n",
    "df.describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Show me a histogram of numeric columns**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAtUAAAEICAYAAACQ+wgHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3df7RfdX3n++erIP5AS/jRHjEwTbrItQvNaJkswOvc3jNiIaDTMLOUoUPHwNCV6Vpotc2dFuzcS6uyFnZkKNhKb6bQBssIlGrJVKum6FkdZgZE1IqAXlIIkpQfSgI1UKlh3veP7+fI15iTc072yTnnu/N8rHXWd+/P/uz9/byzz/6ed/b3sz+fVBWSJEmS9t+PLHQDJEmSpFFnUi1JkiR1ZFItSZIkdWRSLUmSJHVkUi1JkiR1ZFItSZIkdWRSLQ1JsivJTy50OyRJkOT3k/zfc3i885J8dq6OJw0zqdaCS3J+knuSPJvksSQfSXLEPLzvRJJfHC6rqpdX1YNt+x8l+cCBbock9VWSSnLCHmW/meSPZ7J/Vf1SVb2/7TeeZNtejvW9dkPkqST/I8kb9nG8G6rq9P2JRZqOSbUWVJL1wAeBfw8cAZwKLAM+m+RFC9g0SdJouKmqXg78GHA78PEk2bNSkkPnvWU6qJhUa8Ek+VHgt4B3VdWnq+p7VbUVOAf4SeBf73m3eM87FUkuTvI3Sb6T5L4k/2Jo2/lJbk/yoSQ7kzyU5My27TLg/wB+t93h+N1WXklOSLIOOA/4tbb9vyb590n+dI8Yrk5y1YH6N5KkPpv8TE+yPskTSR5NcsHQ9j9K8oEkhwN/AbyqfSbvSvKq4WNV1feAjcArgaPb34D/nuTKJE8Cvzn5d2Ho+K9JsjnJjiSPJ3lvK/+Rob8vTya5OclR8/FvotFlUq2F9L8DLwE+PlxYVbuATwEz+Yrubxgkx0cwSND/OMmxQ9tPAb4BHAP8NnBtklTVbwD/DXhn6/Lxzj3asAG4Afjttv2fA38MrE6yBL5/1+Nc4PrZhS1JGvJKBp/hS4ELgd9LcuRwhap6BjgT+Nv2mfzyqvrb4TpJXgycDzxSVd9uxacADwJjwGV71H8F8JfAp4FXAScAt7XN7wLOBv7Ptm0n8HtzEaz6y6RaC+kY4NtVtXsv2x5l8FXePlXVn1TV31bV/6qqm4AHgJOHqjxcVf+5qp5ncAfjWAYfrrNWVY8CfwW8vRWtbu2/e3+OJ0kC4HvA+9q3lZ8CdgGvnsX+5yR5CngE+CfAvxja9rdV9eGq2l1Vf7/Hfm8FHquqK6rqu1X1naq6s237JeA3qmpbVT0H/CbwNruQaF9MqrWQvg0cM8WH1LFt+z4leUeSr7QHVJ4CXssgWZ/02ORCVT3bFl/eoc0bgV9oy78AfLTDsSSp754H9nw+5kUMEulJT+5xc+VZZvc5fXNVLamqH6+qN+1xo+ORfex3PINvO/fmJ4BPDP1tuZ9BLPt1U0YHB5NqLaT/CTwH/MvhwiQvZ/A13wTwDPCyoc2vHKr3E8B/Bt4JHF1VS4CvAT/0gMoUaj+2/xnwj5O8lsFdjhtm+F6SdDD6JoOHz4ctBx7ej2NN95k9230eYfD8zlTbzmzJ+uTPS6pq+360QQcJk2otmKp6mkE/6A8nWZ3kRUmWATczuEt9A/AV4KwkRyV5JfCeoUMczuAD81sA7eGW186iCY8z9QfqXrdX1XeBW4D/Anyhqr45i/eTpIPNTcB/SHJce/jvzcA/Z/A5OluPM3gAca6GXP1z4Ngk70ny4iSvSHJK2/b7wGXt5g1JfizJmjl6X/WUSbUWVFX9NvBe4EPAd4CHGNyZfnN7MOWjwF8DW4HPMviAntz3PuAKBne8HwdWAv99Fm9/FYM+cjuTXL2X7dcCJ7av//5sqHxjey+7fkjSvr0P+B8MhrrbyeCB8fOq6muzPVBVfR34GPBg+1x+1XT7TH
O87wA/yyDJf4zBMzn/rG2+CtjEYHjX7wB3MHjoUZpSqvbn2xTpwGh3m98HvHGx3gVO8o+ArwOvrKq/W+j2SJKkhedTrFpUquoPk+xmMNzeokuqk/wI8KvAjSbUkiRpkneqpRlqkw88zuABm9VVta+nyiVJ0kHEpFqSJEnqyAcVJUmSpI4WdZ/qY445ppYtW7bQzQDgmWee4fDDD1/oZhwQxjZ6+hoXLM7Y7r777m9X1bQzfGr/7M9n/WL8PZkrfY4N+h1fn2OD/sfX9bN+USfVy5Yt44tf/OJCNwOAiYkJxsfHF7oZB4SxjZ6+xgWLM7Yk+zNRhWZofz7rF+PvyVzpc2zQ7/j6HBv0P76un/V2/5AkSZI6MqmWJEmSOjKpliRJkjoyqZYkSZI6MqmWJEmSOjKpliRJkjoyqZYkSZI6MqmWJEmSOjKpliRJkjpa1DMqSnNh2cWfnLbO1svfMg8tkTRX7tn+NOdPc217XUuaT96pliRJkjoyqZYkSZI6MqmWJEmSOjKpliRJkjoyqZYkSZI6MqmWJEmSOppRUp3kV5Lcm+RrST6W5CVJlie5M8mWJDclOazVfXFb39K2Lxs6ziWt/BtJzjgwIUmSJEnza9qkOslS4JeBVVX1WuAQ4Fzgg8CVVXUCsBO4sO1yIbCzlV/Z6pHkxLbfa4DVwEeSHDK34UiSJEnzb6bdPw4FXprkUOBlwKPAm4Bb2vaNwNlteU1bp20/LUla+Y1V9VxVPQRsAU7uHoIkSZK0sKadUbGqtif5EPBN4O+BzwJ3A09V1e5WbRuwtC0vBR5p++5O8jRwdCu/Y+jQw/t8X5J1wDqAsbExJiYmZh/VAbBr165F05a51vfY1q98ftp6oxZ/389ZX2OTJPXXtEl1kiMZ3GVeDjwF/AmD7hsHRFVtADYArFq1qsbHxw/UW83KxMQEi6Utc63vsV1x+zPT1tt63viBb8wc6vs562tskqT+mkn3jzcDD1XVt6rqe8DHgTcCS1p3EIDjgO1teTtwPEDbfgTw5HD5XvaRJEmSRtZMkupvAqcmeVnrG30acB/weeBtrc5a4Na2vKmt07Z/rqqqlZ/bRgdZDqwAvjA3YUiSJEkLZyZ9qu9McgvwJWA38GUG3TM+CdyY5AOt7Nq2y7XAR5NsAXYwGPGDqro3yc0MEvLdwEVVNX1nV0mSJGmRmzapBqiqS4FL9yh+kL2M3lFV3wXePsVxLgMum2UbJUmSpEXNGRUlSZKkjkyqJUmSpI5MqiVJkqSOTKolSZKkjkyqJUmSpI5MqiVJJPmVJPcm+VqSjyV5SZLlSe5MsiXJTUkOa3Vf3Na3tO3Lho5zSSv/RpIzFioeSZpvJtWSdJBLshT4ZWBVVb0WOITBHAMfBK6sqhOAncCFbZcLgZ2t/MpWjyQntv1eA6wGPpLkkPmMRZIWikm1JAkG8xa8NMmhwMuAR4E3Abe07RuBs9vymrZO235am3F3DXBjVT1XVQ8BW9jLfAaS1EczmvxFktRfVbU9yYeAbwJ/D3wWuBt4qqp2t2rbgKVteSnwSNt3d5KngaNb+R1Dhx7e5wckWQesAxgbG2NiYmJWbR57KaxfuXufdWZ7zMVi165dI9v2mehzfH2ODfofX1cm1ZJ0kEtyJIO7zMuBp4A/YdB944Cpqg3ABoBVq1bV+Pj4rPb/8A23csU9+/4TtvW82R1zsZiYmGC2/x6jpM/x9Tk26H98Xdn9Q5L0ZuChqvpWVX0P+DjwRmBJ6w4CcBywvS1vB44HaNuPAJ4cLt/LPpLUaybVkqRvAqcmeVnrG30acB/weeBtrc5a4Na2vKmt07Z/rqqqlZ/bRgdZDqwAvjBPMUjSgrL7hyQd5KrqziS3AF8CdgNfZtA145PAjUk+0MqubbtcC3w0yRZgB4MRP6iqe5PczCAh3w1cVFXPz2swkrRATKolSVTVpcClexQ/yF5G76iq7wJvn+I4lwGXzXkDJWmRs/uHJEmS1NG0SXWSVyf5yt
DP3yV5T5KjkmxO8kB7PbLVT5Kr24xaX01y0tCx1rb6DyRZO/W7SpIkSaNj2qS6qr5RVa+vqtcD/wR4FvgEcDFwW1WtAG5r6wBnMng4ZQWDMUivAUhyFIOvFk9h8HXipZOJuCRJkjTKZtv94zTgb6rqYX5wRq09Z9q6vgbuYDAk07HAGcDmqtpRVTuBzRzgcVAlSZKk+TDbpPpc4GNteayqHm3LjwFjbfn7M201kzNqTVUuSZIkjbQZj/6R5DDg54BL9txWVZWk5qJBXaeuPVD6PDVn32Nbv3L6Eb1GLf6+n7O+xiZJ6q/ZDKl3JvClqnq8rT+e5NiqerR173iilU81o9Z2YHyP8ok936Tr1LUHSp+n5ux7bFfc/sy09UZtOuO+n7O+xiZJ6q/ZdP/4eV7o+gE/OKPWnjNtvaONAnIq8HTrJvIZ4PQkR7YHFE9vZZIkSdJIm9Gd6iSHAz8L/Luh4suBm5NcCDwMnNPKPwWcBWxhMFLIBQBVtSPJ+4G7Wr33VdWOzhFIkiRJC2xGSXVVPQMcvUfZkwxGA9mzbgEXTXGc64DrZt9MSZIkafFyRkVJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpoxkl1UmWJLklydeT3J/kDUmOSrI5yQPt9chWN0muTrIlyVeTnDR0nLWt/gNJ1h6ooCRJkqT5NNM71VcBn66qnwJeB9wPXAzcVlUrgNvaOsCZwIr2sw64BiDJUcClwCnAycClk4m4JEmSNMqmTaqTHAH8DHAtQFX9Q1U9BawBNrZqG4Gz2/Ia4PoauANYkuRY4Axgc1XtqKqdwGZg9ZxGI0mSJC2AQ2dQZznwLeAPk7wOuBt4NzBWVY+2Oo8BY215KfDI0P7bWtlU5T8gyToGd7gZGxtjYmJiprEcULt27Vo0bZlrfY9t/crnp603avH3/Zz1NTZJUn/NJKk+FDgJeFdV3ZnkKl7o6gFAVVWSmosGVdUGYAPAqlWranx8fC4O29nExASLpS1zre+xXXH7M9PW23re+IFvzBzq+znra2ySpP6aSZ/qbcC2qrqzrd/CIMl+vHXroL0+0bZvB44f2v+4VjZVuSRJkjTSpk2qq+ox4JEkr25FpwH3AZuAyRE81gK3tuVNwDvaKCCnAk+3biKfAU5PcmR7QPH0ViZJkiSNtJl0/wB4F3BDksOAB4ELGCTkNye5EHgYOKfV/RRwFrAFeLbVpap2JHk/cFer976q2jEnUUiSJEkLaEZJdVV9BVi1l02n7aVuARdNcZzrgOtm00BJkiRpsXNGRUmSJKkjk2pJkiSpI5NqSRJJliS5JcnXk9yf5A1JjkqyOckD7fXIVjdJrk6yJclXk5w0dJy1rf4DSdZO/Y6S1C8m1ZIkgKuAT1fVTwGvA+5nMCfBbVW1AriNF+YoOBNY0X7WAdcAJDkKuBQ4BTgZuHQyEZekvjOplqSDXJIjgJ8BrgWoqn+oqqeANcDGVm0jcHZbXgNcXwN3AEvafAVnAJurakdV7QQ2A6vnMRRJWjAzHVJPktRfy4FvAX+Y5HXA3cC7gbE2zwDAY8BYW14KPDK0/7ZWNlX5D0myjsFdbsbGxmY9Nf3YS2H9yt37rDOq093v2rVrZNs+E32Or8+xQf/j68qkWpJ0KIOZct9VVXcmuYoXunoAg+FSk9RcvWFVbQA2AKxatapmOzX9h2+4lSvu2fefsK3nze6Yi8XExASz/fcYJX2Or8+xQf/j68ruH5KkbcC2qrqzrd/CIMl+vHXroL0+0bZvB44f2v+4VjZVuST1nkm1JB3kquox4JEkr25FpwH3AZuAyRE81gK3tuVNwDvaKCCnAk+3biKfAU5PcmR7QPH0ViZJvWf3D0kSwLuAG5IcBjwIXMDgxsvNSS4EHg
bOaXU/BZwFbAGebXWpqh1J3g/c1eq9r6p2zF8IkrRwTKolSVTVV4BVe9l02l7qFnDRFMe5DrhublsnSYuf3T8kSZKkjkyqJUmSpI5MqiVJkqSOTKolSZKkjkyqJUmSpI5mlFQn2ZrkniRfSfLFVnZUks1JHmivR7byJLk6yZYkX01y0tBx1rb6DyRZO9X7SZIkSaNkNneq/1lVvb6qJodcuhi4rapWALfxwpS2ZwIr2s864BoYJOHApcApwMnApZOJuCRJkjTKunT/WANsbMsbgbOHyq+vgTuAJW162zOAzVW1o6p2ApuB1R3eX5IkSVoUZjr5SwGfTVLA/1tVG4CxNi0twGPAWFteCjwytO+2VjZV+Q9Iso7BHW7GxsaYmJiYYRMPrF27di2atsy1vse2fuXz09Ybtfj7fs76Gpskqb9mmlT/06ranuTHgc1Jvj68saqqJdydtYR9A8CqVatqfHx8Lg7b2cTEBIulLXOt77Fdcfsz09bbet74gW/MHOr7OetrbJKk/ppR94+q2t5enwA+waBP9OOtWwft9YlWfTtw/NDux7WyqcolSZKkkTZtUp3k8CSvmFwGTge+BmwCJkfwWAvc2pY3Ae9oo4CcCjzduol8Bjg9yZHtAcXTW5kkSZI00mbS/WMM+ESSyfr/pao+neQu4OYkFwIPA+e0+p8CzgK2AM8CFwBU1Y4k7wfuavXeV1U75iwSSZIkaYFMm1RX1YPA6/ZS/iRw2l7KC7hoimNdB1w3+2ZKkiRJi5czKkqSJEkdmVRLkiRJHZlUS5IkSR2ZVEuSJEkdmVRLkiRJHZlUS5IkSR2ZVEuSJEkdmVRLkiRJHZlUS5IkSR2ZVEuSJEkdmVRLkiRJHZlUS5IkSR2ZVEuSJEkdmVRLkiRJHZlUS5IkSR3NOKlOckiSLyf587a+PMmdSbYkuSnJYa38xW19S9u+bOgYl7TybyQ5Y66DkSRJkhbCbO5Uvxu4f2j9g8CVVXUCsBO4sJVfCOxs5Ve2eiQ5ETgXeA2wGvhIkkO6NV+SJElaeDNKqpMcB7wF+IO2HuBNwC2tykbg7La8pq3Ttp/W6q8Bbqyq56rqIWALcPJcBCFJkiQtpENnWO93gF8DXtHWjwaeqqrdbX0bsLQtLwUeAaiq3UmebvWXAncMHXN4n+9Lsg5YBzA2NsbExMRMYzmgdu3atWjaMtf6Htv6lc9PW2/U4u/7OetrbJKk/po2qU7yVuCJqro7yfiBblBVbQA2AKxatarGxw/4W87IxMQEi6Utc63vsV1x+zPT1tt63viBb8wc6vs562tskqT+msmd6jcCP5fkLOAlwI8CVwFLkhza7lYfB2xv9bcDxwPbkhwKHAE8OVQ+aXgfSZIkaWRN26e6qi6pquOqahmDBw0/V1XnAZ8H3taqrQVubcub2jpt++eqqlr5uW10kOXACuALcxaJJEmStEC6jFP968CvJtnCoM/0ta38WuDoVv6rwMUAVXUvcDNwH/Bp4KKqmr6zqyRpXjh0qiTtv5k+qAhAVU0AE235QfYyekdVfRd4+xT7XwZcNttGSpLmxeTQqT/a1ieHTr0xye8zGDL1GoaGTk1ybqv3r/YYOvVVwF8m+d+8gSLpYOCMipIkh06VpI5mdadaktRb8zZ0KnQfPnXspbB+5e591hnVoRn7Pqxkn+Prc2zQ//i6MqmWpIPcfA+dCt2HT/3wDbdyxT37/hM2akNlTur7sJJ9jq/PsUH/4+vKpFqS5NCpktSRfaol6SDn0KmS1J13qiVJU/l14MYkHwC+zA8OnfrRNnTqDgaJOFV1b5LJoVN349Cpkg4iJtWSpO9z6FRJ2j92/5AkSZI6MqmWJEmSOjKpliRJkjoyqZYkSZI6MqmWJEmSOnL0D0lSLy27+JPT1tl6+VvmoSWSDgbeqZYkSZI6MqmWJEmSOjKpliRJkjqaNqlO8pIkX0jy10nuTfJbrXx5kjuTbElyU5LDWvmL2/qWtn3Z0LEuaeXfSH
LGgQpKkiRJmk8zuVP9HPCmqnod8HpgdZJTgQ8CV1bVCcBO4MJW/0JgZyu/stUjyYnAucBrgNXAR5IcMpfBSJIkSQth2qS6Bna11Re1nwLeBNzSyjcCZ7flNW2dtv20JGnlN1bVc1X1ELAFOHlOopAkSZIW0IyG1Gt3lO8GTgB+D/gb4Kmq2t2qbAOWtuWlwCMAVbU7ydPA0a38jqHDDu8z/F7rgHUAY2NjTExMzC6iA2TXrl2Lpi1zre+xrV/5/LT1Ri3+vp+zvsYmSeqvGSXVVfU88PokS4BPAD91oBpUVRuADQCrVq2q8fHxA/VWszIxMcFiactc63tsV9z+zLT1tp43fuAbM4f6fs76Gpskqb9mNfpHVT0FfB54A7AkyWRSfhywvS1vB44HaNuPAJ4cLt/LPpIkSdLImsnoHz/W7lCT5KXAzwL3M0iu39aqrQVubcub2jpt++eqqlr5uW10kOXACuALcxWIJEmStFBm0v3jWGBj61f9I8DNVfXnSe4DbkzyAeDLwLWt/rXAR5NsAXYwGPGDqro3yc3AfcBu4KLWrUSSJEkaadMm1VX1VeCn91L+IHsZvaOqvgu8fYpjXQZcNvtmSpIkSYuXMypKkiRJHZlUS5IkSR2ZVEuSJEkdmVRLkiRJHZlUS5IkSR2ZVEuSJEkdzWiacqnvll38yWnrbL38LfPQEkmSNIq8Uy1JkiR1ZFItSZIkdWRSLUmSJHVkUi1JkiR1ZFItSZIkdWRSLUmSJHVkUi1JkiR1ZFItSZIkdTRtUp3k+CSfT3JfknuTvLuVH5Vkc5IH2uuRrTxJrk6yJclXk5w0dKy1rf4DSdYeuLAkSZKk+TOTO9W7gfVVdSJwKnBRkhOBi4HbqmoFcFtbBzgTWNF+1gHXwCAJBy4FTgFOBi6dTMQlSZKkUTZtUl1Vj1bVl9ryd4D7gaXAGmBjq7YROLstrwGur4E7gCVJjgXOADZX1Y6q2glsBlbPaTSSJEnSAphVn+oky4CfBu4Exqrq0bbpMWCsLS8FHhnabVsrm6pckiRJGmmHzrRikpcDfwq8p6r+Lsn3t1VVJam5aFCSdQy6jTA2NsbExMRcHLazXbt2LZq2zLW+x7Z+5fNzcqzF9G/U93PW19gWqyTHA9czuDlSwIaquqp127sJWAZsBc6pqp0Z/AG4CjgLeBY4f/Ibzfa8zH9oh/5AVW1Ekg4CM0qqk7yIQUJ9Q1V9vBU/nuTYqnq0de94opVvB44f2v24VrYdGN+jfGLP96qqDcAGgFWrVtX4+PieVRbExMQEi6Utc63vsV1x+zNzcqyt543PyXHmQt/PWV9jW8Qmn535UpJXAHcn2Qycz+DZmcuTXMzg2Zlf5wefnTmFwbMzpww9O7OKQXJ+d5JNrcufJPXaTEb/CHAtcH9V/aehTZuAyRE81gK3DpW/o40CcirwdOsm8hng9CRHtgcUT29lkqQF5LMzktTdTO5UvxH4N8A9Sb7Syt4LXA7cnORC4GHgnLbtUwy+EtzC4GvBCwCqakeS9wN3tXrvq6odcxKFJGlOzNezM127+o29FNav3D2rffZmMXY16nsXqD7H1+fYoP/xdTVtUl1VtwOZYvNpe6lfwEVTHOs64LrZNFCSND/m69mZdrxOXf0+fMOtXHHPjB8LmtJi6tY1qe9doPocX59jg/7H15UzKkqS9vnsTNs+02dn9lYuSb1nUi1JBzmfnZGk7rp/dyZJGnU+OyNJHZlUS9JBzmdnJKk7u39IkiRJHZlUS5IkSR2ZVEuSJEkdmVRLkiRJHZlUS5IkSR05+oc0R5Zd/Mlp62y9/C3z0BJJkjTfvFMtSZIkdWRSLUmSJHVkUi1JkiR1ZFItSZIkdWRSLUmSJHVkUi1JkiR1NG1SneS6JE8k+dpQ2VFJNid5oL0e2cqT5OokW5J8NclJQ/usbfUfSLL2wIQjSZIkzb+Z3Kn+I2D1HmUXA7dV1QrgtrYOcCawov2sA66BQRIOXAqcApwMXD
qZiEuSJEmjbtqkuqr+CtixR/EaYGNb3gicPVR+fQ3cASxJcixwBrC5qnZU1U5gMz+cqEuSJEkjaX9nVByrqkfb8mPAWFteCjwyVG9bK5uq/IckWcfgLjdjY2NMTEzsZxPn1q5duxZNW+Za32Nbv/L5OTnWdP9G61fu7nyMmer7OetrbFp8ppsJ1VlQJc1U52nKq6qS1Fw0ph1vA7ABYNWqVTU+Pj5Xh+5kYmKCxdKWudb32K64/Zk5OdbW88b3uf38mUxTPs0xZqrv56yvsUmS+mt/R/94vHXroL0+0cq3A8cP1TuulU1VLkmSJI28/U2qNwGTI3isBW4dKn9HGwXkVODp1k3kM8DpSY5sDyie3sokSZKkkTdt948kHwPGgWOSbGMwisflwM1JLgQeBs5p1T8FnAVsAZ4FLgCoqh1J3g/c1eq9r6r2fPhRWtSm63spSZIOXtMm1VX181NsOm0vdQu4aIrjXAdcN6vWSZIkSSPAGRUlSZKkjkyqJUmSpI5MqiVJkqSOTKolSZKkjjpP/iJpcXKmOEmS5o93qiVJkqSOTKolSZKkjuz+IUnSFGYy6ZNdqSSBd6olSZKkzkyqJUmSpI5MqiVJkqSOTKolSZKkjkyqJUmSpI5MqiVJkqSOTKolSZKkjhynWlJvOKawJGmhzHtSnWQ1cBVwCPAHVXX5fLdB/TFdErV+5W4W0/8d5yrpu2f705w/g2PNBxNZ7elg+5yfyTUwE14n0mib12wjySHA7wE/C2wD7kqyqarum892SIvZTP5Ar185Dw1h7pIFHTz8nJd0sJrvW3gnA1uq6kGAJDcCawA/bA8yJmva0+TvxPqVu6e8C++dvJHg5/x+msk1MBPTXSd+uyQdGKmq+Xuz5G3A6qr6xbb+b4BTquqdQ3XWAeva6quBb8xbA/ftGODbC92IA8TYRk9f44LFGdtPVNWPLXQjRsFMPudbedfP+sX4ezJX+hwb9Du+PscG/Y/v1VX1iv3defF0Nm2qagOwYaHbsackX6yqVQvdjgPB2EZPX+OCfsemF3T9rO/z70mfY4N+x9fn2ODgiK/L/vM9pN524Pih9eNamSSpH/ycl3RQmu+k+i5gRZLlSQ4DzgU2zXMbJEkHjp/zkmxx+S0AAAVnSURBVA5K89r9o6p2J3kn8BkGQy1dV1X3zmcbOlh0XVLmkLGNnr7GBf2Orffm8XO+z78nfY4N+h1fn2MD49uneX1QUZIkSeojpymXJEmSOjKpliRJkjoyqZ6hJOuTVJJj2nqSXJ1kS5KvJjlpods4W0n+Y5Kvt/Z/IsmSoW2XtNi+keSMhWzn/kiyurV9S5KLF7o9XSQ5Psnnk9yX5N4k727lRyXZnOSB9nrkQrd1fyQ5JMmXk/x5W1+e5M527m5qD7tJQL+ubej/9Q39vsaTLElyS/tben+SN/Tl3CX5lfY7+bUkH0vyklE+d0muS/JEkq8Nle31XO1vjmdSPQNJjgdOB745VHwmsKL9rAOuWYCmdbUZeG1V/WPg/wMuAUhyIoMn9l8DrAY+ksHUwyMhL0yTfCZwIvDzLaZRtRtYX1UnAqcCF7V4LgZuq6oVwG1tfRS9G7h/aP2DwJVVdQKwE7hwQVqlRaeH1zb0//qGfl/jVwGfrqqfAl7HIM6RP3dJlgK/DKyqqtcyeOj4XEb73P0Rg5xm2FTnar9yPJPqmbkS+DVg+KnONcD1NXAHsCTJsQvSuv1UVZ+tqt1t9Q4G48nCILYbq+q5qnoI2MJg6uFR8f1pkqvqH4DJaZJHUlU9WlVfasvfYfChvZRBTBtbtY3A2QvTwv2X5DjgLcAftPUAbwJuaVVGMi4dML26tqHf1zf0+xpPcgTwM8C1AFX1D1X1FD05dwxGiHtpkkOBlwGPMsLnrqr+CtixR/FU52q/cjyT6mkkWQNsr6q/3mPTUuCRofVtrWxU/VvgL9ryqMc26u2fUpJlwE8DdwJjVfVo2/QYMLZAzeridxj8h/V/tfWjga
eG/rPXm3OnOdHbaxt6eX1Dv6/x5cC3gD9s3Vv+IMnh9ODcVdV24EMMvqF/FHgauJv+nLtJU52r/fqsMakGkvxl6zO0588a4L3A/7PQbdxf08Q2Wec3GHwFecPCtVTTSfJy4E+B91TV3w1vq8HYmCM1PmaStwJPVNXdC90WaaH17fqGg+IaPxQ4Cbimqn4aeIY9unqM8Lk7ksHd2uXAq4DD+eGuE70yF+dqXid/Wayq6s17K0+yksEv1F8PvrHiOOBLSU5mRKbinSq2SUnOB94KnFYvDFo+ErHtw6i3/4ckeRGDP7g3VNXHW/HjSY6tqkfb11JPLFwL98sbgZ9LchbwEuBHGfRPXJLk0HY3ZOTPneZU765t6O31Df2/xrcB26rqzrZ+C4Okug/n7s3AQ1X1LYAkH2dwPvty7iZNda7267PGO9X7UFX3VNWPV9WyqlrG4AI6qaoeYzDt7jvaE6KnAk8PfYUwEpKsZvC13M9V1bNDmzYB5yZ5cZLlDDrqf2Eh2rifejVNcuuDeC1wf1X9p6FNm4C1bXktcOt8t62Lqrqkqo5r19a5wOeq6jzg88DbWrWRi0sHVK+ubejv9Q39v8ZbLvBIkle3otOA++jBuWPQ7ePUJC9rv6OTsfXi3A2Z6lztV47njIqzkGQrgydhv91+yX6XwdchzwIXVNUXF7J9s5VkC/Bi4MlWdEdV/VLb9hsM+lnvZvB15F/s/SiLU7sz8ju8ME3yZQvcpP2W5J8C/w24hxf6Jb6XQb/Lm4F/BDwMnFNVez6EMRKSjAP/V1W9NclPMngA7Sjgy8AvVNVzC9k+LR59urbh4Li+ob/XeJLXM3gI8zDgQeACBjcsR/7cJfkt4F8xyAO+DPwig37FI3nuknwMGAeOAR4HLgX+jL2cq/3N8UyqJUmSpI7s/iFJkiR1ZFItSZIkdWRSLUmSJHVkUi1JkiR1ZFItSZIkdWRSLUmSJHVkUi1JkiR19P8DcxZyk6YkUD8AAAAASUVORK5CYII=\n",
|
||
"text/plain": [
|
||
"<Figure size 864x288 with 2 Axes>"
|
||
]
|
||
},
|
||
"metadata": {
|
||
"needs_background": "light"
|
||
},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"df[(df['Quantity']>-50) & \n",
|
||
" (df['Quantity']<50) & \n",
|
||
" (df['UnitPrice']>0) & \n",
|
||
" (df['UnitPrice']<100)][['Quantity', 'UnitPrice']].hist(figsize=[12,4], bins=30)\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 20,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAs4AAAEICAYAAABPtXIYAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAeUklEQVR4nO3df5Rkd1nn8ffHhAAmOhGiQ8hEJzgxuzHjrtIngXV/TBR0YjIEPSwmxtWwMXPi2bi6O6sbwF3RNWejS1YJiXJGEgc0ZshGxBkyCOraB1HAEH9sQkLWMQxkQkgIwkgPCAw++0fdhtpmuvtWV1VX1+3365ycdN1bde/zzK377ae/9/u9N1WFJEmSpKV9xaQDkCRJkqaBhbMkSZLUgoWzJEmS1IKFsyRJktSChbMkSZLUgoWzJEmS1IKFs9alJHNJnjPpOCRJkOR1Sf7LCLd3RZJ3jGp70jwLZ62aJFcmuS/Jp5N8NMmvJNmwCvudTfIj/cuq6pSqerhZvyfJz487DknqqiSVZMuCZa9K8pttPl9V11TVf2s+ty3J4eNs6/NNp8cnk/xpkucvsb3bq+q7VpKLtBQLZ62KJLuAXwB+EtgAPA/YDLwjyVMmGJokaTq8qapOAb4WeBfw5iRZ+KYkJ656ZFo3LJw1dkm+GvhZ4Meq6veq6vNVdQh4KfAc4AcW9vou7HFIcl2Sv0nyqSQPJPnevnVXJnlXklcn+USSDya5qFl3PfAvgJubnoqbm+WVZEuSncAVwE816/cn+ckkv70gh5uSvGZc/0aS1GXzbXqSXUmeSPJYkpf1rd+T5OeTnAy8DXh20ybPJXl2/7aq6vPAG4BnAc9sfgf8SZJfSvJx4FXzvxf6tv/NSX4/yd8meTzJK5rlX9H3++XjSe5M8ozV+DfRdLJw1mr4Z8DTgDf3L6yqOeAA0OZy2t/QK4A30CvCfzPJ6X3rLwAeAk4DfhG4NUmq6pXAHwPXNsMzrl0Qw27gduAXm/U7gN8Etic5Fb7Ye3EZ8MbB0pYk9XkWvTb8DOAq4JYkX9P/hqo6ClwEfKRpk0+pqo/0vyfJU4ErgUeq6slm8QXAw8BG4PoF7/8q4A+A3wOeDWwB/rBZ/WPAi4F/1az7BHDLKJJVN1k4azWcBjxZVceOs+4xepfdllRV/6uqPlJV/1BVbwL+Gji/7y0fqqpfq6ov0OuJOJ1eAzqwqnoMeCfwr5tF25v4713J9iRJAHwe+LnmquMBYA44Z4DPvzTJJ4FHgOcC39u37iNV9dqqOlZVn1nwuUuAj1bVjVX191X1qap6b7PuGuCVVXW4qj4LvAp4icM9tBgLZ62GJ4HTFmmITm/WLynJDyX5y2ZSyCeB8+gV5PM+Ov9DVX26+fGUIWJ+A/CDzc8/CPzGENuSpK77ArBwvspT6BXL8z6+oAPl0wzWTt9ZVadW1ddV1Xcs6Mx4ZInPnUnvquXxfAPwO32/Wx6kl8uKOl7UfRbOWg3vBj4LfF//wiSn0LskNwscBb6yb/Wz+t73DcCvAdcCz6yqU4H7gS+bFLKIWsH6twDfkuQ8er0Vt7fclyStRx+mN+G731nAh1awreXa7EE/8wi9+TSLrbuoKcjn/3taVT26ghi0Dlg4a+yq6gi9ccmvTbI9yVOSbAbupNfbfDvwl8D3JHlGkmcBP9G3iZPpNYofA2gmlJw3QAiPs3ijedz1VfX3wF3AbwF/VlUfHmB/krTevAn46SSbmgl3LwB20GtHB/U4vUl/o7pd6VuB05P8RJKnJvmqJBc0614HXN900JDka5NcOqL9qoMsnLUqquoXgVcArwY+BXyQXg/zC5rJIL8B/BVwCHgHvUZ4/rMPADfS67l+HNgK/MkAu38NvTFrn0hy03HW3wqc21yqe0vf8jc0+3KYhiQt7eeAP6V3m7hP0JukfUVV3T/ohqrqA8AdwMNNu/zs5T6zzPY+BbyQXiH/UXpzZC
5sVr8G2Efv1qifAt5Db6KhdFypWskVEWk4Ta/xzwHfvlZ7c5N8PfAB4FlV9XeTjkeSJE2Ws0Y1EVX160mO0btV3ZornJN8BfAfgb0WzZIkCexxlr5McwP+x+lNatleVUvN1pYkSeuEhbMkSZLUgpMDJUmSpBbWxBjn0047rTZv3jzpMAA4evQoJ5988qTDGIuu5tbVvKC7ua3VvO69994nq2rZJ1lqZQZt69fq92RUzG96dTk36HZ+R48e5QMf+MCK2/o1UThv3ryZ973vfZMOA4DZ2Vm2bds26TDGoqu5dTUv6G5uazWvJCt5WINaGrStX6vfk1Exv+nV5dyg2/nNzs5y4YUXrritd6iGJEmS1MJYCuckJyd5X5JLxrF9SZIkabW1KpyT3JbkiST3L1i+PclDSQ4mua5v1X+m9zhlSZIkqRPa9jjvAbb3L0hyAnALcBFwLnB5knOTvBB4AHhihHFKktYYry5KWm9aTQ6sqncm2bxg8fnAwap6GCDJXuBS4BTgZHrF9GeSHKiqf1i4zSQ7gZ0AGzduZHZ2doUpjNbc3NyaiWXUuppbV/OC7ubW1bymXZLbgEuAJ6rqvL7l24HXACcAr6+qG5pVXl2UtK4Mc1eNM4D+J6odBi6oqmsBklwJPHm8ohmgqnYDuwFmZmZqrcze7PpM0i7m1tW8oLu5dTWvDtgD3Ay8cX5B39XFF9Jr5+9Jso/e74AHgKetfpiSNBljux1dVe0Z17YlSaO31q4udv3KhPlNry7nBt3Ob25ubqjPD1M4Pwqc2fd6U7OstSQ7gB1btmwZIgxJ0hhN7Opi169MmN/06nJu0O38hv2DYJjC+R7g7CRn0SuYLwN+YJANVNV+YP/MzMzVQ8QhAbD5uruXXH/ohotXKRJp/WhzdXGYTpLlzmvw3Ja0etreju4O4N3AOUkOJ7mqqo4B1wJvBx4E7qyq9w+y8yQ7kuw+cuTIoHFLklbH0FcXq2p/Ve3csGHDSAOTpNXW9q4aly+y/ABwYKU7t8dZkta8oa8uSlJX+MhtSRLg1UVJWs7Y7qrRhpMDJWnt8OqiJC1toj3OjnuTJEnStHCohiRprByqIakrLJwlSWPl1UVJXTHRwtleCEmSJE0LxzhLksbKThJJXeFQDUnSWNlJIqkrLJwlSZKkFhzjLEmSJLXgGGdJ0ljZSSKpKxyqIUkaKztJJHWFhbMkSZLUgoWzJEmS1IKTAyVJkqQWnBwoSRorO0kkdYVDNSRJY2UniaSusHCWJEmSWjhx0gFIkjSMzdfdveT6QzdcvEqRSOo6e5wlSZKkFryrhiRJktSCd9WQJI2VnSSSusKhGpKksbKTRFJXWDhLkiRJLXhXDa0by828B2ffS5KkxdnjLEmSJLVg4SxJkiS1YOEsSZIktWDhLEmSJLXgA1AkSZKkFnwAiiRprOwkkdQVDtWQJI2VnSSSusLCWZIkSWrBwlmSJElqwcJZkiRJasHCWZIkSWrhxEkHIE2bzdfdvex7Dt1w8SpEIkmSVpM9zpIkSVILFs6SJElSCxbOkiRJUgsWzpIkSVILIy+ck/zjJK9LcleSHx319iVJkqRJaFU4J7ktyRNJ7l+wfHuSh5IcTHIdQFU9WFXXAC8Fvn30IUuSJs1OEknrUdse5z3A9v4FSU4AbgEuAs4FLk9ybrPuRcDdwIGRRSpJGis7SSRpaa3u41xV70yyecHi84GDVfUwQJK9wKXAA1W1D9iX5G7gt463zSQ7gZ0AGzduZHZ2diXxj9zc3NyaiWXUuprbfF67th4beltt/n3a7GdU/85dP2Zac/YANwNvnF/Q10nyQuAwcE+SfVX1QNNJ8qPAb0wg1ta897qkURnmAShnAI/0vT4MXJBkG/B9wFNZose5qnYDuwFmZmZq27ZtQ4QyOrOzs6yVWEatq7nN53Vli1+Oyzl0xbZl39NmP22200bXj5nWlrXWSTI3N8eurV8YMIuVmcQfcl3/A7LL+XU5N+h2fnNzc0
N9fuRPDqyqWWC2zXuT7AB2bNmyZdRhSJJGY2KdJLOzs9z4rqODR7wCo/pjdxBd/wOyy/l1OTfodn7D/kEwTOH8KHBm3+tNzbLWqmo/sH9mZubqIeKQJK0yO0kkrUfDFM73AGcnOYtewXwZ8AMjiUqakDZjIaV1xk4SSWq0vR3dHcC7gXOSHE5yVVUdA64F3g48CNxZVe8fZOdJdiTZfeTIkUHjliStji92kiQ5iV4nyb4JxyRJE9GqcK6qy6vq9Kp6SlVtqqpbm+UHquqbquobq+r6QXdeVfuraueGDRsG/agkacTsJJGkpY18cqAkaTpV1eWLLD/AEPfld6iGpK4Y+SO3B2EvhCRJkqbFRAtnh2pIUvfZSSKpKyZaOEuSus9OEkld4VANSZIkqYWJTg50wog0nDb3nT50w8WrEIm0OB+AIqkrHKohSRorh2pI6goLZ0mSJKmFiQ7V8PKdJGktcNiTpDa8HZ0kaaycCC6pKxyqIUkaKztJJHWFhbMkSZLUgoWzJEmS1IIPQJEkSZJacHKgJGms7CSR1BUO1ZAkjZWdJJK6wsJZkiRJasHCWZIkSWrBwlmSJElqwbtqSJIkSS14Vw1J0ljZSSKpKxyqIUkaKztJJHWFhbMkSZLUgoWzJEmS1IKFsyRJktSChbMkSZLUgoWzJEmS1IKFsyRJktSCD0CRJEmSWvABKJKksbKTRFJXOFRDkjRWdpJI6goLZ0mSJKkFC2dJkiSpBQtnSZIkqYUTJx2AJA1q83V3L/ueQzdcvAqRSJLWE3ucJUmSpBYsnCVJkqQWLJwlSZKkFhzjLEnSiDj+Xuo2C2dJrVgQSJLWu7EUzkleDFwMfDVwa1W9Yxz7kSRJklZL6zHOSW5L8kSS+xcs357koSQHk1wHUFVvqaqrgWuA7x9tyJKkSUvy4iS/luRNSb5r0vFI0moYpMd5D3Az8Mb5BUlOAG4BXggcBu5Jsq+qHmje8tPNeknSGpfkNuAS4ImqOq9v+XbgNcAJwOur6oaqegvwliRfA7wa6PyVxTbDlSR1W+vCuaremWTzgsXnAwer6mGAJHuBS5M8CNwAvK2q/vx420uyE9gJsHHjRmZnZwcOfhzm5ubWTCyj1tXc5vPatfXYpEP5otfe/rvLvmfrGRuWfc9yx6xNzqM65qPc17DfxdXMe53Zgx0kkrSoYcc4nwE80vf6MHAB8GPAC4ANSbZU1esWfrCqdgO7AWZmZmrbtm1DhjIas7OzrJVYRm2ac1uqp2fX1i9w47uOMm1zXQ9dsW3Z97z29t9tclvM8jm32U8bV7aZHNhyX8N+F0cZi75k1B0kzftX3EkyNzfHrq1fGCSFqTD/b9DVzox5Xc6vy7lBt/Obm5sb6vNjqTSq6ibgpuXel2QHsGPLli3jCEOSNLwVd5DAcJ0ks7Ozy/zhOJ3m/6ib5s6MNrqcX5dzg27nN+wfBMM+AOVR4My+15uaZa1U1f6q2rlhw/KXrCVJa0dV3VRVz62qaxYrmucl2ZFk95EjR1YrPEkai2EL53uAs5OcleQk4DJg3/BhSZLWiKE6SMBOEkndMcjt6O4A3g2ck+Rwkquq6hhwLfB24EHgzqp6/wDbtBdCktY2O0gkqdG6cK6qy6vq9Kp6SlVtqqpbm+UHquqbquobq+r6QXZuL4QkrR3j6CBptmsniaROmK7bEEiSxqaqLl9k+QHgwBDb3Q/sn5mZuXql25CktWDYMc5DsRdCkiRJ02KihbNDNSSp++wkkdQVDtWQJqTN43t3bV2FQKQxc6iGpK6YaOHsA1CktaFNES9J0nrnUA1J0lg5VENSV0y0cJYkdZ+dJJK6wsJZkiRJasHb0UmSJEktOMZZkjRWdpJI6gqHakiSxspOEkld4X2cJUlaRfO3f9y19RhXLnIryEM3XLyaIUlqyR5nSZIkqQUnB0qSJEktODlQkjRWdpJI6gqHakiSxspOEkldYeEsSZIkteBdNSStGu8mIEmaZvY4S5IkSS
1YOEuSJEktTHSoRpIdwI4tW7ZMMgxN2OZFLtlL6gbbekldMdHCuar2A/tnZmaunmQckqTxsa0f3HIdCs4FkCbDoRqSJElSCxbOkiRJUgsWzpIkSVILFs6SJElSCxbOkiRJUgsWzpIkSVILEy2ck+xIsvvIkSOTDEOSJEla1kQL56raX1U7N2zYMMkwJEljZCeJpK5wqIYkaazsJJHUFRbOkiRJUgsTfeS2um+5x8ZKkiRNC3ucJUmSpBYsnCVJkqQWHKohSVIHtRkqd+iGi1chEqk77HGWJEmSWrBwliRJklpwqIYkSeuUwzmkwdjjLEmSJLUw8sI5yXOS3JrkrlFvW5IkSZqUVoVzktuSPJHk/gXLtyd5KMnBJNcBVNXDVXXVOIKVJK0NdpJIWo/a9jjvAbb3L0hyAnALcBFwLnB5knNHGp0kadXYSSJJS2tVOFfVO4G/XbD4fOBg03h+DtgLXDri+CRJq2cPdpJI0qJSVe3emGwG3lpV5zWvXwJsr6ofaV7/G+AC4GeA64EXAq+vqv++yPZ2AjsBNm7c+Ny9e/cOlciozM3Nccopp0w6jLGYRG73PXpk7PvY+HR4/DNj381ETFtuW8/YsOT6+e/DUnktt43+7QwTy/FceOGF91bVzMAf7JDjtPXPB15VVd/dvH45wHzbnuSuqnrJEttbcVs/NzfHB498YWWJTIFhzu9RnSej2tfx+Pt0enU5v7m5OXbs2LHitn7kt6Orqo8D17R4325gN8DMzExt27Zt1KGsyOzsLGslllGbRG5XtrjV0bB2bT3Gjfd1886K05bboSu2Lbl+/vuwVF7LbaN/O8PEotbOAB7pe30YuCDJM+l1knxrkpcv1kkyTFs/OzvLje86utK417xhzu9RnSej2tfx+Pt0enU5v9nZ2aE+P8xv5EeBM/teb2qWtZZkB7Bjy5YtQ4ShSWlz/09J3dS2kwRs6yV1xzC3o7sHODvJWUlOAi4D9g2ygaraX1U7N2xY2WUgSdLYDd1JYlsvqSva3o7uDuDdwDlJDie5qqqOAdcCbwceBO6sqvcPsvMkO5LsPnJk/ONgJUkrMnQniSR1RauhGlV1+SLLDwAHVrrzqtoP7J+Zmbl6pduQJI1G00myDTgtyWHgZ6rq1iTznSQnALetpJMEh2pMLR/LLX3J9Mw6kiSNlZ0kkrS0kT9yexAO1ZAkSdK0mGjh7IQRSeo+O0kkdcVEC2dJUvfZSSKpKyycJUmSpBYmOjnQmdaS1H229aPnA6ikyXCMsyRprGzrJXWFQzUkSZKkFiycJUmSpBYc4zxlfIKTpGljWy+pKxzjLEkaK9t6SV3hUA1JkiSpBQtnSZIkqQULZ0mSJKkFJwdKksbKtl6j4gR5TZqTAyVJY2VbL6krHKohSZIktWDhLEmSJLVg4SxJkiS1YOEsSZIktTC1d9VYzZm1zuKV2mlzrmj98a4akrrCu2pIksbKtl5SVzhUQ5IkSWrBwlmSJElqwcJZkiRJasHCWZIkSWrBwlmSJElqwcJZkiRJasHCWZIkSWphah+AIkmaDrb13Xe8hx/t2nqMK5vlq/mQMB9apnHyASiSpLGyrZfUFQ7VkCRJklqwcJYkSZJasHCWJEmSWrBwliRJklqwcJYkSZJasHCWJEmSWrBwliRJklqwcJYkSZJasHCWJEmSWrBwliRJklo4cdQbTHIy8CvA54DZqrp91PuQJE2Wbb2k9ahVj3OS25I8keT+Bcu3J3koycEk1zWLvw+4q6quBl404nglSWNiWy9JS2s7VGMPsL1/QZITgFuAi4BzgcuTnAtsAh5p3vaF0YQpSVoFe7Ctl6RFparavTHZDLy1qs5rXj8feFVVfXfz+uXNWw8Dn6iqtybZW1WXLbK9ncBOgI0bNz537969AwV+36NHln3P1jM2DLRNgLm5OU455ZRV2ddKtIllMRufDo9/pvfzKOIdJpZR6s+ra7qa21J5tflujuucvPDCC++tqpmBP9gha6
mtn5ub44NHuluTd/X8njdofqM691djXxufDl/3jNX5vT8qg/zbDdtGj8pyMa+0ztuxY8eK2/phxjifwZd6G6DXiF4A3ATcnORiYP9iH66q3cBugJmZmdq2bdtAO7/yuruXfc+hKwbbJsDs7CwLYxnXvlaiTSyL2bX1GDfe1zvko4h3mFhGqT+vrulqbkvl1ea7uZbOyXVgYm397OwsN77r6ApCng5dPb/nDZrfqM791djXrq3HeOmAdcukDfJvN2wbPSrLxbzSOm8YIz9jq+oo8LI2702yA9ixZcuWUYchSRoj23pJ69Ewt6N7FDiz7/WmZllrVbW/qnZu2DBdlzskaR2xrZekxjCF8z3A2UnOSnIScBmwbzRhSZLWCNt6SWq0vR3dHcC7gXOSHE5yVVUdA64F3g48CNxZVe8fZOdJdiTZfeTI2phkJknrmW29JC2t1Rjnqrp8keUHgAMr3XlV7Qf2z8zMXL3SbUiSRsO2XpKW5iO3JUmSpBYmWjh7+U6Sus+2XlJXTLRwdqa1JHWfbb2krmj95MCxBpF8DPjQpONonAY8OekgxqSruXU1L+hubms1r2+oqq+ddBBdtYK2fq1+T0bF/KZXl3ODbud3GnDyStv6NVE4ryVJ3tfVR+52Nbeu5gXdza2reWm0uv49Mb/p1eXcoNv5DZubkwMlSZKkFiycJUmSpBYsnL/c7kkHMEZdza2reUF3c+tqXhqtrn9PzG96dTk36HZ+Q+XmGGdJkiSpBXucJUmSpBYsnCVJkqQWLJwXSLIrSSU5rXmdJDclOZjk/yT5tknHOIgk/yPJB5rYfyfJqX3rXt7k9VCS755knCuVZHsT/8Ek1006npVKcmaSP0ryQJL3J/nxZvkzkvx+kr9u/v81k451pZKckOQvkry1eX1Wkvc2x+5NSU6adIxaO7pybsP6OL+h2+d4klOT3NX8Pn0wyfO7cvyS/Ifme3l/kjuSPG2aj12S25I8keT+vmXHPVYrqfEsnPskORP4LuDDfYsvAs5u/tsJ/OoEQhvG7wPnVdW3AP8XeDlAknOBy4BvBrYDv5LkhIlFuQJNvLfQO0bnApc3eU2jY8CuqjoXeB7w75pcrgP+sKrOBv6weT2tfhx4sO/1LwC/VFVbgE8AV00kKq05HTu3YX2c39Dtc/w1wO9V1T8C/gm9PKf++CU5A/j3wExVnQecQK82mOZjt4deXdNvsWM1cI1n4fz/+yXgp4D+GZOXAm+snvcApyY5fSLRrUBVvaOqjjUv3wNsan6+FNhbVZ+tqg8CB4HzJxHjEM4HDlbVw1X1OWAvvbymTlU9VlV/3vz8KXqN8hn08nlD87Y3AC+eTITDSbIJuBh4ffM6wHcAdzVvmdrcNBadObeh++c3dPscT7IB+JfArQBV9bmq+iTdOX4nAk9PciLwlcBjTPGxq6p3An+7YPFix2rgGs/CuZHkUuDRqvqrBavOAB7pe324WTaN/i3wtubnLuTVhRy+TJLNwLcC7wU2VtVjzaqPAhsnFNawfpneH6X/0Lx+JvDJvj/qOnHsNDKdPLehs+c3dPscPwv4GPDrzVCU1yc5mQ4cv6p6FHg1vSvtjwFHgHvpzrGbt9ixGritWVeFc5I/aMbwLPzvUuAVwH+ddIwrsUxe8+95Jb3LhbdPLlItJ8kpwG8DP1FVf9e/rnr3jpy6+0cmuQR4oqrunXQs0iR18fyGdXGOnwh8G/CrVfWtwFEWDMuY1uPXjPW9lN4fB88GTubLhzl0yrDH6sQRxrLmVdULjrc8yVZ6X5q/6l1dYhPw50nOBx4Fzux7+6Zm2ZqxWF7zklwJXAJ8Z33pxt1rPq8WupDDFyV5Cr1fqrdX1ZubxY8nOb2qHmsuHz0xuQhX7NuBFyX5HuBpwFfTGy94apITm16NqT52GrlOndvQ6fMbun+OHwYOV9V7m9d30Sucu3D8XgB8sKo+BpDkzfSOZ1eO3bzFjtXAbc266nFeTFXdV1VfV1Wbq2ozvZ
Pk26rqo8A+4IeamZfPA470dfeveUm207t89qKq+nTfqn3AZUmemuQsegPj/2wSMQ7hHuDsZvbvSfQmNOybcEwr0owHvBV4sKr+Z9+qfcAPNz//MPC7qx3bsKrq5VW1qTm3LgP+d1VdAfwR8JLmbVOZm8amM+c2dPv8hu6f400t8EiSc5pF3wk8QDeO34eB5yX5yuZ7Op9bJ45dn8WO1cA1nk8OPI4kh+jNMH2y+SLdTO/SxaeBl1XV+yYZ3yCSHASeCny8WfSeqrqmWfdKeuOej9G7dPi2429l7Wp6OH6Z3kzg26rq+gmHtCJJ/jnwx8B9fGmM4CvojYO8E/h64EPAS6tq4aSHqZFkG/CfquqSJM+hN+nrGcBfAD9YVZ+dZHxaO7pybsP6Ob+hu+d4kn9Kb+LjScDDwMvodT5O/fFL8rPA99OrBf4C+BF643yn8tgluQPYBpwGPA78DPAWjnOsVlLjWThLkiRJLThUQ5IkSWrBwlmSJElqwcJZkiRJasHCWZIkSWrBwlmSJElqwcJZkiRJasHCWZIkSWrh/wHCTZ0L7i2IJgAAAABJRU5ErkJggg==\n",
|
||
"text/plain": [
|
||
"<Figure size 864x288 with 2 Axes>"
|
||
]
|
||
},
|
||
"metadata": {
|
||
"needs_background": "light"
|
||
},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"df[(df['Quantity']>-50) & \n",
|
||
" (df['Quantity']<50) & \n",
|
||
" (df['UnitPrice']>0) & \n",
|
||
" (df['UnitPrice']<100)][['Quantity', 'UnitPrice']].hist(figsize=[12,4], bins=30, log=True)\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 21,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Country</th>\n",
|
||
" <th>CustomerID</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>StockCode</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>1228</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>15485.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>22086</td>\n",
|
||
" <td>2.55</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1237</th>\n",
|
||
" <td>Norway</td>\n",
|
||
" <td>12433.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>22444</td>\n",
|
||
" <td>1.06</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1286</th>\n",
|
||
" <td>Norway</td>\n",
|
||
" <td>12433.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>84050</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1293</th>\n",
|
||
" <td>Norway</td>\n",
|
||
" <td>12433.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>22197</td>\n",
|
||
" <td>0.85</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1333</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>18144.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>84879</td>\n",
|
||
" <td>1.69</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14784</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>15061.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>22423</td>\n",
|
||
" <td>10.95</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14785</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>15061.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>22075</td>\n",
|
||
" <td>1.45</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14788</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>15061.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>17038</td>\n",
|
||
" <td>0.07</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14974</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14739.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>21704</td>\n",
|
||
" <td>0.72</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14980</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14739.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>22178</td>\n",
|
||
" <td>1.06</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>\n",
|
||
"<p>258 rows × 8 columns</p>"
|
||
],
|
||
"text/plain": [
|
||
" Country CustomerID ... StockCode UnitPrice\n",
|
||
"1228 United Kingdom 15485.0 ... 22086 2.55\n",
|
||
"1237 Norway 12433.0 ... 22444 1.06\n",
|
||
"1286 Norway 12433.0 ... 84050 1.25\n",
|
||
"1293 Norway 12433.0 ... 22197 0.85\n",
|
||
"1333 United Kingdom 18144.0 ... 84879 1.69\n",
|
||
"... ... ... ... ... ...\n",
|
||
"14784 United Kingdom 15061.0 ... 22423 10.95\n",
|
||
"14785 United Kingdom 15061.0 ... 22075 1.45\n",
|
||
"14788 United Kingdom 15061.0 ... 17038 0.07\n",
|
||
"14974 United Kingdom 14739.0 ... 21704 0.72\n",
|
||
"14980 United Kingdom 14739.0 ... 22178 1.06\n",
|
||
"\n",
|
||
"[258 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 21,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df.query('Quantity>50 & UnitPrice<100')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Arithmetic Operations"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Numeric columns support element-wise arithmetic. Here we multiply `Quantity` by `UnitPrice`."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 22,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"1000 1\n",
|
||
"1001 1\n",
|
||
"1002 1\n",
|
||
"1003 1\n",
|
||
"1004 12\n",
|
||
"Name: Quantity, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 22,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df['Quantity'].head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 23,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"1000 1.25\n",
|
||
"1001 1.25\n",
|
||
"1002 1.25\n",
|
||
"1003 1.25\n",
|
||
"1004 0.29\n",
|
||
"Name: UnitPrice, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 23,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df['UnitPrice'].head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 24,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"product = df['Quantity'] * df['UnitPrice']"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 25,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"1000 1.25\n",
|
||
"1001 1.25\n",
|
||
"1002 1.25\n",
|
||
"1003 1.25\n",
|
||
"1004 3.48\n",
|
||
"dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 25,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"product.head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"String columns can be concatenated with `+`. Here we join `Country` and `StockCode`."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 26,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"1000 United Kingdom21123\n",
|
||
"1001 United Kingdom21124\n",
|
||
"1002 United Kingdom21122\n",
|
||
"1003 United Kingdom84378\n",
|
||
"1004 United Kingdom21985\n",
|
||
" ... \n",
|
||
"14995 United Kingdom72349B\n",
|
||
"14996 United Kingdom72741\n",
|
||
"14997 United Kingdom22762\n",
|
||
"14998 United Kingdom21773\n",
|
||
"14999 United Kingdom22149\n",
|
||
"Length: 15000, dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 26,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df['Country'] + df['StockCode']"
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": "Python 3",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.7.5"
|
||
},
|
||
"pycharm": {
|
||
"stem_cell": {
|
||
"cell_type": "raw",
|
||
"metadata": {
|
||
"collapsed": false
|
||
},
|
||
"source": []
|
||
}
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 2
|
||
}
|