mirror of
https://github.com/elastic/eland.git
synced 2025-07-11 00:02:14 +08:00
1454 lines
64 KiB
Plaintext
{
|
||
"cells": [
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"import eland as ed\n",
|
||
"import pandas as pd\n",
|
||
"import numpy as np\n",
|
||
"import matplotlib.pyplot as plt\n",
|
||
"\n",
|
||
"# Fix console size for consistent test results\n",
|
||
"from eland.conftest import *"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Online Retail Analysis"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Getting Started\n",
|
||
"\n",
"To get started, let's create an `eland.DataFrame` by reading a CSV file. This creates and populates the \n",
"`online-retail` index in the local Elasticsearch cluster."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 2,
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"df = ed.read_csv(\"data/online-retail.csv.gz\",\n",
|
||
" es_client='localhost', \n",
|
||
" es_dest_index='online-retail', \n",
|
||
" es_if_exists='replace', \n",
|
||
" es_dropna=True,\n",
|
||
" es_refresh=True,\n",
|
||
" compression='gzip',\n",
|
||
" index_col=0)"
|
||
]
|
||
},
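{
"cell_type": "markdown",
"metadata": {},
"source": [
"The call above hides the indexing step. As a rough sketch only, and not eland's actual implementation, the cell below shows how CSV rows could be turned into bulk-style action dictionaries (`_index`, `_id`, `_source`), with the first column used as the document id (as `index_col=0` suggests) and empty values dropped (loosely mirroring `es_dropna=True`). The sample rows and the `csv_to_actions` helper are hypothetical, and all values stay strings because no type inference is attempted."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import csv\n",
"\n",
"# Hypothetical sample rows; the second data row has an empty CustomerID.\n",
"SAMPLE_ROWS = [\n",
"    ',Country,CustomerID,Quantity',\n",
"    '1000,United Kingdom,14729.0,6',\n",
"    '1001,Germany,,3',\n",
"]\n",
"\n",
"def csv_to_actions(lines, index_name, drop_empty=True):\n",
"    # Illustrative helper (an assumption, not an eland function).\n",
"    reader = csv.reader(lines)\n",
"    header = next(reader)\n",
"    actions = []\n",
"    for row in reader:\n",
"        # The first column plays the role of the document id.\n",
"        doc_id, values = row[0], row[1:]\n",
"        source = dict(zip(header[1:], values))\n",
"        if drop_empty:\n",
"            # Skip fields with no value instead of indexing empty strings.\n",
"            source = {k: v for k, v in source.items() if v != ''}\n",
"        actions.append({'_index': index_name, '_id': doc_id, '_source': source})\n",
"    return actions\n",
"\n",
"actions = csv_to_actions(SAMPLE_ROWS, 'online-retail')\n",
"for action in actions:\n",
"    print(action)"
]
},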
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Here we see that the `\"_id\"` field was used to index our data frame. "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"'_id'"
|
||
]
|
||
},
|
||
"execution_count": 3,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df.index.es_index_field"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"Next, we can check which fields from Elasticsearch are available to our eland data frame. `columns` is available as a parameter when instantiating the data frame, which lets you include only a subset of the index's fields. Since we didn't set this parameter, we have access to all fields."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Index(['Country', 'CustomerID', 'Description', 'InvoiceDate', 'InvoiceNo', 'Quantity', 'StockCode',\n",
|
||
" 'UnitPrice'],\n",
|
||
" dtype='object')"
|
||
]
|
||
},
|
||
"execution_count": 4,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df.columns"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"Now, let's see the data types of our fields. Running `df.dtypes`, we can see that Elasticsearch field types are mapped to pandas dtypes."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 5,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Country object\n",
|
||
"CustomerID float64\n",
|
||
"Description object\n",
|
||
"InvoiceDate object\n",
|
||
"InvoiceNo object\n",
|
||
"Quantity int64\n",
|
||
"StockCode object\n",
|
||
"UnitPrice float64\n",
|
||
"dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 5,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df.dtypes"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"eland also offers an `.info_es()` data frame method that shows information about the underlying index, along with the operations that data frame methods pass to Elasticsearch. More on this later."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 6,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"es_index_pattern: online-retail\n",
|
||
"Index:\n",
|
||
" es_index_field: _id\n",
|
||
" is_source_field: False\n",
|
||
"Mappings:\n",
|
||
" capabilities:\n",
|
||
" es_field_name is_source es_dtype es_date_format pd_dtype is_searchable is_aggregatable is_scripted aggregatable_es_field_name\n",
|
||
"Country Country True keyword None object True True False Country\n",
|
||
"CustomerID CustomerID True double None float64 True True False CustomerID\n",
|
||
"Description Description True keyword None object True True False Description\n",
|
||
"InvoiceDate InvoiceDate True keyword None object True True False InvoiceDate\n",
|
||
"InvoiceNo InvoiceNo True keyword None object True True False InvoiceNo\n",
|
||
"Quantity Quantity True long None int64 True True False Quantity\n",
|
||
"StockCode StockCode True keyword None object True True False StockCode\n",
|
||
"UnitPrice UnitPrice True double None float64 True True False UnitPrice\n",
|
||
"Operations:\n",
|
||
" tasks: []\n",
|
||
" size: None\n",
|
||
" sort_params: None\n",
|
||
" _source: ['Country', 'CustomerID', 'Description', 'InvoiceDate', 'InvoiceNo', 'Quantity', 'StockCode', 'UnitPrice']\n",
|
||
" body: {}\n",
|
||
" post_processing: []\n",
|
||
"\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(df.info_es())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Selecting and Indexing Data\n",
|
||
"\n",
"Now that we understand how to create a data frame and access its underlying attributes, let's see how we can select subsets of our data."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"### head and tail\n",
|
||
"\n",
"Much like pandas, eland data frames offer `.head(n)` and `.tail(n)` methods that return the first and last n rows, respectively."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 7,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Country</th>\n",
|
||
" <th>CustomerID</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>StockCode</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>1000</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14729.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>21123</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1001</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14729.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>21124</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>\n",
|
||
"<p>2 rows × 8 columns</p>"
|
||
],
|
||
"text/plain": [
|
||
" Country CustomerID ... StockCode UnitPrice\n",
|
||
"1000 United Kingdom 14729.0 ... 21123 1.25\n",
|
||
"1001 United Kingdom 14729.0 ... 21124 1.25\n",
|
||
"\n",
|
||
"[2 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 7,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df.head(2)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"es_index_pattern: online-retail\n",
|
||
"Index:\n",
|
||
" es_index_field: _id\n",
|
||
" is_source_field: False\n",
|
||
"Mappings:\n",
|
||
" capabilities:\n",
|
||
" es_field_name is_source es_dtype es_date_format pd_dtype is_searchable is_aggregatable is_scripted aggregatable_es_field_name\n",
|
||
"Country Country True keyword None object True True False Country\n",
|
||
"CustomerID CustomerID True double None float64 True True False CustomerID\n",
|
||
"Description Description True keyword None object True True False Description\n",
|
||
"InvoiceDate InvoiceDate True keyword None object True True False InvoiceDate\n",
|
||
"InvoiceNo InvoiceNo True keyword None object True True False InvoiceNo\n",
|
||
"Quantity Quantity True long None int64 True True False Quantity\n",
|
||
"StockCode StockCode True keyword None object True True False StockCode\n",
|
||
"UnitPrice UnitPrice True double None float64 True True False UnitPrice\n",
|
||
"Operations:\n",
|
||
" tasks: [('tail': ('sort_field': '_doc', 'count': 2)), ('head': ('sort_field': '_doc', 'count': 2)), ('tail': ('sort_field': '_doc', 'count': 2))]\n",
|
||
" size: 2\n",
|
||
" sort_params: _doc:desc\n",
|
||
" _source: ['Country', 'CustomerID', 'Description', 'InvoiceDate', 'InvoiceNo', 'Quantity', 'StockCode', 'UnitPrice']\n",
|
||
" body: {}\n",
|
||
" post_processing: [('sort_index'), ('head': ('count': 2)), ('tail': ('count': 2))]\n",
|
||
"\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(df.tail(2).head(2).tail(2).info_es())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 9,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Country</th>\n",
|
||
" <th>CustomerID</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>StockCode</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>14998</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>17419.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>21773</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14999</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>17419.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>22149</td>\n",
|
||
" <td>2.10</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>\n",
|
||
"<p>2 rows × 8 columns</p>"
|
||
],
|
||
"text/plain": [
|
||
" Country CustomerID ... StockCode UnitPrice\n",
|
||
"14998 United Kingdom 17419.0 ... 21773 1.25\n",
|
||
"14999 United Kingdom 17419.0 ... 22149 2.10\n",
|
||
"\n",
|
||
"[2 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 9,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df.tail(2)"
|
||
]
|
||
},
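{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `info_es()` output above shows that chained `head` and `tail` calls are recorded as a list of tasks. As a mental model only, using plain Python lists instead of an index (the `head` and `tail` helpers below are hypothetical stand-ins, not eland code), each call narrows the selection made by the previous one:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def head(rows, n):\n",
"    # First n rows of the current selection.\n",
"    return rows[:n]\n",
"\n",
"def tail(rows, n):\n",
"    # Last n rows of the current selection (guard against n == 0).\n",
"    return rows[-n:] if n > 0 else []\n",
"\n",
"rows = list(range(10))  # stand-in for ten documents in index order\n",
"\n",
"first_two = head(rows, 2)\n",
"last_two = tail(rows, 2)\n",
"chained = tail(head(tail(rows, 2), 2), 2)  # same shape as df.tail(2).head(2).tail(2)\n",
"\n",
"print(first_two)\n",
"print(last_two)\n",
"print(chained)"
]
},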
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"### Selecting Columns\n",
"\n",
"You can also pass a list of column names to select columns from the data frame in a specified order."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Country</th>\n",
|
||
" <th>InvoiceDate</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>1000</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>2010-12-01 12:43:00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1001</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>2010-12-01 12:43:00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1002</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>2010-12-01 12:43:00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1003</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>2010-12-01 12:43:00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1004</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>2010-12-01 12:43:00</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>\n",
|
||
"<p>5 rows × 2 columns</p>"
|
||
],
|
||
"text/plain": [
|
||
" Country InvoiceDate\n",
|
||
"1000 United Kingdom 2010-12-01 12:43:00\n",
|
||
"1001 United Kingdom 2010-12-01 12:43:00\n",
|
||
"1002 United Kingdom 2010-12-01 12:43:00\n",
|
||
"1003 United Kingdom 2010-12-01 12:43:00\n",
|
||
"1004 United Kingdom 2010-12-01 12:43:00\n",
|
||
"\n",
|
||
"[5 rows x 2 columns]"
|
||
]
|
||
},
|
||
"execution_count": 10,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df[['Country', 'InvoiceDate']].head(5)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"### Boolean Indexing\n",
|
||
"\n",
"We also allow you to filter the data frame using boolean indexing. Under the hood, a boolean index maps to an Elasticsearch query (for example, an equality test becomes a `term` query) that is then passed to Elasticsearch to filter the index."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 11,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"{'term': {'Country': 'Germany'}}\n"
|
||
]
|
||
},
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Country</th>\n",
|
||
" <th>CustomerID</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>StockCode</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>1109</th>\n",
|
||
" <td>Germany</td>\n",
|
||
" <td>12662.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>22809</td>\n",
|
||
" <td>2.95</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1110</th>\n",
|
||
" <td>Germany</td>\n",
|
||
" <td>12662.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>84347</td>\n",
|
||
" <td>2.55</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1111</th>\n",
|
||
" <td>Germany</td>\n",
|
||
" <td>12662.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>84945</td>\n",
|
||
" <td>0.85</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1112</th>\n",
|
||
" <td>Germany</td>\n",
|
||
" <td>12662.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>22242</td>\n",
|
||
" <td>1.65</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1113</th>\n",
|
||
" <td>Germany</td>\n",
|
||
" <td>12662.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>22244</td>\n",
|
||
" <td>1.95</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>\n",
|
||
"<p>5 rows × 8 columns</p>"
|
||
],
|
||
"text/plain": [
|
||
" Country CustomerID ... StockCode UnitPrice\n",
|
||
"1109 Germany 12662.0 ... 22809 2.95\n",
|
||
"1110 Germany 12662.0 ... 84347 2.55\n",
|
||
"1111 Germany 12662.0 ... 84945 0.85\n",
|
||
"1112 Germany 12662.0 ... 22242 1.65\n",
|
||
"1113 Germany 12662.0 ... 22244 1.95\n",
|
||
"\n",
|
||
"[5 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 11,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# the construction of a boolean vector maps directly to an elasticsearch query\n",
|
||
"print(df['Country']=='Germany')\n",
|
||
"df[(df['Country']=='Germany')].head(5)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"We can also filter the data frame using a list of values."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 12,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"{'terms': {'Country': ['Germany', 'United States']}}\n"
|
||
]
|
||
},
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Country</th>\n",
|
||
" <th>CustomerID</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>StockCode</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>1000</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14729.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>21123</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1001</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14729.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>21124</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1002</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14729.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>21122</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1003</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14729.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>84378</td>\n",
|
||
" <td>1.25</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1004</th>\n",
|
||
" <td>United Kingdom</td>\n",
|
||
" <td>14729.0</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>21985</td>\n",
|
||
" <td>0.29</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>\n",
|
||
"<p>5 rows × 8 columns</p>"
|
||
],
|
||
"text/plain": [
|
||
" Country CustomerID ... StockCode UnitPrice\n",
|
||
"1000 United Kingdom 14729.0 ... 21123 1.25\n",
|
||
"1001 United Kingdom 14729.0 ... 21124 1.25\n",
|
||
"1002 United Kingdom 14729.0 ... 21122 1.25\n",
|
||
"1003 United Kingdom 14729.0 ... 84378 1.25\n",
|
||
"1004 United Kingdom 14729.0 ... 21985 0.29\n",
|
||
"\n",
|
||
"[5 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 12,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"print(df['Country'].isin(['Germany', 'United States']))\n",
|
||
"df[df['Country'].isin(['Germany', 'United Kingdom'])].head(5)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"We can also combine boolean vectors to further filter the data frame."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 13,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Country</th>\n",
|
||
" <th>CustomerID</th>\n",
|
||
" <th>...</th>\n",
|
||
" <th>StockCode</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>\n",
|
||
"<p>0 rows × 8 columns</p>"
|
||
],
|
||
"text/plain": [
|
||
"Empty DataFrame\n",
|
||
"Columns: [Country, CustomerID, Description, InvoiceDate, InvoiceNo, Quantity, StockCode, UnitPrice]\n",
|
||
"Index: []\n",
|
||
"\n",
|
||
"[0 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 13,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df[(df['Country']=='Germany') & (df['Quantity']>90)]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"Using this example, let's see how eland translates this boolean filter to an Elasticsearch `bool` query."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 14,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"es_index_pattern: online-retail\n",
|
||
"Index:\n",
|
||
" es_index_field: _id\n",
|
||
" is_source_field: False\n",
|
||
"Mappings:\n",
|
||
" capabilities:\n",
|
||
" es_field_name is_source es_dtype es_date_format pd_dtype is_searchable is_aggregatable is_scripted aggregatable_es_field_name\n",
|
||
"Country Country True keyword None object True True False Country\n",
|
||
"CustomerID CustomerID True double None float64 True True False CustomerID\n",
|
||
"Description Description True keyword None object True True False Description\n",
|
||
"InvoiceDate InvoiceDate True keyword None object True True False InvoiceDate\n",
|
||
"InvoiceNo InvoiceNo True keyword None object True True False InvoiceNo\n",
|
||
"Quantity Quantity True long None int64 True True False Quantity\n",
|
||
"StockCode StockCode True keyword None object True True False StockCode\n",
|
||
"UnitPrice UnitPrice True double None float64 True True False UnitPrice\n",
|
||
"Operations:\n",
|
||
" tasks: [('boolean_filter': ('boolean_filter': {'bool': {'must': [{'term': {'Country': 'Germany'}}, {'range': {'Quantity': {'gt': 90}}}]}}))]\n",
|
||
" size: None\n",
|
||
" sort_params: None\n",
|
||
" _source: ['Country', 'CustomerID', 'Description', 'InvoiceDate', 'InvoiceNo', 'Quantity', 'StockCode', 'UnitPrice']\n",
|
||
" body: {'query': {'bool': {'must': [{'term': {'Country': 'Germany'}}, {'range': {'Quantity': {'gt': 90}}}]}}}\n",
|
||
" post_processing: []\n",
|
||
"\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(df[(df['Country']=='Germany') & (df['Quantity']>90)].info_es())"
|
||
]
|
||
},
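{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `body` printed above can be reproduced with plain dictionaries. The sketch below is illustrative only: the helper names `term`, `greater_than` and `all_of` are hypothetical and are not part of eland, which builds the query internally."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def term(field, value):\n",
"    # Exact-match filter, as produced by df[field] == value above.\n",
"    return {'term': {field: value}}\n",
"\n",
"def greater_than(field, value):\n",
"    # Range filter, as produced by df[field] > value above.\n",
"    return {'range': {field: {'gt': value}}}\n",
"\n",
"def all_of(*filters):\n",
"    # Combining conditions with & corresponds to a bool query with must clauses.\n",
"    return {'bool': {'must': list(filters)}}\n",
"\n",
"query = all_of(term('Country', 'Germany'), greater_than('Quantity', 90))\n",
"body = {'query': query}\n",
"print(body)"
]
},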
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Aggregation and Descriptive Statistics\n",
|
||
"\n",
|
||
"Let's begin to ask some questions of our data and use eland to get the answers."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"**How many different countries are there?**"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 15,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"16"
|
||
]
|
||
},
|
||
"execution_count": 15,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df['Country'].nunique()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"**What is the total sum of products ordered?**"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 16,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"111960.0"
|
||
]
|
||
},
|
||
"execution_count": 16,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df['Quantity'].sum()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
"**Show me the sum, mean, min, and max of the `Quantity` and `UnitPrice` fields**"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 17,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Quantity</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>sum</th>\n",
|
||
" <td>111960.000</td>\n",
|
||
" <td>61548.490000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>mean</th>\n",
|
||
" <td>7.464</td>\n",
|
||
" <td>4.103233</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>max</th>\n",
|
||
" <td>2880.000</td>\n",
|
||
" <td>950.990000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>min</th>\n",
|
||
" <td>-9360.000</td>\n",
|
||
" <td>0.000000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Quantity UnitPrice\n",
|
||
"sum 111960.000 61548.490000\n",
|
||
"mean 7.464 4.103233\n",
|
||
"max 2880.000 950.990000\n",
|
||
"min -9360.000 0.000000"
|
||
]
|
||
},
|
||
"execution_count": 17,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df[['Quantity','UnitPrice']].agg(['sum', 'mean', 'max', 'min'])"
|
||
]
|
||
},
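{
"cell_type": "markdown",
"metadata": {},
"source": [
"For intuition, an aggregation like the one above could be expressed as an Elasticsearch request body that uses the `sum`, `avg`, `max` and `min` metric aggregations. This is a hedged sketch: the `build_aggs` helper and the aggregation names it generates (such as `UnitPrice_mean`) are assumptions for illustration, and the body eland actually sends may differ."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# pandas-style function names mapped to Elasticsearch metric aggregation types.\n",
"ES_AGG_NAMES = {'sum': 'sum', 'mean': 'avg', 'max': 'max', 'min': 'min'}\n",
"\n",
"def build_aggs(fields, funcs):\n",
"    # Hypothetical helper: one named metric aggregation per (field, function) pair.\n",
"    aggs = {}\n",
"    for field in fields:\n",
"        for func in funcs:\n",
"            aggs[f'{field}_{func}'] = {ES_AGG_NAMES[func]: {'field': field}}\n",
"    # size 0 asks for aggregation results only, without returning documents.\n",
"    return {'size': 0, 'aggs': aggs}\n",
"\n",
"body = build_aggs(['Quantity', 'UnitPrice'], ['sum', 'mean', 'max', 'min'])\n",
"print(len(body['aggs']))\n",
"print(body['aggs']['UnitPrice_mean'])"
]
},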
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"**Give me descriptive statistics for the entire data frame**"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 18,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>CustomerID</th>\n",
|
||
" <th>Quantity</th>\n",
|
||
" <th>UnitPrice</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>count</th>\n",
|
||
" <td>10729.000000</td>\n",
|
||
" <td>15000.000000</td>\n",
|
||
" <td>15000.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>mean</th>\n",
|
||
" <td>15590.776680</td>\n",
|
||
" <td>7.464000</td>\n",
|
||
" <td>4.103233</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>std</th>\n",
|
||
" <td>1764.025160</td>\n",
|
||
" <td>85.924387</td>\n",
|
||
" <td>20.104873</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>min</th>\n",
|
||
" <td>12347.000000</td>\n",
|
||
" <td>-9360.000000</td>\n",
|
||
" <td>0.000000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>25%</th>\n",
|
||
" <td>14220.777549</td>\n",
|
||
" <td>1.000000</td>\n",
|
||
" <td>1.250000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>50%</th>\n",
|
||
" <td>15656.783333</td>\n",
|
||
" <td>2.000000</td>\n",
|
||
" <td>2.510000</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>75%</th>\n",
|
||
" <td>17214.162905</td>\n",
|
||
" <td>6.607062</td>\n",
|
||
" <td>4.216043</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>max</th>\n",
|
||
" <td>18239.000000</td>\n",
|
||
" <td>2880.000000</td>\n",
|
||
" <td>950.990000</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" CustomerID Quantity UnitPrice\n",
|
||
"count 10729.000000 15000.000000 15000.000000\n",
|
||
"mean 15590.776680 7.464000 4.103233\n",
|
||
"std 1764.025160 85.924387 20.104873\n",
|
||
"min 12347.000000 -9360.000000 0.000000\n",
|
||
"25% 14220.777549 1.000000 1.250000\n",
|
||
"50% 15656.783333 2.000000 2.510000\n",
|
||
"75% 17214.162905 6.607062 4.216043\n",
|
||
"max 18239.000000 2880.000000 950.990000"
|
||
]
|
||
},
|
||
"execution_count": 18,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# NBVAL_IGNORE_OUTPUT\n",
|
||
"df.describe()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"**Show me a histogram of numeric columns**"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 19,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAtUAAAEICAYAAACQ+wgHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAgAElEQVR4nO3df7RfdX3n++dLEH+gEn50jhiYJh0yutCMymQBjnN7z4iFgE7D3KUMHToELl25cxdabTNtoZ17uVVZCzsyFGxlJlOowTICpVq4laoZ9NwOMxdE1IqAXlIIkjSAmoANjNQw7/vH93Pka8zhnJN9cs757jwfa5119v7sz97fzzv77O/3nf397M8nVYUkSZKkffeihW6AJEmSNOpMqiVJkqSOTKolSZKkjkyqJUmSpI5MqiVJkqSOTKolSZKkjkyqpSFJdiX5mYVuhyRpIMm/T/J/LNbjSZNMqrXgkpyX5N4kzyR5LMnHkhw2D687keSXhsuq6hVV9VDb/vEkH9rf7ZCkPktSSY7bo+z/SvJHM9m/qv5VVX2w7TeeZOtejvXDdlPkyST/LclbZnI8aS6ZVGtBJVkPfBj4NeAw4GRgGfD5JC9ewKZJkkbHjVX1CuCngDuATyXJnpWSHDTvLdMBw6RaCybJq4DfBt5bVZ+tqh9W1RbgLOBngH+x593iPe9SJLkoyV8l+Zsk9yf5Z0PbzktyR5KPJNmZ5OEkp7dtlwL/E/B77e7G77XySnJcknXAOcCvt+3/d5JfS/Ine8RwVZIr99e/kST13eT7epL1SZ5Isj3J+UPbP57kQ0kOBf4ceE17X96V5DXDx6qqHwIbgVcDR7Z9r05yW5KngX+yl8+VNUm+luT77fNkdSs/LMk1rT3bWhtMyjUlk2otpH8EvBT41HBhVe0CbgNOncEx/opBcnwYgwT9j5IcPbT9JOBbwFHA7wDXJElV/RbwX4D3tC4f79mjDRuA64Hfadv/KfBHwOokSwCSHAycDVw3u7AlSXt4NYP38aXABcDvJzl8uEJVPQ2cDvx1e19+RVX99XCdJC8BzgMerarvtuJ/AVwKvJLBXezh+icyeA//NWAJ8LPAlrb548Bu4DjgzQw+k36sy6A0zKRaC+ko4LtVtXsv27Yz+BrvBVXVH1fVX1fV/6iqG4EHgROHqjxSVf+xqp5jcPfiaGBsXxpbVduBvwDe3YpWt/bfsy/HkyT9yA+BD7RvLG8DdgGvncX+ZyV5EngU+IfAPxvadktV/df2OfGDPfa7ALi2qja17duq6ptJxoAzgPdX1dNV9QRwBYMbKdJeHbzQDdAB7bvAUUkO3ktifXTb/oKSnAv8KoN+2ACvYJCsT3pscqGqnmld7F7Roc0bgf8d+I/ALwKf6HAsSToQPAfs+YzMixkk0pO+t8fnwDPM7r36pqr6xSm2PfoC+x3L4JvRPf10a+P2oa7ZL5rmWDrAeadaC+n/BZ4F/pfhwiSvYPAV3wTwNPDyoc2vHqr30wyS2/cAR1bVEuAbwE88nDKF2oftfwr8gyRvAN7JoIuIJGlq3+b5Gx+TlgOP7MOxpnvfnu0+jwJ/b4ryZ4GjqmpJ+3lVVb1+H15fBwiTai2YqnqKQT/ojyZZneTFSZYBNzG4S3098DXgjCRHJHk18P6hQxzK4M3yOwDtwZY3zKIJjzN4IHLG29tXhzcD/wn4UlV9exavJ0kHohuBf5PkmCQvSvJ24J8yeC+drccZPIA4V8OuXgOcn+SU1ralSV7Xuvt9Hrg8yavatr+X5H+eo9dVD5lUa0FV1e8Avwl8BPgb4GEGd6bf3h5K+QTwlwweHPk8gzfnyX3vBy5ncMf7cWAl8F9n8fJXAu9qI4NctZft1wDHt3FP/3SofGN7Lbt+SNL0PgD8NwYPCe5k8ND4OVX1jdkeqKq+CXwSeKi9N79mun2mOd6XgPMZ9Jd+Cvh/GHT9ADgXOAS4v7X7Zg
ZdE6W9StW+fJMi7R/tbvMHgLcu1rvASf4u8E3g1VX1/YVujyRJWng+qKhFpar+MMluBsPtLbqkOsmLGDwYeYMJtSRJmuSdammG2sQDjzN4uGZ1VfkUuCRJAkyqJUmSpM58UFGSJEnqaFH3qT7qqKNq2bJlC90MAJ5++mkOPfTQhW7GfmFso6evccHijO2ee+75blVNO8On9s2+vNcvxr+TudLn2KDf8fU5Nuh/fF3f6xd1Ur1s2TK+/OUvL3QzAJiYmGB8fHyhm7FfGNvo6WtcsDhjS7Ivk1RohvblvX4x/p3MlT7HBv2Or8+xQf/j6/peb/cPSZIkqSOTakmSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpo0U9o6I0F5Zd9Jlp62y57B3z0BJJc+XebU9x3jTXtte1pPnknWpJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKmjGSXVSX4lyX1JvpHkk0lemmR5kruSbE5yY5JDWt2XtPXNbfuyoeNc3Mq/leS0/ROSJEmSNL+mTaqTLAV+GVhVVW8ADgLOBj4MXFFVxwE7gQvaLhcAO1v5Fa0eSY5v+70eWA18LMlBcxuOJEmSNP9m2v3jYOBlSQ4GXg5sB94G3Ny2bwTObMtr2jpt+ylJ0spvqKpnq+phYDNwYvcQJEmSpIU17YyKVbUtyUeAbwP/Hfg8cA/wZFXtbtW2Akvb8lLg0bbv7iRPAUe28juHDj28z48kWQesAxgbG2NiYmL2Ue0Hu3btWjRtmWt9j239yuemrTdq8ff9nPU1NklSf02bVCc5nMFd5uXAk8AfM+i+sV9U1QZgA8CqVatqfHx8f73UrExMTLBY2jLX+h7b5Xc8PW29LeeM7//GzKG+n7O+xiZJ6q+ZdP94O/BwVX2nqn4IfAp4K7CkdQcBOAbY1pa3AccCtO2HAd8bLt/LPpIkSdLImklS/W3g5CQvb32jTwHuB74IvKvVWQvc0pZvbeu07V+oqmrlZ7fRQZYDK4AvzU0YkiRJ0sKZSZ/qu5LcDHwF2A18lUH3jM8ANyT5UCu7pu1yDfCJJJuBHQxG/KCq7ktyE4OEfDdwYVVN39lVkiRJWuSmTaoBquoS4JI9ih9iL6N3VNUPgHdPcZxLgUtn2UZJkiRpUXNGRUmSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpI5NqSRJJfiXJfUm+keSTSV6aZHmSu5JsTnJjkkNa3Ze09c1t+7Kh41zcyr+V5LSFikeS5ptJtSQd4JIsBX4ZWFVVbwAOYjDHwIeBK6rqOGAncEHb5QJgZyu/otUjyfFtv9cDq4GPJTloPmORpIViUi1JgsG8BS9LcjDwcmA78Dbg5rZ9I3BmW17T1mnbT2kz7q4BbqiqZ6vqYWAze5nPQJL6aEaTv0iS+quqtiX5CPBt4L8DnwfuAZ6sqt2t2lZgaVteCjza9t2d5CngyFZ+59Chh/f5MUnWAesAxsbGmJiYmFWbx14G61fufsE6sz3mYrFr166RbftM9Dm+PscG/Y+vK5NqSTrAJTmcwV3m5cCTwB8z6L6x31TVBmADwKpVq2p8fHxW+3/0+lu4/N4X/gjbcs7sjrlYTExMMNt/j1HS5/j6HBv0P76u7P4hSXo78HBVfaeqfgh8CngrsKR1BwE4BtjWlrcBxwK07YcB3xsu38s+ktRrJtWSpG8DJyd5eesbfQpwP/BF4F2tzlrglrZ8a1unbf9CVVUrP7uNDrIcWAF8aZ5ikKQFZfcPSTrAVdVdSW4GvgLsBr7KoGvGZ4AbknyolV3TdrkG+ESSzcAOBiN+UFX3JbmJQUK+G7iwqp6b12AkaYGYVEuSqKpLgEv2KH6IvYzeUVU/AN49xXEuBS6d8wZK0iJn9w9JkiSpo2mT6iSvTfK1oZ/vJ3l/kiOSbEryYPt9eKufJFe1GbW+nuSEoW
OtbfUfTLJ26leVJEmSRse0SXVVfauq3lRVbwL+IfAM8GngIuD2qloB3N7WAU5n8HDKCgZjkF4NkOQIBl8tnsTg68RLJhNxSZIkaZTNtvvHKcBfVdUj/PiMWnvOtHVdDdzJYEimo4HTgE1VtaOqdgKb2M/joEqSJEnzYbZJ9dnAJ9vyWFVtb8uPAWNt+UczbTWTM2pNVS5JkiSNtBmP/pHkEODngYv33FZVlaTmokFdp67dX/o8NWffY1u/cvoRvUYt/r6fs77GJknqr9kMqXc68JWqerytP57k6Kra3rp3PNHKp5pRaxswvkf5xJ4v0nXq2v2lz1Nz9j22y+94etp6ozadcd/PWV9jkyT112y6f/wCz3f9gB+fUWvPmbbObaOAnAw81bqJfA44Ncnh7QHFU1uZJEmSNNJmdKc6yaHAzwH/21DxZcBNSS4AHgHOauW3AWcAmxmMFHI+QFXtSPJB4O5W7wNVtaNzBJIkSdICm1FSXVVPA0fuUfY9BqOB7Fm3gAunOM61wLWzb6YkSZK0eDmjoiRJktSRSbUkSZLUkUm1JEmS1JFJtSRJktSRSbUkSZLUkUm1JEmS1JFJtSRJktSRSbUkSZLUkUm1JEmS1JFJtSRJktSRSbUkSZLUkUm1JEmS1JFJtSRJktSRSbUkSZLUkUm1JEmS1JFJtSRJktTRjJLqJEuS3Jzkm0keSPKWJEck2ZTkwfb78FY3Sa5KsjnJ15OcMHScta3+g0nW7q+gJEmSpPk00zvVVwKfrarXAW8EHgAuAm6vqhXA7W0d4HRgRftZB1wNkOQI4BLgJOBE4JLJRFySJEkaZdMm1UkOA34WuAagqv62qp4E1gAbW7WNwJlteQ1wXQ3cCSxJcjRwGrCpqnZU1U5gE7B6TqORJEmSFsDBM6izHPgO8IdJ3gjcA7wPGKuq7a3OY8BYW14KPDq0/9ZWNlX5j0myjsEdbsbGxpiYmJhpLPvVrl27Fk1b5lrfY1u/8rlp641a/H0/Z32NTZLUXzNJqg8GTgDeW1V3JbmS57t6AFBVlaTmokFVtQHYALBq1aoaHx+fi8N2NjExwWJpy1zre2yX3/H0tPW2nDO+/xszh/p+zvoamySpv2bSp3orsLWq7mrrNzNIsh9v3Tpov59o27cBxw7tf0wrm6pckiRJGmnTJtVV9RjwaJLXtqJTgPuBW4HJETzWAre05VuBc9soICcDT7VuIp8DTk1yeHtA8dRWJkmSJI20mXT/AHgvcH2SQ4CHgPMZJOQ3JbkAeAQ4q9W9DTgD2Aw80+pSVTuSfBC4u9X7QFXtmJMoJEmSpAU0o6S6qr4GrNrLplP2UreAC6c4zrXAtbNpoCRJkrTYOaOiJEmS1JFJtSRJktSRSbUkiSRLktyc5JtJHkjyliRHJNmU5MH2+/BWN0muSrI5ydeTnDB0nLWt/oNJ1k79ipLULybVkiSAK4HPVtXrgDcCDzCYk+D2qloB3M7zcxScDqxoP+uAqwGSHAFcApwEnAhcMpmIS1LfmVRL0gEuyWHAzwLXAFTV31bVk8AaYGOrthE4sy2vAa6rgTuBJW2+gtOATVW1o6p2ApuA1fMYiiQtmJkOqSdJ6q/lwHeAP0zyRuAe4H3AWJtnAOAxYKwtLwUeHdp/ayubqvwnJFnH4C43Y2Njs56afuxlsH7l7hesM6rT3e/atWtk2z4TfY6vz7FB/+PryqRaknQwg5ly31tVdyW5kue7egCD4VKT1Fy9YFVtADYArFq1qmY7Nf1Hr7+Fy+994Y+wLefM7piLxcTEBLP99xglfY6vz7FB/+Pryu4fkqStwNaququt38wgyX68deug/X6ibd8GHDu0/zGtbKpySeo9k2pJOsBV1WPAo0le24pOAe4HbgUmR/BYC9zSlm8Fzm2jgJwMPNW6iXwOODXJ4e0BxVNbmST1nt0/JEkA7wWuT3II8BBwPoMbLzcluQB4BDir1b0NOAPYDDzT6lJVO5J8ELi71ftAVe2Yvx
AkaeGYVEuSqKqvAav2sumUvdQt4MIpjnMtcO3ctk6SFj+7f0iSJEkdmVRLkiRJHZlUS5IkSR2ZVEuSJEkdmVRLkiRJHc0oqU6yJcm9Sb6W5Mut7Igkm5I82H4f3sqT5Kokm5N8PckJQ8dZ2+o/mGTtVK8nSZIkjZLZ3Kn+J1X1pqqaHHLpIuD2qloB3M7zU9qeDqxoP+uAq2GQhAOXACcBJwKXTCbikiRJ0ijr0v1jDbCxLW8Ezhwqv64G7gSWtOltTwM2VdWOqtoJbAJWd3h9SZIkaVGY6eQvBXw+SQH/oao2AGNtWlqAx4CxtrwUeHRo362tbKryH5NkHYM73IyNjTExMTHDJu5fu3btWjRtmWt9j239yuemrTdq8ff9nPU1NklSf800qf7HVbUtyd8BNiX55vDGqqqWcHfWEvYNAKtWrarx8fG5OGxnExMTLJa2zLW+x3b5HU9PW2/LOeP7vzFzqO/nrK+xSZL6a0bdP6pqW/v9BPBpBn2iH2/dOmi/n2jVtwHHDu1+TCubqlySJEkaadMm1UkOTfLKyWXgVOAbwK3A5Agea4Fb2vKtwLltFJCTgadaN5HPAacmObw9oHhqK5MkSZJG2ky6f4wBn04yWf8/VdVnk9wN3JTkAuAR4KxW/zbgDGAz8AxwPkBV7UjyQeDuVu8DVbVjziKRJEmSFsi0SXVVPQS8cS/l3wNO2Ut5ARdOcaxrgWtn30xJkiRp8XJGRUmSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpI5NqSZIkqSOTakmSJKkjk2pJkiSpI5NqSZIkqaMZJ9VJDkry1SR/1taXJ7kryeYkNyY5pJW/pK1vbtuXDR3j4lb+rSSnzXUwkiRJ0kKYzZ3q9wEPDK1/GLiiqo4DdgIXtPILgJ2t/IpWjyTHA2cDrwdWAx9LclC35kuSJEkLb0ZJdZJjgHcAf9DWA7wNuLlV2Qic2ZbXtHXa9lNa/TXADVX1bFU9DGwGTpyLICRJkqSFdPAM6/0u8OvAK9v6kcCTVbW7rW8FlrblpcCjAFW1O8lTrf5S4M6hYw7v8yNJ1gHrAMbGxpiYmJhpLPvVrl27Fk1b5lrfY1u/8rlp641a/H0/Z32NTZLUX9Mm1UneCTxRVfckGd/fDaqqDcAGgFWrVtX4+H5/yRmZmJhgsbRlrvU9tsvveHraelvOGd//jZlDfT9nfY1NktRfM7lT/Vbg55OcAbwUeBVwJbAkycHtbvUxwLZWfxtwLLA1ycHAYcD3hsonDe8jSZIkjaxp+1RX1cVVdUxVLWPwoOEXquoc4IvAu1q1tcAtbfnWtk7b/oWqqlZ+dhsdZDmwAvjSnEUiSZIkLZAu41T/BvCrSTYz6DN9TSu/Bjiylf8qcBFAVd0H3ATcD3wWuLCqpu/sKkmaFw6dKkn7bqYPKgJQVRPARFt+iL2M3lFVPwDePcX+lwKXzraRkqR5MTl06qva+uTQqTck+fcMhky9mqGhU5Oc3er98z2GTn0N8J+T/H1voEg6EDijoiTJoVMlqaNZ3amWJPXWvA2dCt2HTx17GaxfufsF64zq0Ix9H1ayz/H1OTbof3xdmVRL0gFuvodOhe7Dp370+lu4/N4X/ggbtaEyJ/V9WMk+x9fn2KD/8XVlUi1JcuhUSerIPtWSdIBz6FRJ6s471ZKkqfwGcEOSDwFf5ceHTv1EGzp1B4NEnKq6L8nk0Km7cehUSQcQk2pJ0o84dKok7Ru7f0iSJEkdmVRLkiRJHZlUS5IkSR2ZVEuSJEkdmVRLkiRJHTn6hySpl5Zd9Jlp62y57B3z0BJJBwLvVEuSJEkdmVRLkiRJHZlUS5IkSR1Nm1QneWmSLyX5yyT3JfntVr48yV1JNie5MckhrfwlbX1z275s6FgXt/JvJTltfwUlSZIkzaeZ3Kl+FnhbVb0ReBOwOsnJwI
eBK6rqOGAncEGrfwGws5Vf0eqR5HjgbOD1wGrgY0kOmstgJEmSpIUwbVJdA7va6ovbTwFvA25u5RuBM9vymrZO235KkrTyG6rq2ap6GNgMnDgnUUiSJEkLaEZD6rU7yvcAxwG/D/wV8GRV7W5VtgJL2/JS4FGAqtqd5CngyFZ+59Bhh/cZfq11wDqAsbExJiYmZhfRfrJr165F05a51vfY1q98btp6oxZ/389ZX2OTJPXXjJLqqnoOeFOSJcCngdftrwZV1QZgA8CqVatqfHx8f73UrExMTLBY2jLX+h7b5Xc8PW29LeeM7//GzKG+n7O+xiZJ6q9Zjf5RVU8CXwTeAixJMpmUHwNsa8vbgGMB2vbDgO8Nl+9lH0mSJGlkzWT0j59qd6hJ8jLg54AHGCTX72rV1gK3tOVb2zpt+xeqqlr52W10kOXACuBLcxWIJEmStFBm0v3jaGBj61f9IuCmqvqzJPcDNyT5EPBV4JpW/xrgE0k2AzsYjPhBVd2X5CbgfmA3cGHrViJJkiSNtGmT6qr6OvDmvZQ/xF5G76iqHwDvnuJYlwKXzr6ZkiRJ0uLljIqSJElSRybVkiRJUkcm1ZIkSVJHJtWSJElSRybVkiRJUkcm1ZIkSVJHM5qmXOq7ZRd9Zto6Wy57xzy0RJIkjSLvVEuSJEkdmVRLkiRJHZlUS5IkSR2ZVEuSJEkdmVRLkiRJHZlUS5IkSR2ZVEuSJEkdmVRLkiRJHU2bVCc5NskXk9yf5L4k72vlRyTZlOTB9vvwVp4kVyXZnOTrSU4YOtbaVv/BJGv3X1iSJEnS/JnJnerdwPqqOh44GbgwyfHARcDtVbUCuL2tA5wOrGg/64CrYZCEA5cAJwEnApdMJuKSJEnSKJs2qa6q7VX1lbb8N8ADwFJgDbCxVdsInNmW1wDX1cCdwJIkRwOnAZuqakdV7QQ2AavnNBpJkiRpAcyqT3WSZcCbgbuAsara3jY9Boy15aXAo0O7bW1lU5VLkiRJI+3gmVZM8grgT4D3V9X3k/xoW1VVkpqLBiVZx6DbCGNjY0xMTMzFYTvbtWvXomnLXOt7bOtXPjcnx1pM/0Z9P2d9jW2xSnIscB2DmyMFbKiqK1u3vRuBZcAW4Kyq2pnBB8CVwBnAM8B5k99otudl/k079IeqaiOSdACYUVKd5MUMEurrq+pTrfjxJEdX1fbWveOJVr4NOHZo92Na2TZgfI/yiT1fq6o2ABsAVq1aVePj43tWWRATExMslrbMtb7HdvkdT8/JsbacMz4nx5kLfT9nfY1tEZt8duYrSV4J3JNkE3Aeg2dnLktyEYNnZ36DH3925iQGz86cNPTszCoGyfk9SW5tXf4kqddmMvpHgGuAB6rq3w1tuhWYHMFjLXDLUPm5bRSQk4GnWjeRzwGnJjm8PaB4aiuTJC0gn52RpO5mcqf6rcC/BO5N8rVW9pvAZcBNSS4AHgHOattuY/CV4GYGXwueD1BVO5J8ELi71ftAVe2YkygkSXNivp6d6drVb+xlsH7l7lntszeLsatR37tA9Tm+PscG/Y+vq2mT6qq6A8gUm0/ZS/0CLpziWNcC186mgZKk+TFfz86043Xq6vfR62/h8ntn/FjQlBZTt65Jfe8C1ef4+hwb9D++rpxRUZL0gs/OtO0zfXZmb+WS1Hsm1ZJ0gPPZGUnqrvt3Z5KkUeezM5LUkUm1JB3gfHZGkrqz+4ckSZLUkUm1JEmS1JFJtSRJktSRSbUkSZLUkUm1JEmS1JGjf0hzZNlFn5m2zpbL3jEPLZEkSfPNO9WSJElSRybVkiRJUkcm1ZIkSVJHJtWSJElSRybVkiRJUkcm1ZIkSVJH0ybVSa5N8kSSbwyVHZFkU5IH2+/DW3mSXJVkc5KvJzlhaJ+1rf6DSdbun3AkSZKk+TeTO9UfB1bvUXYRcHtVrQBub+sApwMr2s864GoYJOHAJcBJwInAJZOJuCRJkjTqpk2qq+ovgB17FK8BNrbljcCZQ+
XX1cCdwJIkRwOnAZuqakdV7QQ28ZOJuiRJkjSS9nVGxbGq2t6WHwPG2vJS4NGheltb2VTlPyHJOgZ3uRkbG2NiYmIfmzi3du3atWjaMtf6Htv6lc/NybGm+zdav3J352PMVN/PWV9j0+Iz3UyozoIqaaY6T1NeVZWk5qIx7XgbgA0Aq1atqvHx8bk6dCcTExMslrbMtb7HdvkdT8/JsbacM/6C28+byTTl0xxjpvp+zvoamySpv/Z19I/HW7cO2u8nWvk24Nihese0sqnKJUmSpJG3r0n1rcDkCB5rgVuGys9to4CcDDzVuol8Djg1yeHtAcVTW5kkSZI08qbt/pHkk8A4cFSSrQxG8bgMuCnJBcAjwFmt+m3AGcBm4BngfICq2pHkg8Ddrd4HqmrPhx+lRW26vpeSJOnANW1SXVW/MMWmU/ZSt4ALpzjOtcC1s2qdJEmSNAKcUVGSJEnqyKRakiRJ6sikWpIkSerIpFqSJEnqqPPkL5IWJ2eKkyRp/ninWpIkSerIpFqSJEnqyO4fkiRNYSaTPtmVShJ4p1qSJEnqzKRakiRJ6sikWpIkSerIpFqSJEnqyKRakiRJ6sikWpIkSerIpFqSJEnqyHGqJfWGYwpLkhbKvCfVSVYDVwIHAX9QVZfNdxvUH9MlUetX7mYx/d9xrpK+e7c9xXkzONZ8MJHVng609/mZXAMz4XUijbZ5zTaSHAT8PvBzwFbg7iS3VtX989kOaTGbyQf0+pXz0BDmLlnQgcP3eUkHqvm+hXcisLmqHgJIcgOwBvDN9gBjsqY9Tf5NrF+5e8q78N7JGwm+z++jmVwDMzHddeK3S9L+kaqavxdL3gWsrqpfauv/Ejipqt4zVGcdsK6tvhb41rw18IUdBXx3oRuxnxjb6OlrXLA4Y/vpqvqphW7EKJjJ+3wr7/pevxj/TuZKn2ODfsfX59ig//G9tqpeua87L57Opk1VbQA2LHQ79pTky1W1aqHbsT8Y2+jpa1zQ79j0vK7v9X3+O+lzbNDv+PocGxwY8XXZf76H1NsGHDu0fkwrkyT1g+/zkg5I851U3w2sSLI8ySHA2cCt89wGSdL+4/u8pAPSvHb/qKrdSd4DfI7BUEvXVtV989mGDhZdl7qlnqAAAAVLSURBVJQ5ZGyjp69xQb9j6715fJ/v899Jn2ODfsfX59jA+F7QvD6oKEmSJPWR05RLkiRJHZlUS5IkSR2ZVM9QkvVJKslRbT1JrkqyOcnXk5yw0G2crST/Nsk3W/s/nWTJ0LaLW2zfSnLaQrZzXyRZ3dq+OclFC92eLpIcm+SLSe5Pcl+S97XyI5JsSvJg+334Qrd1XyQ5KMlXk/xZW1+e5K527m5sD7tJQL+ubej/9Q39vsaTLElyc/ssfSDJW/py7pL8Svub/EaSTyZ56SifuyTXJnkiyTeGyvZ6rvY1xzOpnoEkxwKnAt8eKj4dWNF+1gFXL0DTutoEvKGq/gHw/wEXAyQ5nsET+68HVgMfy2Dq4ZGQ56dJPh04HviFFtOo2g2sr6rjgZOBC1s8FwG3V9UK4Pa2PoreBzwwtP5h4IqqOg7YCVywIK3SotPDaxv6f31Dv6/xK4HPVtXrgDcyiHPkz12SpcAvA6uq6g0MHjo+m9E+dx9nkNMMm+pc7VOOZ1I9M1cAvw4MP9W5BriuBu4EliQ5ekFat4+q6vNVtbut3slgPFkYxHZDVT1bVQ8DmxlMPTwqfjRNclX9LTA5TfJIqqrtVfWVtvw3DN60lzKIaWOrthE4c2FauO+SHAO8A/iDth7gbcDNrcpIxqX9plfXNvT7+oZ+X+NJDgN+FrgGoKr+tqqepCfnjsEIcS9LcjDwcmA7I3zuquovgB17FE91rvYpxzOpnkaSNcC2qvrLPTYtBR4dWt/aykbV/wr8eVse9dhGvf1TSrIMeDNwFzBWVdvbpseAsQVqVhe/y+A/rP+jrR8JPDn0n73enDvNid5e29DL6xv6fY0vB74D/GHr3v
IHSQ6lB+euqrYBH2HwDf124CngHvpz7iZNda726b3GpBpI8p9bn6E9f9YAvwn8nwvdxn01TWyTdX6LwVeQ1y9cSzWdJK8A/gR4f1V9f3hbDcbGHKnxMZO8E3iiqu5Z6LZIC61v1zccENf4wcAJwNVV9Wbgafbo6jHC5+5wBndrlwOvAQ7lJ7tO9MpcnKt5nfxlsaqqt++tPMlKBn9Qfzn4xopjgK8kOZERmYp3qtgmJTkPeCdwSj0/aPlIxPYCRr39PyHJixl84F5fVZ9qxY8nObqqtrevpZ5YuBbuk7cCP5/kDOClwKsY9E9ckuTgdjdk5M+d5lTvrm3o7fUN/b/GtwJbq+qutn4zg6S6D+fu7cDDVfUdgCSfYnA++3LuJk11rvbpvcY71S+gqu6tqr9TVcuqahmDC+iEqnqMwbS757YnRE8Gnhr6CmEkJFnN4Gu5n6+qZ4Y23QqcneQlSZYz6Kj/pYVo4z7q1TTJrQ/iNcADVfXvhjbdCqxty2uBW+a7bV1U1cVVdUy7ts4GvlBV5wBfBN7Vqo1cXNqvenVtQ3+vb+j/Nd5ygUeTvLYVnQLcTw/OHYNuHycneXn7G52MrRfnbshU52qfcjxnVJyFJFsYPAn73fZH9nsMvg55Bji/qr68kO2brSSbgZcA32tFd1bVv2rbfotBP+vdDL6O/PO9H2VxandGfpfnp0m+dIGbtM+S/GPgvwD38ny/xN9k0O/yJuDvAo8AZ1XVng9hjIQk48C/rqp3JvkZBg+gHQF8FfjFqnp2IdunxaNP1zYcGNc39PcaT/ImBg9hHgI8BJzP4IblyJ+7JL8N/HMGecBXgV9i0K94JM9dkk8C48BRwOPAJcCfspdzta85nkm1JEmS1JHdPyRJkqSOTKolSZKkjkyqJUmSpI5MqiVJkqSOTKolSZKkjkyqJUmSpI5MqiVJkqSO/n9OJWhO/k3EZwAAAABJRU5ErkJggg==\n",
"text/plain": [
"<Figure size 864x288 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"df[(df['Quantity']>-50) & \n",
"   (df['Quantity']<50) & \n",
"   (df['UnitPrice']>0) & \n",
"   (df['UnitPrice']<100)][['Quantity', 'UnitPrice']].hist(figsize=[12,4], bins=30)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAs4AAAEICAYAAABPtXIYAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAeNUlEQVR4nO3df5Qd91nf8fcndpwEq5FJDIojuciJjKmxaEP22ElpqVwSkLEVAycNdkzBqbFOODWFVoUqQMuPklND40KMDRwRGyVgrLgmTaRYIaGUbQgEcMyP2vGPIhwllnGsGBLBKjSJzNM/7iy5bLTauXvv3bt39v06x8d7Z+6deR7Nndlnn/nOTKoKSZIkSaf2jEkHIEmSJE0DC2dJkiSpBQtnSZIkqQULZ0mSJKkFC2dJkiSpBQtnSZIkqQULZ61JSeaSvGjScUiSIMnPJ/mPq3V50jwLZ62YJNcmuT/Jp5N8PMnPJlm/AuudTfKd/dOqal1VPdrM35vkx8cdhyR1VZJKsmXBtB9J8sttPl9Vr6+q/9x8bluSIydZ1ueapsenkvxOkpe3WZ40ShbOWhFJdgE/AXwfsB54GbAZeF+SZ04wNEnSdHh7Va0DvgT4APCOJFn4piSnrXhkWjMsnDV2SZ4L/Cjw3VX1a1X1uao6DLwGeBHw2oVd34UdhyS7k/xpkr9K8mCSb+6bd22SDyR5U5JPJvlIksuaeW8E/ilwS9OpuKWZXkm2JNkJXAN8fzP/QJLvS/KrC3K4Ocmbx/VvJEldNn9MT7IrydEkTyR5Xd/8vUl+PMmZwHuAFzbH5LkkL+xfVlV9Dngr8ALg+c1nfy7JwSTHgUtP8jvlyiR/lOQvm98l25vp65Pc1sTzeBODhbcWZeGslfCPgWcD7+ifWFVzwEHg61ss40/pFcDr6RXhv5zknL75lwCPAGcDPwncliRV9YPAbwE3NMMzblgQwx7gDuAnm/k7gF8Gtic5CyDJ6cBVwNsGS1uS1OcF9I7hG4HrgFuTfHH/G6rqOHAZ8GfNMXldVf1Z/3uSPAu4Fnisqp5qJr8WeCPw9+h1o/vffzG94/f3AWcBXwscbmbvBU4AW4CX0Pt99HeG9kn9LJy1Es4GnqqqEyeZ9wS9026nVFX/var+rKr+pqreDvwJcHHfWz5aVb9QVU/T60ScA2xYTrBV9QTwfuBfNJO2N/Hft5zlSZIA+BzwY81Zx4PAHHDBAJ9/TZJPAY8BLwW+uW/eu6rqt5vfEf9vweeuA26vql9v5j9eVQ8n2QB8I/C9VXW8qo4CP0WvUSKd1OmTDkBrwlPA2UlOP0nxfE4z/5SSfDvw7+iNiwZYR68gn/fx+R+q6tPNsLd1Q8T8VuC7gF8Avg34pSGWJUld9zSw8HqVZ9Irluf9+YLfAZ9msOP0XVX1bYvMe+wUnzuX3tnNhb6sifGJvqHSz1hiWVrj7DhrJXwQ+AzwLf0Tk6yjd0puFjgOfFHf7Bf0ve/L6BWwNwDPr6qzgAeAL7goZBG1jPnvBL4qyUXAFfSGc0iSTu5jfL6xMe884KPLWNZSx+xBP/MY8OJFpn8GOLuqzmr+e25VfeUy1q81wsJZY1dVx+iNS/6ZJNuTPDPJZuAuet3mO4A/Ar4xyfOSvAD43r5FnEnvoPgJgOaCkosGCOFJehchtp7fnOq7G/gV4Per6mMDrE+S1pq3Az+UZFOSZyR5BbCD3nF0UE/Su+hvVLcrvQ14XZKva2LbmOQrmmF57wNuSvLcZt6Lk/yzEa1XHWThrBVRVT8J/ADwJuCvgI/Q6zC/orkY5JeAP6Z3wcb76B2E5z/7IHATvc71k8BW4LcHWP2bgVc3d9y4+STzbwMubO4N+s6+6W9t1uUwDUk6tR8DfofehXmfpHeR9jVV9cCgC6qqh4E7gUeb4/ILl/rMEsv7feB19MYvHwP+N71hGgDfDpwBPNjEfTe9IYTSSaVqOWdEpO
E0XeMfA75mtXZzk/x94GHgBVX1l5OOR5IkTZYXB2oiquoXk5ygd6u6VVc4J3kGvYsR91k0S5IksOMsfYHmBvxP0ruoZXtVeYW1JEmycJYkSZLa8OJASZIkqYVVMcb57LPPrs2bN086DACOHz/OmWeeOekwxqKruXU1L+hubqs1r/vuu++pqlrySZZankGP9av1ezIq5je9upwbdDu/48eP8/DDDy/7WL8qCufNmzfzoQ99aNJhADA7O8u2bdsmHcZYdDW3ruYF3c1tteaVZDkPa1BLgx7rV+v3ZFTMb3p1OTfodn6zs7Nceumlyz7WO1RDkiRJamEshXOSM5N8KMkV41i+JEmStNJaFc5Jbk9yNMkDC6ZvT/JIkkNJdvfN+g/0HqcsSZIkdULbjvNeYHv/hCSnAbcClwEXAlcnuTDJK+k9uvLoCOOUJK0ynl2UtNa0ujiwqt6fZPOCyRcDh6rqUYAk+4ArgXXAmfSK6b9OcrCq/mbhMpPsBHYCbNiwgdnZ2WWmMFpzc3OrJpZR62puXc0LuptbV/OadkluB64AjlbVRX3TtwNvBk4D3lJVNzazPLsoaU0Z5q4aG4H+J6odAS6pqhsAklwLPHWyohmgqvYAewBmZmZqtVy92fUrSbuYW1fzgu7m1tW8OmAvcAvwtvkJfWcXX0nvOH9vkv30fgc8CDx75cOUpMkY2+3oqmrvuJYtSRq91XZ2setnJsxvenU5N+h2fnNzc0N9fpjC+XHg3L7Xm5pprSXZAezYsmXLEGFIksZoYmcXu35mwvymV5dzg27nN+wfBMMUzvcC5yc5j17BfBXw2kEWUFUHgAMzMzPXDxGHBMDm3feccv7hGy9foUiktaPN2cVhmiRL7dfgvi1p5bS9Hd2dwAeBC5IcSXJdVZ0AbgDeCzwE3FVVHx5k5Ul2JNlz7NixQeOWJK2Moc8uVtWBqtq5fv36kQYmSSut7V01rl5k+kHg4HJXbsdZkla9oc8uSlJX+MhtSRLg2UVJWsrY7qrRhhcHStLq4dlFSTq1iXacHfcmSZKkaeFQDUnSWDlUQ1JXWDhLksbKs4uSumKihbNdCEmSJE0LxzhLksbKJomkrnCohiRprGySSOoKC2dJkiSpBcc4S5IkSS04xlmSNFY2SSR1hUM1JEljZZNEUldYOEuSJEktWDhLkiRJLXhxoCRJktSCFwdKksbKJomkrnCohiRprGySSOoKC2dJkiSphdMnHYAkScPYvPueU84/fOPlKxSJpK6z4yxJkiS14F01JEmSpBa8q4YkaaxskkjqCodqSJLGyiaJpK6wcJYkSZJa8K4aWjOWuvIevPpekiQtzo6zJEmS1IKFsyRJktSChbMkSZLUgoWzJEmS1IIPQJEkSZJa8AEokqSxskkiqSscqiFJGiubJJK6wsJZkiRJasHCWZIkSWrBwlmSJElqwcJZkiRJauH0SQcgTZvNu+9Z8j2Hb7x8BSKRJEkryY6zJEmS1IKFsyRJktSChbMkSZLUgoWzJEmS1MLIC+ck/yDJzye5O8l3jXr5kiRJ0iS0KpyT3J7kaJIHFkzfnuSRJIeS7Aaoqoeq6vXAa4CvGX3IkqRJs0kiaS1q23HeC2zvn5DkNOBW4DLgQuDqJBc2814F3AMcHFmkkqSxskkiSafW6j7OVfX+JJsXTL4YOFRVjwIk2QdcCTxYVfuB/UnuAX7lZMtMshPYCbBhwwZmZ2eXE//Izc3NrZpYRq2ruc3ntWvriaGX1ebfp816RvXv3PVtplVnL3AL8Lb5CX1NklcCR4B7k+yvqgebJsl3Ab80gVhb897rkkZlmAegbAQe63t9BLgkyTbgW4BncYqOc1XtAfYAzMzM1LZt24YIZXRmZ2dZLbGMWldzm8/r2ha/HJdy+JptS76nzXraLKeNrm8zrS6rrUkyNzfHrq1PD5jF8kziD7mu/wHZ5fy6nBt0O7+5ubmhPj/yJwdW1Sww2+a9SXYAO7Zs2TLqMCRJozGxJs
ns7Cw3feD44BEvw6j+2B1E1/+A7HJ+Xc4Nup3fsH8QDFM4Pw6c2/d6UzOttao6AByYmZm5fog4JEkrzCaJpLVomML5XuD8JOfRK5ivAl47kqikCWkzFlJaY2ySSFKj7e3o7gQ+CFyQ5EiS66rqBHAD8F7gIeCuqvrwICtPsiPJnmPHjg0atyRpZfxtkyTJGfSaJPsnHJMkTUSrwrmqrq6qc6rqmVW1qapua6YfrKovr6oXV9UbB115VR2oqp3r168f9KOSpBGzSSJJpzbyiwMlSdOpqq5eZPpBhrgvv0M1JHXFyB+5PQi7EJIkSZoWEy2cHaohSd1nk0RSV0y0cJYkdZ9NEkld4VANSZIkqYWJXhzoBSPScNrcd/rwjZevQCTS4nwAiqSucKiGJGmsHKohqSssnCVJkqQWJjpUw9N3kqTVwGFPktrwdnSSpLHyQnBJXeFQDUnSWNkkkdQVFs6SJElSCxbOkiRJUgs+AEWSJElqwYsDJUljZZNEUlc4VEOSNFY2SSR1hYWzJEmS1IKFsyRJktSChbMkSZLUgnfVkCRJklrwrhqSpLGySSKpKxyqIUkaK5skkrrCwlmSJElqwcJZkiRJasHCWZIkSWrBwlmSJElqwcJZkiRJasHCWZIkSWrBB6BIkiRJLfgAFEnSWNkkkdQVDtWQJI2VTRJJXWHhLEmSJLVg4SxJkiS1YOEsSZIktXD6pAOQpEFt3n3Pku85fOPlKxCJJGktseMsSZIktWDhLEmSJLVg4SxJkiS14BhnSZJGxPH3UrdZOEtqxYJAkrTWjaVwTvJNwOXAc4Hbqup941iPJEmStFJaj3FOcnuSo0keWDB9e5JHkhxKshugqt5ZVdcDrwe+dbQhS5ImLck3JfmFJG9P8vWTjkeSVsIgHee9wC3A2+YnJDkNuBV4JXAEuDfJ/qp6sHnLDzXzJUmrXJLbgSuAo1V1Ud/07cCbgdOAt1TVjVX1TuCdSb4YeBPQ+TOLbYYrSeq21oVzVb0/yeYFky8GDlXVowBJ9gFXJnkIuBF4T1X9wcmWl2QnsBNgw4YNzM7ODhz8OMzNza2aWEatq7nN57Vr64lJh/K3fuaOdy35nq0b1y/5nqW2WZucR7XNR7muYb+LK5n3GrMXGySStKhhxzhvBB7re30EuAT4buAVwPokW6rq5xd+sKr2AHsAZmZmatu2bUOGMhqzs7OsllhGbZpzO1WnZ9fWp7npA8eZtmtdD1+zbcn3/Mwd72pyW8zSObdZTxvXtrk4sOW6hv0ujjIWfd6oGyTN+5fdJJmbm2PX1qcHSWEqzP8bdLWZMa/L+XU5N+h2fnNzc0N9fiyVRlXdDNy81PuS7AB2bNmyZRxhSJKGt+wGCQzXJJmdnV3iD8fpNP9H3TQ3M9rocn5dzg26nd+wfxAM+wCUx4Fz+15vaqa1UlUHqmrn+vVLn7KWJK0eVXVzVb20ql6/WNE8L8mOJHuOHTu2UuFJ0lgMWzjfC5yf5LwkZwBXAfuHD0uStEoM1SABmySSumOQ29HdCXwQuCDJkSTXVdUJ4AbgvcBDwF1V9eEBlmkXQpJWNxskktRoXThX1dVVdU5VPbOqNlXVbc30g1X15VX14qp64yArtwshSavHOBokzXJtkkjqhOm6DYEkaWyq6upFph8EDg6x3APAgZmZmeuXuwxJWg2GHeM8FLsQkiRJmhYTLZwdqiFJ3WeTRFJXOFRDmpA2j+/dtXUFApHGzKEakrpiooWzD0CRVoc2RbwkSWudQzUkSWPlUA1JXTHRwlmS1H02SSR1hYWzJEmS1IK3o5MkSZJacIyzJGmsbJJI6gqHakiSxsomiaSu8D7OkiStoPnbP+7aeoJrF7kV5OEbL1/JkCS1ZMdZkiRJasGLAyVJkqQWvDhQkjRWNkkkdYVDNSRJY2WTRFJXWDhLkiRJLXhXDUkrxrsJSJKmmR1nSZIkqQULZ0mSJKmFiQ7VSLID2LFly5ZJhqEJ27zIKXtJ3eCxXl
JXTLRwrqoDwIGZmZnrJxmHJGl8PNYPbqmGgtcCSJPhUA1JkiSpBQtnSZIkqQULZ0mSJKkFC2dJkiSpBQtnSZIkqQULZ0mSJKmFiRbOSXYk2XPs2LFJhiFJkiQtaaKFc1UdqKqd69evn2QYkqQxskkiqSscqiFJGiubJJK6wsJZkiRJamGij9xW9y312FhJkqRpYcdZkiRJasHCWZIkSWrBoRqSJHVQm6Fyh2+8fAUikbrDjrMkSZLUgoWzJEmS1IJDNSRJWqMcziENxo6zJEmS1MLIC+ckL0pyW5K7R71sSZIkaVJaFc5Jbk9yNMkDC6ZvT/JIkkNJdgNU1aNVdd04gpUkrQ42SSStRW07znuB7f0TkpwG3ApcBlwIXJ3kwpFGJ0laMTZJJOnUWhXOVfV+4C8WTL4YONQcPD8L7AOuHHF8kqSVsxebJJK0qFRVuzcmm4F3V9VFzetXA9ur6jub1/8SuAT4YeCNwCuBt1TVf1lkeTuBnQAbNmx46b59+4ZKZFTm5uZYt27dpMMYi0nkdv/jx8a+jg3PgSf/euyrmYhpy23rxvWnnD//fThVXksto385w8RyMpdeeul9VTUz8Ac75CTH+pcDP1JV39C8fgPA/LE9yd1V9epTLG/Zx/q5uTk+cuzp5SUyBYbZv0e1n4xqXSfj79Pp1eX85ubm2LFjx7KP9SO/HV1V/Tnw+hbv2wPsAZiZmalt27aNOpRlmZ2dZbXEMmqTyO3aFrc6GtaurSe46f5u3llx2nI7fM22U86f/z6cKq+lltG/nGFiUWsbgcf6Xh8BLknyfHpNkpckecNiTZJhjvWzs7Pc9IHjy4171Rtm/x7VfjKqdZ2Mv0+nV5fzm52dHerzw/xGfhw4t+/1pmZaa0l2ADu2bNkyRBialDb3/5TUTW2bJOCxXlJ3DHM7unuB85Ocl+QM4Cpg/yALqKoDVbVz/frlnQaSJI3d0E0Sj/WSuqLt7ejuBD4IXJDkSJLrquoEcAPwXuAh4K6q+vAgK0+yI8meY8fGPw5WkrQsQzdJJKkrWg3VqKqrF5l+EDi43JVX1QHgwMzMzPXLXYYkaTSaJsk24OwkR4Afrqrbksw3SU4Dbl9OkwSHakwtH8stfd70XHUkSRormySSdGojf+T2IByqIUmSpGkx0cLZC0YkqftskkjqiokWzpKk7rNJIqkrLJwlSZKkFiZ6caBXWktS93msHz0fQCVNhmOcJUlj5bFeUlc4VEOSJElqwcJZkiRJasExzlPGJzhJmjYe6yV1hWOcJUlj5bFeUlc4VEOSJElqwcJZkiRJasHCWZIkSWrBiwMlSWPlsV6j4gXymjQvDpQkjZXHekld4VANSZIkqQULZ0mSJKkFC2dJkiSpBQtnSZIkqYWpvavGSl5Z61W8Ujtt9hWtPd5VQ1JXeFcNSdJYeayX1BUO1ZAkSZJasHCWJEmSWrBwliRJklqwcJYkSZJasHCWJEmSWrBwliRJklqwcJYkSZJamNoHoEiSpoPH+u472cOPdm09wbXN9JV8SJgPLdM4+QAUSdJYeayX1BUO1ZAkSZJasHCWJEmSWrBwliRJklqwcJYkSZJasHCWJEmSWrBwliRJklqwcJYkSZJasHCWJEmSWrBwliRJklqwcJYkSZJaOH3UC0xyJvCzwGeB2aq6Y9TrkCRNlsd6SWtRq45zktuTHE3ywILp25M8kuRQkt3N5G8B7q6q64FXjTheSdKYeKyXpFNrO1RjL7C9f0KS04BbgcuAC4Grk1wIbAIea9729GjClCStgL14rJekRaWq2r0x2Qy8u6oual6/HPiRqvqG5vUbmrceAT5ZVe9Osq+qrlpkeTuBnQAbNmx46b59+wYK/P7Hjy35nq0b1w+0TIC5uTnWrVu3IutajjaxLGbDc+DJv+79PIp4h4lllPrz6pqu5naqvNp8N8e1T1566aX3VdXMwB/skNV0rJ+bm+Mjx7pbk3d1/543aH6j2vdXYl0bngNf+ryV+b
0/KoP82w17jB6VpWJebp23Y8eOZR/rhxnjvJHPdxugdxC9BLgZuCXJ5cCBxT5cVXuAPQAzMzO1bdu2gVZ+7e57lnzP4WsGWybA7OwsC2MZ17qWo00si9m19QQ33d/b5KOId5hYRqk/r67pam6nyqvNd3M17ZNrwMSO9bOzs9z0gePLCHk6dHX/njdofqPa91diXbu2nuA1A9YtkzbIv92wx+hRWSrm5dZ5wxj5HltVx4HXtXlvkh3Aji1btow6DEnSGHmsl7QWDXM7useBc/teb2qmtVZVB6pq5/r103W6Q5LWEI/1ktQYpnC+Fzg/yXlJzgCuAvaPJixJ0irhsV6SGm1vR3cn8EHggiRHklxXVSeAG4D3Ag8Bd1XVhwdZeZIdSfYcO7Y6LjKTpLXMY70knVqrMc5VdfUi0w8CB5e78qo6AByYmZm5frnLkCSNhsd6STo1H7ktSZIktTDRwtnTd5LUfR7rJXXFRAtnr7SWpO7zWC+pK1o/OXCsQSSfAD466TgaZwNPTTqIMelqbl3NC7qb22rN68uq6ksmHURXLeNYv1q/J6NiftOry7lBt/M7Gzhzucf6VVE4ryZJPtTVR+52Nbeu5gXdza2reWm0uv49Mb/p1eXcoNv5DZubFwdKkiRJLVg4S5IkSS1YOH+hPZMOYIy6mltX84Lu5tbVvDRaXf+emN/06nJu0O38hsrNMc6SJElSC3acJUmSpBYsnCVJkqQWLJwXSLIrSSU5u3mdJDcnOZTk/yT56knHOIgk/zXJw03s/yPJWX3z3tDk9UiSb5hknMuVZHsT/6Ekuycdz3IlOTfJbyZ5MMmHk3xPM/15SX49yZ80///iSce6XElOS/KHSd7dvD4vye812+7tSc6YdIxaPbqyb8Pa2L+h2/t4krOS3N38Pn0oycu7sv2S/Nvme/lAkjuTPHuat12S25McTfJA37STbqvl1HgWzn2SnAt8PfCxvsmXAec3/+0Efm4CoQ3j14GLquqrgP8LvAEgyYXAVcBXAtuBn01y2sSiXIYm3lvpbaMLgaubvKbRCWBXVV0IvAz4100uu4HfqKrzgd9oXk+r7wEe6nv9E8BPVdUW4JPAdROJSqtOx/ZtWBv7N3R7H38z8GtV9RXAP6SX59RvvyQbgX8DzFTVRcBp9GqDad52e+nVNf0W21YD13gWzn/XTwHfD/RfMXkl8Lbq+V3grCTnTCS6Zaiq91XViebl7wKbmp+vBPZV1Weq6iPAIeDiScQ4hIuBQ1X1aFV9FthHL6+pU1VPVNUfND//Fb2D8kZ6+by1edtbgW+aTITDSbIJuBx4S/M6wD8H7m7eMrW5aSw6s29D9/dv6PY+nmQ98LXAbQBV9dmq+hTd2X6nA89JcjrwRcATTPG2q6r3A3+xYPJi22rgGs/CuZHkSuDxqvrjBbM2Ao/1vT7STJtG/wp4T/NzF/LqQg5fIMlm4CXA7wEbquqJZtbHgQ0TCmtYP03vj9K/aV4/H/hU3x91ndh2GplO7tvQ2f0bur2Pnwd8AvjFZijKW5KcSQe2X1U9DryJ3pn2J4BjwH10Z9vNW2xbDXysWVOFc5L/2YzhWfjflcAPAP9p0jEuxxJ5zb/nB+mdLrxjcpFqKUnWAb8KfG9V/WX/vOrdO3Lq7h+Z5ArgaFXdN+lYpEnq4v4Na2IfPx34auDnquolwHEWDMuY1u3XjPW9kt4fBy8EzuQLhzl0yrDb6vQRxrLqVdUrTjY9yVZ6X5o/7p1dYhPwB0kuBh4Hzu17+6Zm2qqxWF7zklwLXAF8XX3+xt2rPq8WupDD30ryTHq/VO+oqnc0k59Mck5VPdGcPjo6uQiX7WuAVyX5RuDZwHPpjRc8K8npTVdjqredRq5T+zZ0ev+G7u/jR4AjVfV7zeu76RXOXdh+rwA+UlWfAEjyDnrbsyvbbt5i22rgY82a6jgvpqrur6ovrarNVbWZ3k7y1VX1cWA/8O3NlZcvA471tftXvSTb6Z0+e1VVfb
pv1n7gqiTPSnIevYHxvz+JGIdwL3B+c/XvGfQuaNg/4ZiWpRkPeBvwUFX9t75Z+4HvaH7+DuBdKx3bsKrqDVW1qdm3rgL+V1VdA/wm8OrmbVOZm8amM/s2dHv/hu7v400t8FiSC5pJXwc8SDe238eAlyX5ouZ7Op9bJ7Zdn8W21cA1nk8OPIkkh+ldYfpU80W6hd6pi08Dr6uqD00yvkEkOQQ8C/jzZtLvVtXrm3k/SG/c8wl6pw7fc/KlrF5Nh+On6V0JfHtVvXHCIS1Lkn8C/BZwP58fI/gD9MZB3gX8feCjwGuqauFFD1MjyTbg31fVFUleRO+ir+cBfwh8W1V9ZpLxafXoyr4Na2f/hu7u40n+Eb0LH88AHgVeR6/5OPXbL8mPAt9Krxb4Q+A76Y3zncptl+ROYBtwNvAk8MPAOznJtlpOjWfhLEmSJLXgUA1JkiSpBQtnSZIkqQULZ0mSJKkFC2dJkiSpBQtnSZIkqQULZ0mSJKkFC2dJkiSphf8PA6+SxgdiH0YAAAAASUVORK5CYII=\n",
"text/plain": [
"<Figure size 864x288 with 2 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"df[(df['Quantity']>-50) & \n",
"   (df['Quantity']<50) & \n",
"   (df['UnitPrice']>0) & \n",
"   (df['UnitPrice']<100)][['Quantity', 'UnitPrice']].hist(figsize=[12,4], bins=30, log=True)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
"    .dataframe tbody tr th:only-of-type {\n",
"        vertical-align: middle;\n",
"    }\n",
"\n",
"    .dataframe tbody tr th {\n",
"        vertical-align: top;\n",
"    }\n",
"\n",
"    .dataframe thead th {\n",
"        text-align: right;\n",
"    }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>Country</th>\n",
"      <th>CustomerID</th>\n",
"      <th>...</th>\n",
"      <th>StockCode</th>\n",
"      <th>UnitPrice</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>1228</th>\n",
"      <td>United Kingdom</td>\n",
"      <td>15485.0</td>\n",
"      <td>...</td>\n",
"      <td>22086</td>\n",
"      <td>2.55</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1237</th>\n",
"      <td>Norway</td>\n",
"      <td>12433.0</td>\n",
"      <td>...</td>\n",
"      <td>22444</td>\n",
"      <td>1.06</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1286</th>\n",
"      <td>Norway</td>\n",
"      <td>12433.0</td>\n",
"      <td>...</td>\n",
"      <td>84050</td>\n",
"      <td>1.25</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1293</th>\n",
"      <td>Norway</td>\n",
"      <td>12433.0</td>\n",
"      <td>...</td>\n",
"      <td>22197</td>\n",
"      <td>0.85</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1333</th>\n",
"      <td>United Kingdom</td>\n",
"      <td>18144.0</td>\n",
"      <td>...</td>\n",
"      <td>84879</td>\n",
"      <td>1.69</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>...</th>\n",
"      <td>...</td>\n",
"      <td>...</td>\n",
"      <td>...</td>\n",
"      <td>...</td>\n",
"      <td>...</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>14784</th>\n",
"      <td>United Kingdom</td>\n",
"      <td>15061.0</td>\n",
"      <td>...</td>\n",
"      <td>22423</td>\n",
"      <td>10.95</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>14785</th>\n",
"      <td>United Kingdom</td>\n",
"      <td>15061.0</td>\n",
"      <td>...</td>\n",
"      <td>22075</td>\n",
"      <td>1.45</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>14788</th>\n",
"      <td>United Kingdom</td>\n",
"      <td>15061.0</td>\n",
"      <td>...</td>\n",
"      <td>17038</td>\n",
"      <td>0.07</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>14974</th>\n",
"      <td>United Kingdom</td>\n",
"      <td>14739.0</td>\n",
"      <td>...</td>\n",
"      <td>21704</td>\n",
"      <td>0.72</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>14980</th>\n",
"      <td>United Kingdom</td>\n",
"      <td>14739.0</td>\n",
"      <td>...</td>\n",
"      <td>22178</td>\n",
"      <td>1.06</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>\n",
"<p>258 rows × 8 columns</p>"
],
"text/plain": [
"              Country  CustomerID  ...  StockCode  UnitPrice\n",
"1228   United Kingdom     15485.0  ...      22086       2.55\n",
"1237           Norway     12433.0  ...      22444       1.06\n",
"1286           Norway     12433.0  ...      84050       1.25\n",
"1293           Norway     12433.0  ...      22197       0.85\n",
"1333   United Kingdom     18144.0  ...      84879       1.69\n",
"...               ...         ...  ...        ...        ...\n",
"14784  United Kingdom     15061.0  ...      22423      10.95\n",
"14785  United Kingdom     15061.0  ...      22075       1.45\n",
"14788  United Kingdom     15061.0  ...      17038       0.07\n",
"14974  United Kingdom     14739.0  ...      21704       0.72\n",
"14980  United Kingdom     14739.0  ...      22178       1.06\n",
"\n",
"[258 rows x 8 columns]"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.query('Quantity>50 & UnitPrice<100')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Arithmetic Operations"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Arithmetic operators work element-wise on numeric fields. Here we multiply `Quantity` by `UnitPrice` to get a per-row total."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1000     1\n",
"1001     1\n",
"1002     1\n",
"1003     1\n",
"1004    12\n",
"Name: Quantity, dtype: int64"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['Quantity'].head()"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1000    1.25\n",
"1001    1.25\n",
"1002    1.25\n",
"1003    1.25\n",
"1004    0.29\n",
"Name: UnitPrice, dtype: float64"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['UnitPrice'].head()"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"product = df['Quantity'] * df['UnitPrice']"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1000    1.25\n",
"1001    1.25\n",
"1002    1.25\n",
"1003    1.25\n",
"1004    3.48\n",
"dtype: float64"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"product.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `+` operator also concatenates string fields, here `Country` and `StockCode`."
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1000      United Kingdom21123\n",
"1001      United Kingdom21124\n",
"1002      United Kingdom21122\n",
"1003      United Kingdom84378\n",
"1004      United Kingdom21985\n",
"                 ...         \n",
"14995    United Kingdom72349B\n",
"14996     United Kingdom72741\n",
"14997     United Kingdom22762\n",
"14998     United Kingdom21773\n",
"14999     United Kingdom22149\n",
"Length: 15000, dtype: object"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df['Country'] + df['StockCode']"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
},
"pycharm": {
"stem_cell": {
"cell_type": "raw",
"metadata": {
"collapsed": false
},
"source": []
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}