Minor updates to README.md + merge fixes (#105)

stevedodson 2020-01-10 09:26:13 +00:00 committed by GitHub
parent a3293168a1
commit c3c2f8a020
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 31 additions and 11 deletions

@@ -1,13 +1,16 @@
[![Build Status](https://clients-ci.elastic.co/buildStatus/icon?job=elastic%2Beland%2Bmaster)](https://clients-ci.elastic.co/job/elastic+eland+master/)
_Note, this project is still very much a work in progress and in an alpha state; input and contributions welcome!_
# What is it?
-eland is a elasticsearch client Python package to analyse, explore and manipulate data that resides in elasticsearch.
+eland is a Elasticsearch client Python package to analyse, explore and manipulate data that resides in Elasticsearch.
Where possible the package uses existing Python APIs and data structures to make it easy to switch between numpy,
-pandas, scikit-learn to their elasticsearch powered equivalents. In general, the data resides in elasticsearch and
-not in memory, which allows eland to access large datasets stored in elasticsearch.
+pandas, scikit-learn to their Elasticsearch powered equivalents. In general, the data resides in Elasticsearch and
+not in memory, which allows eland to access large datasets stored in Elasticsearch.
-For example, to explore data in a large elasticsearch index, simply create an eland DataFrame from an elasticsearch index pattern, and explore using an API that mirrors a subset of the pandas.DataFrame API:
+For example, to explore data in a large Elasticsearch index, simply create an eland DataFrame from an Elasticsearch
+index pattern, and explore using an API that mirrors a subset of the pandas.DataFrame API:
```
>>> import eland as ed
@@ -49,7 +52,6 @@ max 400140.000000 246.000000 5.000000
```
## Development Setup
1. Create a virtual environment in Python
@@ -72,11 +74,31 @@ source env/bin/activate
pip install -r requirements.txt
```
+## Versions and Compatibility
+### Python Version Support
+Officially Python 3.5.3 and above, 3.6, 3.7, and 3.8.
+eland depends on pandas version 0.25.3.
+#### Elasticsearch Versions
+eland is versioned like the Elastic stack. E.g. eland 7.5 compatible with Elasticsearch 7.5 etc.
+A major version of the client is compatible with the same major version of Elasticsearch.
+No compatibility assurances are given between different major versions of the client and Elasticsearch.
+Major differences likely exist between major versions of Elasticsearch,
+particularly around request and response object formats, but also around API urls and behaviour.
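As an illustration only (not part of this commit), the compatibility rule stated in the added README text, that a client and a server are compatible when they share a major version, could be sketched as below. The helper names `parse_major` and `is_compatible` are hypothetical and are not part of eland's API; the version strings are made-up examples.

```python
def parse_major(version: str) -> int:
    """Return the leading (major) component of a dotted version string."""
    return int(version.split(".", 1)[0])


def is_compatible(client_version: str, server_version: str) -> bool:
    """Hypothetical check: same major version on client and server."""
    return parse_major(client_version) == parse_major(server_version)


# Same major version (7 and 7): treated as compatible.
print(is_compatible("7.5.0", "7.5.1"))
# Different major versions (7 and 6): no compatibility assurance.
print(is_compatible("7.5.0", "6.8.6"))
```

A check like this only encodes the stated rule; it says nothing about minor-version differences, which the README does not address.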
## Why eland?
Naming is difficult, but as we had to call it something:
-* eland = elastic and data
-* eland = 'Elk/Moose' in Dutch (Alces alces)
-* Elandsgracht = Amsterdam street near Elastic's Amsterdam office where historically hides from, among others, Elk were worked
+* eland: elastic and data
+* eland: 'Elk/Moose' in Dutch (Alces alces)
+* [Elandsgracht](https://goo.gl/maps/3hGBMqeGRcsBJfKx8): Amsterdam street near Elastic's Amsterdam office
+[Pronunciation](https://commons.wikimedia.org/wiki/File:Nl-eland.ogg): /ˈeːlɑnt/

@@ -45,7 +45,6 @@ class TestDataFrameToCSV(TestData):
assert_frame_equal(pd_flights, pd_from_csv)
def test_to_csv_full(self):
return
results_file = ROOT_DIR + '/dataframe/results/test_to_csv_full.csv'
# Test is slow as it's for the full dataset, but it is useful as it goes over 10000 docs

@@ -56,6 +56,5 @@ class TestDataFrameUtils(TestData):
assert_pandas_eland_frame_equal(df, ed_df_head)
def test_eland_to_pandas_performance(self):
# TODO - commented out for now for performance reasons
return
# TODO quantify this
pd_df = ed.eland_to_pandas(self.ed_flights())