123 Commits

Author SHA1 Message Date
P. Sai Vinay
27717eead1
Remove deprecated options and aliases 2021-01-04 13:20:45 -06:00
P. Sai Vinay
473db4576b
Move tests directory outside of eland namespace 2020-11-16 11:30:41 -06:00
P. Sai Vinay
56f6ba6c8b
Add Elasticsearch storage usage to df.info() 2020-11-16 10:07:28 -06:00
P. Sai Vinay
4e92e3cf62
Fix Eland logo and update contributing documentation 2020-11-06 09:33:30 -06:00
Seth Michael Larson
cb4cd083c3
Add support for es_match() to DataFrame and Series 2020-10-29 10:16:50 -05:00
Seth Michael Larson
ae96558075
Add source for 'elastic.co/guide' to 'docs/guide' 2020-10-28 07:57:10 -05:00
Seth Michael Larson
28951c0ad1
Add linting+docs to GitHub Actions, fix docs 2020-10-27 11:28:55 -05:00
Seth Michael Larson
ae70f03df3
Document DataFrame.groupby() methods 2020-10-27 10:10:57 -05:00
Seth Michael Larson
05a24cbe0b
Add isort, rename Nox session to 'format' 2020-10-15 17:11:29 -05:00
Seth Michael Larson
18fb4af731
Document DataFrame.groupby() and rename Field.index -> .column 2020-10-15 17:11:29 -05:00
Seth Michael Larson
adafeed667
Add es_dtypes property to DataFrame and Series 2020-10-13 12:14:09 -05:00
Seth Michael Larson
c86371733d
Deprecate ImportedMLModel in favor of MLModel.import_model() 2020-09-03 09:06:59 -05:00
Seth Michael Larson
4576951f37
Fix links in Implementation section 2020-08-17 16:32:48 -05:00
Seth Michael Larson
661b33dd0a
Update and rearrange documentation 2020-08-17 15:55:06 -05:00
Seth Michael Larson
5bf205a1e0
Fix Series.describe(), median agg dtype 2020-08-17 09:28:30 -05:00
Seth Michael Larson
f5b37e643c
Update support matrix for Pandas 1.1 2020-08-14 12:55:02 -05:00
Seth Michael Larson
140623283a
Support Series/collections in Series.isin(), add type hints 2020-07-14 11:39:52 -05:00
Seth Michael Larson
6c2f9a2ed2
Add DataFrame.size and Series.size 2020-07-13 17:30:14 -05:00
Seth Michael Larson
d50e06dda5
Add webinar recording link to notebook 2020-07-10 14:21:55 -05:00
Seth Michael Larson
ceacf759c3
Add long Apache-2.0 license header to all files 2020-07-08 15:10:43 -05:00
Seth Michael Larson
5897b4587c
Add webinar example notebook, update prose in docs 2020-07-08 14:44:40 -05:00
Seth Michael Larson
eff9625be1
Update docs with all new APIs 2020-05-20 13:58:40 -05:00
Daniel Mesejo-León
890cf6dc97
Add Series.isna() and Series.notna() 2020-05-19 16:12:59 -05:00
Seth Michael Larson
1378544933
Normalize and prune top-level APIs 2020-05-18 14:55:41 -05:00
Seth Michael Larson
d1444f8e09
Add Conda Forge installation instructions 2020-05-15 15:27:41 -05:00
Daniel Mesejo-León
94dbb36081
Add .sample() method to DataFrame and Series 2020-05-04 12:07:21 -05:00
Seth Michael Larson
fa8dbe0eb4
Restore documentation requirements 2020-04-29 13:57:51 -05:00
Seth Michael Larson
3d81def5cc
Add support for xgboost v1 2020-04-29 13:06:35 -05:00
Seth Michael Larson
15a1977dcf
Add agg compatibility logic to Field class 2020-04-27 15:16:48 -05:00
Seth Michael Larson
7946eb4daa
Add and enforce license headers 2020-04-25 16:26:58 -05:00
Seth Michael Larson
448770df78
Restrict public API, update license header 2020-04-14 07:31:23 -05:00
Daniel Mesejo-León
023a35c3b4
Add instructions for how to build docs 2020-04-03 07:53:27 -05:00
Seth Michael Larson
7e5f0d3913
Add DataFrame.es_query() to query Elasticsearch directly 2020-04-02 13:06:22 -05:00
Daniel Mesejo-León
e27a508c59
Update supported Pandas to v1.0 2020-03-27 12:21:15 -05:00
Seth Michael Larson
0c1d7222fe
Drop support for Python 3.5, add Black 2020-03-27 07:56:28 -05:00
Stephen Dodson
2c29e28a2f
Updating logo 2020-03-13 09:17:56 +00:00
Stephen Dodson
206677818f
Fixes to enforce xgboost==0.90
Issue raised to upgrade xgboost version
2020-02-24 09:20:36 +00:00
stevedodson
1a90e9232e
7.6.0a3 (#131)
* Updating test matrix for 7.6 + removing oss for now.

* Resolving 7.6.0 docs issues

* Updating ML docs

* Bumping version following doc fixes

* Change ExternalMLModel to ImportedMLModel
2020-02-15 20:29:03 +01:00
stevedodson
163d18d84e
Updating ML docs (#129)
* Updating test matrix for 7.6 + removing oss for now.

* Resolving 7.6.0 docs issues

* Updating ML docs
2020-02-15 19:52:04 +01:00
stevedodson
1cfcd0ab2b
Resolving docs issues (#128)
* Updating test matrix for 7.6 + removing oss for now.

* Resolving 7.6.0 docs issues
2020-02-15 19:37:41 +01:00
stevedodson
7c1c2945a7
ML add external models (#125)
* Partial implementation of ml.ExternalModel

* Adding eland.ml.ExternalMLModel

More testing to be added + more support for MLModels
2020-02-15 15:54:29 +01:00
stevedodson
409cb043c8
Refactoring of plotting + fixes for multiple charts (#117)
* Refactoring of plotting + fixes for multiple charts

Updates to plotting in line with pandas 0.25.3
Enables plotting of multiple histograms on the
same figure.

* Fix to setup.py to allow submodules

+ reformat of code and better Series.hist docs
2020-01-29 07:07:56 +00:00
stevedodson
46b428d59b
Improved read_csv docs + made 'to_eland' params consistent (#114)
* Improved read_csv docs + made 'to_eland' params consistent

Note, will change API.

* Removing additional args from pytest.

doctests + nbval tests in the CI are not addressed by
this PR.
2020-01-16 10:17:49 +00:00
stevedodson
1914644f93
Improve docs (#113)
* Adding more examples

* Adding more examples to README.md + pypi first page.

* Updated README.md
2020-01-13 15:32:41 +00:00
stevedodson
00fb775d29
Feature/versioning (#109)
* Minor fixes for readthedocs compatibility.

* Adding doc templates

* Setting first version to 7.5

* Resolving pypi issues + minor docs
2020-01-10 14:38:56 +00:00
stevedodson
1c772d0e50
More readthedocs fixes. (#107)
* Minor fixes for readthedocs compatibility.

* Adding doc templates
2020-01-10 11:33:51 +00:00
stevedodson
679f8f4170
Minor fixes for readthedocs compatibility. (#106) 2020-01-10 11:02:51 +00:00
stevedodson
a3293168a1
Feature/filtered hist (#104)
* Adding python 3.5 compatibility.

Main issue is ordering of dictionaries.

* Updating notebooks with 3.7 results.

* Removing temporary code.

* Defaulting to OrderedDict for python 3.5 + lint all code

All code reformatted by PyCharm and inspection results analysed.

* Adding support for multiple arithmetic operations.

Added new 'arithmetics' file to manage this process.
More tests to be added + cleanup.

* Significant refactor to arithmetics and mappings.

Work in progress. Tests don't pass.

* Major refactor to Mappings.

Field name mappings were stored in different places
(Mappings, QueryCompiler, Operations) and needed to
be kept in sync.

With the addition of complex arithmetic operations
this became complex and difficult to maintain. Therefore,
all field naming is now in 'FieldMappings' which
replaces 'Mappings'.

Note this commit removes the cache for some of the
mapped values and so the code is SIGNIFICANTLY
slower on large indices.

In addition, the date_format added to
Mappings has been removed. It also added
unnecessary complexity.

* Adding OrderedDict for 3.5 compatibility

* Fixes to ordering issues with 3.5

* Adding simple cache for mappings in flatten

Improves performance significantly on large
datasets (>10000 rows).

* Adding updated notebooks (new info_es).

All tests (doc + nbval + pytest) pass.

* Fixing issue with non-zero offset histograms.
2020-01-10 08:17:45 +00:00
stevedodson
903fbf0341
Feature/mapping cache (#103)
* Adding python 3.5 compatibility.

Main issue is ordering of dictionaries.

* Updating notebooks with 3.7 results.

* Removing temporary code.

* Defaulting to OrderedDict for python 3.5 + lint all code

All code reformatted by PyCharm and inspection results analysed.

* Adding support for multiple arithmetic operations.

Added new 'arithmetics' file to manage this process.
More tests to be added + cleanup.

* Significant refactor to arithmetics and mappings.

Work in progress. Tests don't pass.

* Major refactor to Mappings.

Field name mappings were stored in different places
(Mappings, QueryCompiler, Operations) and needed to
be kept in sync.

With the addition of complex arithmetic operations
this became complex and difficult to maintain. Therefore,
all field naming is now in 'FieldMappings' which
replaces 'Mappings'.

Note this commit removes the cache for some of the
mapped values and so the code is SIGNIFICANTLY
slower on large indices.

In addition, the date_format added to
Mappings has been removed. It also added
unnecessary complexity.

* Adding OrderedDict for 3.5 compatibility

* Fixes to ordering issues with 3.5

* Adding simple cache for mappings in flatten

Improves performance significantly on large
datasets (>10000 rows).

* Adding updated notebooks (new info_es).

All tests (doc + nbval + pytest) pass.
2020-01-10 08:12:03 +00:00
stevedodson
efe21a6d87
Feature/arithmetic ops (#102)
* Adding python 3.5 compatibility.

Main issue is ordering of dictionaries.

* Updating notebooks with 3.7 results.

* Removing temporary code.

* Defaulting to OrderedDict for python 3.5 + lint all code

All code reformatted by PyCharm and inspection results analysed.

* Adding support for multiple arithmetic operations.

Added new 'arithmetics' file to manage this process.
More tests to be added + cleanup.

* Significant refactor to arithmetics and mappings.

Work in progress. Tests don't pass.

* Major refactor to Mappings.

Field name mappings were stored in different places
(Mappings, QueryCompiler, Operations) and needed to
be kept in sync.

With the addition of complex arithmetic operations
this became complex and difficult to maintain. Therefore,
all field naming is now in 'FieldMappings' which
replaces 'Mappings'.

Note this commit removes the cache for some of the
mapped values and so the code is SIGNIFICANTLY
slower on large indices.

In addition, the date_format added to
Mappings has been removed. It also added
unnecessary complexity.

* Adding OrderedDict for 3.5 compatibility

* Fixes to ordering issues with 3.5
2020-01-10 08:05:43 +00:00