113 Commits

Author SHA1 Message Date
Seth Michael Larson
adafeed667
Add es_dtypes property to DataFrame and Series 2020-10-13 12:14:09 -05:00
Seth Michael Larson
c86371733d
Deprecate ImportedMLModel in favor of MLModel.import_model() 2020-09-03 09:06:59 -05:00
Seth Michael Larson
4576951f37
Fix links in Implementation section 2020-08-17 16:32:48 -05:00
Seth Michael Larson
661b33dd0a Update and rearrange documentation 2020-08-17 15:55:06 -05:00
Seth Michael Larson
5bf205a1e0
Fix Series.describe(), median agg dtype 2020-08-17 09:28:30 -05:00
Seth Michael Larson
f5b37e643c Update support matrix for Pandas 1.1 2020-08-14 12:55:02 -05:00
Seth Michael Larson
140623283a
Support Series/collections in Series.isin(), add type hints 2020-07-14 11:39:52 -05:00
Seth Michael Larson
6c2f9a2ed2
Add DataFrame.size and Series.size 2020-07-13 17:30:14 -05:00
Seth Michael Larson
d50e06dda5
Add webinar recording link to notebook 2020-07-10 14:21:55 -05:00
Seth Michael Larson
ceacf759c3
Add long Apache-2.0 license header to all files 2020-07-08 15:10:43 -05:00
Seth Michael Larson
5897b4587c
Add webinar example notebook, update prose in docs 2020-07-08 14:44:40 -05:00
Seth Michael Larson
eff9625be1 Update docs with all new APIs 2020-05-20 13:58:40 -05:00
Daniel Mesejo-León
890cf6dc97
Add Series.isna() and Series.notna() 2020-05-19 16:12:59 -05:00
Seth Michael Larson
1378544933
Normalize and prune top-level APIs 2020-05-18 14:55:41 -05:00
Seth Michael Larson
d1444f8e09 Add Conda Forge installation instructions 2020-05-15 15:27:41 -05:00
Daniel Mesejo-León
94dbb36081
Add .sample() method to DataFrame and Series 2020-05-04 12:07:21 -05:00
Seth Michael Larson
fa8dbe0eb4
Restore documentation requirements 2020-04-29 13:57:51 -05:00
Seth Michael Larson
3d81def5cc
Add support for xgboost v1 2020-04-29 13:06:35 -05:00
Seth Michael Larson
15a1977dcf
Add agg compatibility logic to Field class 2020-04-27 15:16:48 -05:00
Seth Michael Larson
7946eb4daa
Add and enforce license headers 2020-04-25 16:26:58 -05:00
Seth Michael Larson
448770df78
Restrict public API, update license header 2020-04-14 07:31:23 -05:00
Daniel Mesejo-León
023a35c3b4
Add instructions for how to build docs 2020-04-03 07:53:27 -05:00
Seth Michael Larson
7e5f0d3913 Add DataFrame.es_query() to query Elasticsearch directly 2020-04-02 13:06:22 -05:00
Daniel Mesejo-León
e27a508c59
Update supported Pandas to v1.0 2020-03-27 12:21:15 -05:00
Seth Michael Larson
0c1d7222fe
Drop support for Python 3.5, add Black 2020-03-27 07:56:28 -05:00
Stephen Dodson
2c29e28a2f Updating logo 2020-03-13 09:17:56 +00:00
Stephen Dodson
206677818f Fixes to enforce xgboost==0.90
Issue raised to upgrade xgboost version
2020-02-24 09:20:36 +00:00
stevedodson
1a90e9232e
7.6.0a3 (#131)
* Updating test matrix for 7.6 + removing oss for now.

* Resolving 7.6.0 docs issues

* Updating ML docs

* Bumping version following doc fixes

* Change ExternalMLModel to ImportedMLModel
2020-02-15 20:29:03 +01:00
stevedodson
163d18d84e
Updating ML docs (#129)
* Updating test matrix for 7.6 + removing oss for now.

* Resolving 7.6.0 docs issues

* Updating ML docs
2020-02-15 19:52:04 +01:00
stevedodson
1cfcd0ab2b
Resolving docs issues (#128)
* Updating test matrix for 7.6 + removing oss for now.

* Resolving 7.6.0 docs issues
2020-02-15 19:37:41 +01:00
stevedodson
7c1c2945a7
ML add external models (#125)
* Partial implementation of ml.ExternalModel

* Adding eland.ml.ExternalMLModel

More testing to be added + more support for MLModels
2020-02-15 15:54:29 +01:00
stevedodson
409cb043c8
Refactoring of plotting + fixes for multiple charts (#117)
* Refactoring of plotting + fixes for multiple charts

Updates to plotting in line with pandas 0.25.3
Enables plotting of multiple histograms on the
same figure.

* Fix to setup.py to allow submodules

+ reformat of code and better Series.hist docs
2020-01-29 07:07:56 +00:00
stevedodson
46b428d59b
Improved read_csv docs + made 'to_eland' params consistent (#114)
* Improved read_csv docs + made 'to_eland' params consistent

Note, will change API.

* Removing additional args from pytest.

doctests + nbval tests in the CI are not addressed by
this PR.
2020-01-16 10:17:49 +00:00
stevedodson
1914644f93
Improve docs (#113)
* Adding more examples

* Adding more examples to README.md + pypi first page.

* Updated README.md
2020-01-13 15:32:41 +00:00
stevedodson
00fb775d29
Feature/versioning (#109)
* Minor fixes for readthedocs compatibility.

* Adding doc templates

* Setting first version to 7.5

* Resolving pypi issues + minor docs
2020-01-10 14:38:56 +00:00
stevedodson
1c772d0e50
More readthedocs fixes. (#107)
* Minor fixes for readthedocs compatibility.

* Adding doc templates
2020-01-10 11:33:51 +00:00
stevedodson
679f8f4170
Minor fixes for readthedocs compatibility. (#106) 2020-01-10 11:02:51 +00:00
stevedodson
a3293168a1
Feature/filtered hist (#104)
* Adding python 3.5 compatibility.

Main issue is ordering of dictionaries.

* Updating notebooks with 3.7 results.

* Removing temporary code.

* Defaulting to OrderedDict for python 3.5 + lint all code

All code reformatted by PyCharm and inspection results analysed.

* Adding support for multiple arithmetic operations.

Added new 'arithmetics' file to manage this process.
More tests to be added + cleanup.

* Significant refactor to arithmetics and mappings.

Work in progress. Tests don't pass.

* Major refactor to Mappings.

Field name mappings were stored in different places
(Mappings, QueryCompiler, Operations) and needed to
be kept in sync.

With the addition of complex arithmetic operations
this became complex and difficult to maintain. Therefore,
all field naming is now in 'FieldMappings' which
replaces 'Mappings'.

Note this commit removes the cache for some of the
mapped values and so the code is SIGNIFICANTLY
slower on large indices.

In addition, the addition of date_format to
Mappings has been removed. This again added more
unnecessary complexity.

* Adding OrderedDict for 3.5 compatibility

* Fixes to ordering issues with 3.5

* Adding simple cache for mappings in flatten

Improves performance significantly on large
datasets (>10000 rows).

* Adding updated notebooks (new info_es).

All tests (doc + nbval + pytest) pass.

* Fixing issue with non-zero offset histograms.
2020-01-10 08:17:45 +00:00
stevedodson
903fbf0341
Feature/mapping cache (#103)
* Adding python 3.5 compatibility.

Main issue is ordering of dictionaries.

* Updating notebooks with 3.7 results.

* Removing temporary code.

* Defaulting to OrderedDict for python 3.5 + lint all code

All code reformatted by PyCharm and inspection results analysed.

* Adding support for multiple arithmetic operations.

Added new 'arithmetics' file to manage this process.
More tests to be added + cleanup.

* Significant refactor to arithmetics and mappings.

Work in progress. Tests don't pass.

* Major refactor to Mappings.

Field name mappings were stored in different places
(Mappings, QueryCompiler, Operations) and needed to
be kept in sync.

With the addition of complex arithmetic operations
this became complex and difficult to maintain. Therefore,
all field naming is now in 'FieldMappings' which
replaces 'Mappings'.

Note this commit removes the cache for some of the
mapped values and so the code is SIGNIFICANTLY
slower on large indices.

In addition, the addition of date_format to
Mappings has been removed. This again added more
unnecessary complexity.

* Adding OrderedDict for 3.5 compatibility

* Fixes to ordering issues with 3.5

* Adding simple cache for mappings in flatten

Improves performance significantly on large
datasets (>10000 rows).

* Adding updated notebooks (new info_es).

All tests (doc + nbval + pytest) pass.
2020-01-10 08:12:03 +00:00
stevedodson
efe21a6d87
Feature/arithmetic ops (#102)
* Adding python 3.5 compatibility.

Main issue is ordering of dictionaries.

* Updating notebooks with 3.7 results.

* Removing temporary code.

* Defaulting to OrderedDict for python 3.5 + lint all code

All code reformatted by PyCharm and inspection results analysed.

* Adding support for multiple arithmetic operations.

Added new 'arithmetics' file to manage this process.
More tests to be added + cleanup.

* Significant refactor to arithmetics and mappings.

Work in progress. Tests don't pass.

* Major refactor to Mappings.

Field name mappings were stored in different places
(Mappings, QueryCompiler, Operations) and needed to
be kept in sync.

With the addition of complex arithmetic operations
this became complex and difficult to maintain. Therefore,
all field naming is now in 'FieldMappings' which
replaces 'Mappings'.

Note this commit removes the cache for some of the
mapped values and so the code is SIGNIFICANTLY
slower on large indices.

In addition, the addition of date_format to
Mappings has been removed. This again added more
unnecessary complexity.

* Adding OrderedDict for 3.5 compatibility

* Fixes to ordering issues with 3.5
2020-01-10 08:05:43 +00:00
stevedodson
5a3c73ea54
Feature/info es fix (#99)
* Resolving inconsistent __repr__ test on python 3.5

* Fixing layout for info_es + adding Series.hist doc
2019-12-12 14:36:56 +01:00
Michael Hirsch
79fdb1727e
Add Support for Series Histograms (#95)
* add support for series plotting
* update docs for series plotting support
* add tests for series plotting
* fix typo
* adds comment to ed_hist_series
2019-12-11 14:51:47 -05:00
stevedodson
c5730e6d38
Feature/python 3.5 (#93)
* Adding python 3.5 compatibility.

Main issue is ordering of dictionaries.

* Updating notebooks with 3.7 results.

* Removing temporary code.

* Defaulting to OrderedDict for python 3.5 + lint all code

All code reformatted by PyCharm and inspection results analysed.
2019-12-11 14:27:35 +01:00
stevedodson
9a2d55f3c8
Feature/pandas.0.25.3 (#92)
* Resolving pandas link

* Removing temporary file
2019-12-10 19:22:27 +01:00
stevedodson
e8a0fbb9f3
Feature/pandas.0.25.3 (#91)
* Added example notebooks + pytest for these notebooks

* Fixed paths

* Fixing link in docs

* Minor update for pandas 0.25.3

* Updates for pandas 0.25.3

* Fixing doc links with pandas 0.25.3 update.

* Reverting overwrite to changes to notebooks.
2019-12-10 16:05:37 +01:00
stevedodson
133b227b93
Added example notebooks + pytest for notebooks (#87)
* Added example notebooks + pytest for these notebooks

* Fixed paths

* Fixing link in docs

* Adding cleaner demo_notebook
2019-12-10 15:27:13 +01:00
stevedodson
206276c5fa
Adding Apache 2 copyright header to all .py files (#86) 2019-12-06 09:44:05 +00:00
Stephen Dodson
86686ebb18 Reformat and cleanup based on PyCharm 2019-11-26 11:02:46 +00:00
stevedodson
5ce315f55c
Merge pull request #64 from stevedodson/feature/arithmetics
Series arithmetics, series metric aggs, series docs
2019-11-25 16:17:12 +00:00
Stephen Dodson
85422e2023 Adding series __r* docs 2019-11-25 15:49:27 +00:00