48 Commits

Author SHA1 Message Date
Quentin Pradet
aa5196edee
Switch to black's 2025 code style (#749) 2025-02-11 14:57:16 +04:00
Bart Broere
75c57b0775
Support Pandas 2 (#742)
* Fix test setup to match pandas 2.0 demands

* Use the now deprecated _append method

(Better solution might exist)

* Deal with numeric_only being removed in metrics test

* Skip mad metric for other pandas versions

* Account for differences between pandas versions in describe methods

* Run black

* Check Pandas version first

* Mirror behaviour of installed Pandas version when running value_counts

* Allow passing arguments to the individual asserters

* Fix for method _construct_axes_from_arguments no longer existing

* Skip mad metric if it does not exist

* Account for pandas 2.0 timestamp default behaviour

* Deal with empty vs other inferred data types

* Account for default datetime precision change

* Run Black

* Solution for differences in inferred_type only

* Fix csv and json issues

* Skip two doctests

* Passing a set as indexer is no longer allowed

* Don't validate output where it differs between Pandas versions in the environment

* Update test matrix and packaging metadata

* Update version of Python in the docs

* Update Python version in demo notebook

* Match noxfile

* Symmetry

* Fix trailing comma in JSON

* Revert some changes in setup.py to fix building the documentation

* Revert "Revert some changes in setup.py to fix building the documentation"

This reverts commit ea9879753129d8d8390b3cbbce57155a8b4fb346.

* Use PANDAS_VERSION from eland.common

* Still skip the doctest, but make the output pandas 2 instead of 1

* Still skip doctest, but switch to pandas 2 output

* Prepare for pandas 3

* Reference the right column

* Ignore output in tests but switch to pandas 2 output

* Add line comment about NBVAL_IGNORE_OUTPUT

* Restore missing line and add stderr cell

* Use non-private method instead

* Fix indentation and parameter issues

* If index is not specified, and pandas 1 is present, set it to True

From pandas 2 and upwards, index is set to None by default

* Run black

* Newer version of black might have different opinions?

* Add line comment

* Remove unused import

* Add reason for ignore statement

* Add reason for skip

---------

Co-authored-by: Quentin Pradet <quentin.pradet@elastic.co>
2025-02-04 17:43:43 +04:00
Valeriy Khakhutskyy
77589b26b8
Remove ML model export as sklearn Pipeline and clean up code (#744)
* Revert "[ML] Export ML model as sklearn Pipeline (#509)"

This reverts commit 0576114a1d886eafabca3191743a9bea9dc20b1a.

* Keep useful changes

* formatting

* Remove obsolete test matrix configuration and update version references in documentation and Noxfile

* formatting

---------

Co-authored-by: Quentin Pradet <quentin.pradet@elastic.co>
2025-02-04 11:36:50 +04:00
Bart Broere
9b5badb941
Drop Python 3.8 support and introduce Python 3.12 CI/CD (#743) 2025-01-22 21:55:57 +04:00
David Kyle
5253501704
Upgrade PyTorch to version 2.3.1 (#718)
Upgrades the PyTorch, transformers and sentence transformer requirements.
Elasticsearch has upgraded to PyTorch 2.3.1 in 8.16 and 8.15.2. For
compatibility reasons Eland will refuse to upload to an Elasticsearch cluster
that is using an earlier version of PyTorch.
2024-09-30 10:22:02 +01:00
Quentin Pradet
116416b3e8
Stop duplicating requirements (#691) 2024-05-14 15:59:39 +04:00
David Kyle
c16e36c051
Add Python 3.11 to support matrix (#681) 2024-03-27 10:34:35 +00:00
Quentin Pradet
02190e74e7
Switch to 2024 black style (#657) 2024-01-31 14:47:19 +04:00
Quentin Pradet
c6ce4b2c46
Fix direct usage of TransformerModel (#619) 2023-10-11 11:56:14 +02:00
Youhei Sakurai
4cf92fd9b7
Make eland_import_hub_model easier to find on Windows. (#559) 2023-07-20 09:24:35 +01:00
David Kyle
36bbbe0bdb
Upgrade torch to 1.13.1 and check the cluster version before uploading a NLP model. (#522)
PyTorch models traced in version 1.13 of PyTorch cannot be evaluated in 
version 1.9 or earlier. With this upgrade, Eland becomes incompatible with
pre-8.7 Elasticsearch and will refuse to upload a model to the cluster.
In this scenario either upgrade Elasticsearch or use an earlier version of Eland.
2023-05-19 16:29:38 +01:00
Valeriy Khakhutskyy
0576114a1d
[ML] Export ML model as sklearn Pipeline (#509)
Closes #503

Note: I also had to fix the Sphinx version to 5.3.0 since, starting from 6.0, Sphinx suffers from a TypeError bug, which causes a CI failure.
2023-02-01 16:17:06 +01:00
Valeriy Khakhutskyy
2ea96322b3
Update to latest ES versions and fix unit tests (#512)
Update the test matrix to the latest Elasticsearch versions and fix the broken unit tests on the CI.
2023-01-31 20:55:29 +01:00
David Kyle
0eb36faa5b
Restrict PyTorch version not to be more advanced than that used in Elasticsearch (#479)
Elasticsearch uses v1.11 of PyTorch. Models created with the latest PyTorch 
release (v1.12) are not compatible with v1.11. This pins the PyTorch version
to 1.11 to prevent the incompatibility. The version of the Elasticsearch Python
client is now required to be >= Eland.

All users of Eland for importing NLP models should upgrade.
2022-07-07 14:56:42 +01:00
Seth Michael Larson
109387184a
Support the v8.0 Elasticsearch client 2021-12-09 15:01:26 -06:00
Josh Devins
5bc1a824a7
Add PyTorch modules to noxfile
We added the `pytorch` module which is type checked but was not in the
noxfile as such. This change also addresses type errors that arose after
adding type checking.
2021-11-29 08:03:25 -08:00
Josh Devins
014943d3b8
Add initial implementation of PyTorch ML models 2021-10-06 08:44:40 -05:00
P. Sai Vinay
f241ae971a
Add flynt and --cov-report=term-missing 2021-09-21 11:18:01 -05:00
P. Sai Vinay
315f94b201
Add excluded lines for coverage and improve coverage 2021-09-07 11:39:19 -05:00
P. Sai Vinay
823f01cc6c
Add type hints to 'eland.operations' and 'eland.ndframe' 2021-08-02 11:50:35 -05:00
P. Sai Vinay
c0e861dc77
Fix installed pandas version on Jenkins 2021-07-31 12:51:11 -05:00
P. Sai Vinay
4c1af42c14
Add idxmax and idxmin methods to DataFrame 2021-07-28 07:55:26 -05:00
Seth Michael Larson
c74fccbd74
Drop support for Python 3.6, pandas<1.2 2021-07-27 14:43:03 -05:00
P. Sai Vinay
22475cdc46
Add PANDAS_VERSION to Jenkins matrix 2021-07-26 11:17:46 -05:00
Seth Michael Larson
a552504f9b
Add support for Pandas 1.2.0 2020-12-30 14:20:36 -06:00
P. Sai Vinay
473db4576b
Move tests directory outside of eland namespace 2020-11-16 11:30:41 -06:00
P. Sai Vinay
75451f1e93
Add pytest-cov for coverage tracking 2020-11-06 11:34:15 -06:00
P. Sai Vinay
54468cb85b
Add pytest --nbval of notebook examples to CI 2020-10-27 15:15:04 -05:00
P. Sai Vinay
e17b4e03ea
Error when es_type_overrides receives unknown columns 2020-10-27 13:48:31 -05:00
Seth Michael Larson
bd7956ea72
Support typed 'elasticsearch-py' and add 'py.typed' 2020-10-20 16:26:58 -05:00
Seth Michael Larson
05a24cbe0b
Add isort, rename Nox session to 'format' 2020-10-15 17:11:29 -05:00
P. Sai Vinay
abc5ca927b
Add support for DataFrame.groupby() with aggregations 2020-10-15 10:52:48 -05:00
Seth Michael Larson
c86371733d
Deprecate ImportedMLModel in favor of MLModel.import_model() 2020-09-03 09:06:59 -05:00
Seth Michael Larson
d238bc5d42
Elasticsearch 7.6 only supports scalar leaf_values 2020-08-14 12:55:02 -05:00
Benjamin Trent
6ee282e19f
[ML] Add support for LGBMRegressor models 2020-08-11 07:42:59 -05:00
Seth Michael Larson
140623283a
Support Series/collections in Series.isin(), add type hints 2020-07-14 11:39:52 -05:00
Seth Michael Larson
ceacf759c3
Add long Apache-2.0 license header to all files 2020-07-08 15:10:43 -05:00
Seth Michael Larson
5897b4587c
Add webinar example notebook, update prose in docs 2020-07-08 14:44:40 -05:00
Seth Michael Larson
1378544933
Normalize and prune top-level APIs 2020-05-18 14:55:41 -05:00
Seth Michael Larson
d2047aa51a
Make ML libraries optional, fix type issues 2020-05-14 09:31:01 -05:00
Seth Michael Larson
3d81def5cc
Add support for xgboost v1 2020-04-29 13:06:35 -05:00
Seth Michael Larson
7946eb4daa
Add and enforce license headers 2020-04-25 16:26:58 -05:00
Seth Michael Larson
33b4976f9a
Add type hints to base modules 2020-04-24 12:39:13 -05:00
Seth Michael Larson
448770df78
Restrict public API, update license header 2020-04-14 07:31:23 -05:00
Seth Michael Larson
c8bd25cbea
Add doctests to CI 2020-04-02 13:06:22 -05:00
Daniel Mesejo-León
03582b9f5e
Import __version__ and other metadata by name 2020-03-30 07:45:04 -05:00
Daniel Mesejo-León
e27a508c59
Update supported Pandas to v1.0 2020-03-27 12:21:15 -05:00
Seth Michael Larson
0c1d7222fe
Drop support for Python 3.5, add Black 2020-03-27 07:56:28 -05:00