* Revert "[ML] Export ML model as sklearn Pipeline (#509)"
This reverts commit 0576114a1d886eafabca3191743a9bea9dc20b1a.
* Keep useful changes
* formatting
* Remove obsolete test matrix configuration and update version references in documentation and Noxfile
* formatting
---------
Co-authored-by: Quentin Pradet <quentin.pradet@elastic.co>
I updated the tree serialization format for the new scikit-learn versions. I also raised the minimum scikit-learn requirement to 1.3 to ensure compatibility.
Fixes #555
PyTorch models traced in version 1.13 of PyTorch cannot be evaluated in
version 1.9 or earlier. With this upgrade Eland becomes incompatible with
pre-8.7 Elasticsearch and will refuse to upload a model to the cluster.
In this scenario, either upgrade Elasticsearch or use an earlier version of Eland.
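
A minimal sketch of the kind of version gate described above, assuming the
cluster version is available as a string. The names MIN_ES_VERSION,
parse_major_minor and check_cluster_compatible are hypothetical and are not
taken from the Eland code base; only the 8.7 threshold comes from this message.

```python
import re
from typing import Tuple

# Assumption: 8.7 is the minimum Elasticsearch version, as stated in this commit message.
MIN_ES_VERSION = (8, 7)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as '8.7.1' or '8.8.0-SNAPSHOT'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_cluster_compatible(es_version: str) -> None:
    """Raise if the cluster is older than the minimum supported version."""
    if parse_major_minor(es_version) < MIN_ES_VERSION:
        minimum = ".".join(str(part) for part in MIN_ES_VERSION)
        raise RuntimeError(
            f"Elasticsearch {es_version} is older than {minimum}; "
            "upgrade Elasticsearch or use an earlier version of Eland."
        )
```

Comparing (major, minor) tuples avoids the string-comparison pitfall where
"8.10" would sort before "8.7".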
Shap is incompatible with NumPy 1.24 due to a deprecated usage becoming
an error. There is no fix in Shap yet so an earlier version of NumPy must
be used.
Pandas 2.0 was recently released; we will continue to use the latest 1.5 release
to avoid any incompatibilities.
Closes #503
Note: I also had to pin the Sphinx version to 5.3.0 because, starting from 6.0, Sphinx has a TypeError bug that causes a CI failure.
Elasticsearch uses v1.11 of PyTorch. Models created with the latest PyTorch
release (v1.12) are not compatible with v1.11. This pins the PyTorch version
to 1.11 to prevent the incompatibility. The version of the Elasticsearch Python
client is now required to be >= the Eland version.
All users of Eland for importing NLP models should upgrade.
In preparation for an 8.0 release, this updates the PyTorch NLP dependencies
to more recent releases, using the latest minor versions. Among other things, this
introduces a fix from transformers that is helpful for text embedding
tasks with certain DPR models.
See: https://github.com/huggingface/transformers/issues/13670
Co-authored-by: Seth Michael Larson <seth.larson@elastic.co>
* Allow user to specify es data types in read_csv and pandas_to_eland
Also, some minor maintenance modifications:
- replaced pandas.util.testing with pandas.testing (required in 1.x)
- updated elasticsearch-py requirements to 7.6+ (to support ML code)
* linting file
* Added example notebooks + pytest for these notebooks
* Fixed paths
* Fixing link in docs
* Minor update for pandas 0.25.3
* Updates for pandas 0.25.3
* Fixing doc links with pandas 0.25.3 update.
* Reverting overwrite to changes to notebooks.