586 Commits

Author SHA1 Message Date
Colleen McGinnis
ca64672fd7
[docs] Migrate docs from AsciiDoc to Markdown (#762)
Co-authored-by: István Zoltán Szabó <szabosteve@gmail.com>
2025-02-26 17:48:16 +01:00
Colleen McGinnis
6692251d9e
add the new ci checks (#761) 2025-02-26 16:40:43 +01:00
David Kyle
ee4d701aa4
Upgrade transformers to 4.47 (#752)
The upgrade fixes a crash tracing the baai/bge-m3 model
2025-02-12 17:30:45 +00:00
Quentin Pradet
acdeeeded2
Allow nox 2025.02.09 (#754) 2025-02-12 16:33:59 +04:00
Quentin Pradet
8350f06ea8
Fix pipeline labels (#751) 2025-02-12 15:07:51 +04:00
Quentin Pradet
e846fb7697
Add backport action (#750) 2025-02-12 15:07:43 +04:00
Quentin Pradet
c4ac64e3a0
Allow scikit-learn 1.5 to address CVE-2024-5206 (#729) 2025-02-12 14:34:13 +04:00
Jan Calanog
214c4645e9
github-action: Add AsciiDoc freeze warning (#748)
* github-action: Add AsciiDoc freeze warning

* Update .github/workflows/comment-on-asciidoc-changes.yml
2025-02-12 07:45:07 +04:00
Quentin Pradet
871e52b37a
Pin nox to avoid session.env issue (#753) 2025-02-11 18:36:57 +04:00
Quentin Pradet
aa5196edee
Switch to black's 2025 code style (#749) 2025-02-11 14:57:16 +04:00
Bart Broere
75c57b0775
Support Pandas 2 (#742)
* Fix test setup to match pandas 2.0 demands

* Use the now deprecated _append method

(Better solution might exist)

* Deal with numeric_only being removed in metrics test

* Skip mad metric for other pandas versions

* Account for differences between pandas versions in describe methods

* Run black

* Check Pandas version first

* Mirror behaviour of installed Pandas version when running value_counts

* Allow passing arguments to the individual asserters

* Fix for method _construct_axes_from_arguments no longer existing

* Skip mad metric if it does not exist

* Account for pandas 2.0 timestamp default behaviour

* Deal with empty vs other inferred data types

* Account for default datetime precision change

* Run Black

* Solution for differences in inferred_type only

* Fix csv and json issues

* Skip two doctests

* Passing a set as indexer is no longer allowed

* Don't validate output where it differs between Pandas versions in the environment

* Update test matrix and packaging metadata

* Update version of Python in the docs

* Update Python version in demo notebook

* Match noxfile

* Symmetry

* Fix trailing comma in JSON

* Revert some changes in setup.py to fix building the documentation

* Revert "Revert some changes in setup.py to fix building the documentation"

This reverts commit ea9879753129d8d8390b3cbbce57155a8b4fb346.

* Use PANDAS_VERSION from eland.common

* Still skip the doctest, but make the output pandas 2 instead of 1

* Still skip doctest, but switch to pandas 2 output

* Prepare for pandas 3

* Reference the right column

* Ignore output in tests but switch to pandas 2 output

* Add line comment about NBVAL_IGNORE_OUTPUT

* Restore missing line and add stderr cell

* Use non-private method instead

* Fix indentation and parameter issues

* If index is not specified, and pandas 1 is present, set it to True

From pandas 2 and upwards, index is set to None by default

* Run black

* Newer version of black might have different opinions?

* Add line comment

* Remove unused import

* Add reason for ignore statement

* Add reason for skip

---------

Co-authored-by: Quentin Pradet <quentin.pradet@elastic.co>
2025-02-04 17:43:43 +04:00
Valeriy Khakhutskyy
77589b26b8
Remove ML model export as sklearn Pipeline and clean up code (#744)
* Revert "[ML] Export ML model as sklearn Pipeline (#509)"

This reverts commit 0576114a1d886eafabca3191743a9bea9dc20b1a.

* Keep useful changes

* formatting

* Remove obsolete test matrix configuration and update version references in documentation and Noxfile

* formatting

---------

Co-authored-by: Quentin Pradet <quentin.pradet@elastic.co>
2025-02-04 11:36:50 +04:00
Bart Broere
9b5badb941
Drop Python 3.8 support and introduce Python 3.12 CI/CD (#743) 2025-01-22 21:55:57 +04:00
Quentin Pradet
f99adce23f
Build documentation using Docker again (#746) 2025-01-14 18:16:39 +04:00
Quentin Pradet
7774a506ae
Release 8.17.0 v8.17.0 2025-01-07 10:58:59 +04:00
Dai Sugimori
82492fe771
Expansion support (#740) 2024-11-23 00:20:58 +09:00
Quentin Pradet
04102f2a4e
Release 8.16.0 v8.16.0 2024-11-14 09:07:39 +04:00
Valeriy Khakhutskyy
9aec8fc751
Add deprecation warning for ESGradientBoostingModel subclasses (#738)
Introduce a warning indicating that exporting data frame analytics models as ESGradientBoostingModel subclasses is deprecated and will be removed in version 9.0.0.

The implementation of ESGradientBoostingModel relies on importing undocumented private classes that were changed in scikit-learn 1.4 (see https://github.com/scikit-learn/scikit-learn/pull/26278). This dependency makes the code difficult to maintain, and the functionality is not widely used. Therefore, we will deprecate this functionality in 8.16 and remove it completely in 9.0.0.

---------

Co-authored-by: Quentin Pradet <quentin.pradet@elastic.co>
2024-11-11 14:26:11 +01:00
Quentin Pradet
79d9a6ae29
Release 8.15.4 v8.15.4 2024-10-18 10:52:52 +04:00
Quentin Pradet
939f4d672c
Revert "Add feedback request to README" (#735) 2024-10-18 08:06:42 +04:00
Quentin Pradet
1312e96220
Revert "Allow reading Elasticsearch certs in Wolfi image" (#734)
This reverts commit 5dabe9c0996e62d8bf4b493dcea7d4bc161dead4.
2024-10-11 16:52:41 +04:00
Quentin Pradet
2916b51fa7
Release 8.15.3 v8.15.3 2024-10-09 16:16:52 +04:00
Quentin Pradet
5dabe9c099
Allow reading Elasticsearch certs in Wolfi image (#732)
The config/certs directory of Elasticsearch is not readable by other
users and groups. This works in the public image, which uses the root
user, but the Wolfi image does not. Using the same user id fixes the
problem.
2024-10-09 15:37:05 +04:00
Max Hniebergall
06b65e211e
Add support for DeBERTa-V2 tokenizer (#717) 2024-10-03 14:04:19 -04:00
Quentin Pradet
a45c7bc357
Release 8.15.2 v8.15.2 2024-10-02 13:54:03 +04:00
Quentin Pradet
d1e533ffb9
Fix Docker image build on Linux (#728)
* Fix Docker image build on Linux

* Build Docker images in CI

* Fix bash syntax

* Only load, not push

* Parallelize docker build

It's currently the slowest step.

* Only build Linux images
2024-10-02 10:33:35 +04:00
Quentin Pradet
a83ce20fcc
Release 8.15.1 v8.15.1 2024-10-01 15:31:24 +04:00
David Kyle
03af8a6319
Fix path in docker model upload example (#726) 2024-10-01 08:53:28 +01:00
David Kyle
5253501704
Upgrade PyTorch to version 2.3.1 (#718)
Upgrades the PyTorch, transformers and sentence transformer requirements.
Elasticsearch has upgraded PyTorch to 2.3.1 in 8.16 and 8.15.2. For
compatibility reasons Eland will refuse to upload to an Elasticsearch cluster
that is using an earlier version of PyTorch.
2024-09-30 10:22:02 +01:00
David Kyle
ec66b5f320
Add ES 8.16 and 8.15.2 to test matrix (#725) 2024-09-27 13:37:31 +01:00
Quentin Pradet
64d05e4c68
Restore public Dockerfile (#722) 2024-09-25 12:49:46 +04:00
Quentin Pradet
f79180be42
Migrate to Wolfi base Docker image (#720) 2024-09-03 18:02:08 +04:00
Miguel Grinberg
0ce3db26e8
Release 8.15.0 (#715)
* Release 8.15.0

* update release notes
v8.15.0
2024-08-13 09:47:48 +01:00
David Kyle
5a76f826df
Add note about using text_similarity for rerank to the CLI (#716) 2024-08-12 14:40:12 +01:00
David Kyle
fd8886da6a
Default truncation to second for the text similarity task type (#713)
In reranking the first input (the query) is generally shorter. In this case
it makes more sense to truncate the second input (the document text)
2024-08-05 11:47:15 +01:00
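The idea behind truncating the second input can be sketched roughly as follows. This is a minimal illustration, not Eland's or Elasticsearch's actual implementation: the function name `truncate_second` and the token lists are hypothetical, and special tokens are ignored for simplicity.

```python
from typing import List, Tuple


def truncate_second(
    query_tokens: List[str], doc_tokens: List[str], max_len: int
) -> Tuple[List[str], List[str]]:
    """Keep the first input whole and trim only the second input.

    Hypothetical helper: assumes max_len counts plain tokens only
    (no special tokens such as [CLS] or [SEP]).
    """
    # Whatever room the query leaves is the budget for the document text.
    budget = max(max_len - len(query_tokens), 0)
    return query_tokens, doc_tokens[:budget]


query = ["what", "is", "eland"]
document = ["eland", "is", "a", "python", "client", "for", "elasticsearch"]
kept_query, kept_document = truncate_second(query, document, max_len=6)
```

With a short query and a long document, the query survives intact and only the tail of the document is dropped.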
Aurélien FOUCRET
bee6d0e1f7
Remove input fields from exported LTR models (#708) 2024-07-05 14:31:22 +02:00
Bart Broere
f18aa35e8e
Deal with the possibility of lists (#707) 2024-06-28 22:25:47 +04:00
Quentin Pradet
56a46d0f85
Rename Buildkite team from clients-team to devtools-team (#702) 2024-06-12 11:39:25 +04:00
Quentin Pradet
c497683064
Quote remaining eland[pytorch] for ZSH users (#701) 2024-06-10 16:50:03 +00:00
Quentin Pradet
0ddc21b895
Release 8.14.0 v8.14.0 2024-06-10 15:56:43 +04:00
István Zoltán Szabó
5a3e7d78b3
[DOCS] Completes the list of available NLP task types. (#699) 2024-06-10 12:30:07 +02:00
Bart Broere
1014ecdb39
Fix non _source fields missing from the result hits (#693) 2024-06-10 11:09:52 +04:00
David Kyle
632074c0f0
Make eland_import_hub_model script compatible with serverless (#698)
Checks for build_flavor == serverless rather than a version
2024-06-07 14:46:12 +01:00
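A check of this kind might look like the sketch below. It assumes the cluster info response is available as a dictionary with a `version.build_flavor` field; the helper name `is_serverless` and the sample payloads are hypothetical and not taken from the script itself.

```python
from typing import Any, Mapping


def is_serverless(info: Mapping[str, Any]) -> bool:
    """Return True when the info response reports the serverless build flavor.

    Hypothetical helper: `info` is assumed to be the parsed response of the
    cluster's root info endpoint.
    """
    version = info.get("version", {})
    return version.get("build_flavor") == "serverless"


# Illustrative payloads only; real responses contain more fields.
serverless_info = {"version": {"build_flavor": "serverless", "number": "8.11.0"}}
stateful_info = {"version": {"build_flavor": "default", "number": "8.14.0"}}
```

Keying on the build flavor avoids comparing version numbers, which is the distinction the commit message draws.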
Bart Broere
35a96ab3f0
Fix missing method str.removeprefix in Python 3.8 (#695) 2024-05-24 10:25:04 +04:00
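`str.removeprefix` was added in Python 3.9, so code that must run on 3.8 needs a fallback. A minimal sketch of such a fallback is shown here; the helper name `remove_prefix` is hypothetical and this is not necessarily how the commit fixed it.

```python
def remove_prefix(text: str, prefix: str) -> str:
    """Backport-style equivalent of str.removeprefix for Python 3.8."""
    # Only strip when the prefix is non-empty and actually present.
    if prefix and text.startswith(prefix):
        return text[len(prefix):]
    return text
```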
Quentin Pradet
116416b3e8
Stop duplicating requirements (#691) 2024-05-14 15:59:39 +04:00
Ashok Kumar
5b728c29c1
Replace check for Elasticsearch to str/list in ensure_es_client (#690) 2024-05-04 09:01:31 +04:00
Quentin Pradet
e76b32eee2
Release 8.13.1 v8.13.1 2024-05-03 09:20:45 +04:00
Quentin Pradet
fd38e26df1
Support HTTP proxies in eland_import_hub_model (#688)
* Document TLS/SSL options for import script

* Mention --help option

* Add HTTP proxy support

* Mention HTTP_PROXY too

---------

Co-authored-by: David Kyle <david.kyle@elastic.co>
2024-05-02 21:03:44 +04:00
Quentin Pradet
f7f6e0aba9
Document TLS/SSL options for import script (#667) 2024-05-02 18:06:40 +04:00
Aurélien FOUCRET
9cea2385e6
Work around LTR model cache in tests (#685) 2024-04-08 14:00:36 +04:00