Upgrades the PyTorch, transformers and sentence-transformers requirements.
Elasticsearch upgraded PyTorch to 2.3.1 in 8.16 and 8.15.2. For
compatibility reasons Eland will refuse to upload a model to an Elasticsearch
cluster that is using an earlier version of PyTorch.
* Mirror pandas' to_csv lineterminator instead of line_terminator
(even though the new spelling perhaps looks a little odd)
* Remove squeeze argument
* Revert "Merge branch 'remove-squeeze-argument' into patch-2"
This reverts commit 8b9ab5647e244d78ec3471b80ee7c42e019cf347.
* Don't remove the parameter yet since people might use it
* Add pending deprecation warning
---------
Co-authored-by: David Kyle <david.kyle@elastic.co>
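The "pending deprecation warning" step above can be sketched as follows. This is a minimal illustration, not Eland's actual code: the `read_frame` function name and the warning message are made up.

```python
import warnings

# Hypothetical sketch: keep a soon-to-be-removed keyword argument
# working for existing callers, but warn when it is supplied.
def read_frame(rows, squeeze=None):
    if squeeze is not None:
        warnings.warn(
            "the 'squeeze' argument is deprecated and will be removed "
            "in a future release",
            PendingDeprecationWarning,
            stacklevel=2,
        )
    return rows

with warnings.catch_warnings(record=True) as caught:
    # PendingDeprecationWarning is hidden by default, so opt in to see it
    warnings.simplefilter("always")
    read_frame([[1], [2]], squeeze=True)
print([w.category.__name__ for w in caught])  # ['PendingDeprecationWarning']
```

Using `PendingDeprecationWarning` (rather than `DeprecationWarning`) signals that removal is planned but not imminent, which fits the "don't remove the parameter yet" decision above.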
* Ensure the feature logger uses NaN for non-matching query feature extractors (consistent with ES).
* Default score is None instead of 0.
* LTR model import API improvements.
* Fix feature logger tests.
* Fix export in eland.ml.ltr
* Apply suggestions from code review
Co-authored-by: Adam Demjen <demjened@gmail.com>
* Fix supported models for LTR
---------
Co-authored-by: Adam Demjen <demjened@gmail.com>
* Add XGBRanker and transformer
* Map XGBoostRegressorTransformer to XGBRanker
* Add unit tests
* Remove unused import
* Revert addition of type
* Update function comment
* Distinguish objective based on model class
* Support for supplying inference_config
* Fix linting errors
* Add unit test
* Add LTR type, throw exception on predict, refine test
* Add search step to LTR test
* Fix linter errors
* Update rescoring assertion in test + type defs
* Fix linting error
* Remove failing assertion
Fixes an error uploading the sentence-transformers/all-distilroberta-v1 model
which failed with "missing 2 required positional arguments: 'token_type_ids'
and 'position_ids'". The cause was that the tokenizer type was not recognised
due to a typo.
This PR adds the ability to estimate the per-deployment and per-allocation memory usage of NLP transformer models. It uses torch.profiler to log the peak memory usage during inference.
This information is then used in Elasticsearch to provision models with sufficient memory (elastic/elasticsearch#98874).
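A rough sketch of measuring inference memory with torch.profiler is below. A tiny linear layer stands in for an NLP transformer, and summing per-operator allocations is a simplification; the actual estimator in this PR tracks the peak usage.

```python
import torch
from torch.profiler import ProfilerActivity, profile

# Stand-in model and input; the real estimator profiles the traced NLP model.
model = torch.nn.Linear(128, 64)
inputs = torch.randn(1, 128)

with profile(activities=[ProfilerActivity.CPU], profile_memory=True) as prof:
    with torch.no_grad():
        model(inputs)

# Sum per-operator CPU allocations for a rough usage figure.
total_cpu_mem = sum(
    e.cpu_memory_usage for e in prof.key_averages() if e.cpu_memory_usage > 0
)
print(f"approximate CPU memory allocated: {total_cpu_mem} bytes")
```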
We were attempting to load SentenceTransformers by looking at the model
prefix, but SentenceTransformers can also be loaded from other orgs in the
model hub, as well as from local disk, and the prefix check failed in those
two cases. To simplify the loading logic and the decision of which wrapper
to use, we've removed support for loading a plain Transformer for
text_embedding tasks. We now only support DPR embedding models and
SentenceTransformer embedding models. If you try to load a plain
Transformer model, it will be loaded by SentenceTransformers and a mean
pooling layer will automatically be added by the SentenceTransformer
library. Since we no longer automatically support non-DPR,
non-SentenceTransformers models, we should include example code somewhere
showing how to load a custom model without DPR or SentenceTransformers.
See: https://github.com/UKPLab/sentence-transformers/blob/v2.2.2/sentence_transformers/SentenceTransformer.py#L801
Resolves #531
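The mean pooling that the SentenceTransformers library appends to a plain Transformer can be sketched in plain PyTorch. This is a minimal illustration: `mean_pool` and the toy tensors are ours, not Eland's or SentenceTransformers' code.

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings over non-padding positions."""
    mask = attention_mask.unsqueeze(-1).to(token_embeddings.dtype)
    return (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

# Toy tensors standing in for a Transformer's last_hidden_state:
# one sentence, three tokens (the third is padding), two dimensions.
emb = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]])
mask = torch.tensor([[1, 1, 0]])
print(mean_pool(emb, mask))  # tensor([[2., 3.]])
```

Masking before averaging matters: padding tokens would otherwise drag the sentence embedding toward zero for short inputs in a padded batch.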