Fixes an error uploading the sentence-transformers/all-distilroberta-v1 model,
which failed with "missing 2 required positional arguments: 'token_type_ids'
and 'position_ids'". The cause was that the tokenizer type was not recognised
due to a typo.
This PR adds the ability to estimate the per-deployment and per-allocation memory usage of NLP transformer models. It runs inference under torch.profiler and logs the peak memory usage observed.
This information is then used in Elasticsearch to provision models with sufficient memory (elastic/elasticsearch#98874).
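For context, a minimal sketch of this kind of measurement, assuming an illustrative model and using the largest per-operator CPU allocation reported by torch.profiler as a rough proxy for peak usage (this is not the exact implementation in this PR):

```python
# Rough sketch: profile one inference pass and report a peak-memory proxy.
import torch
from torch.profiler import ProfilerActivity, profile
from transformers import AutoModel, AutoTokenizer

model_id = "sentence-transformers/all-distilroberta-v1"  # illustrative model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id).eval()

inputs = tokenizer("memory usage probe sentence", return_tensors="pt")

with profile(activities=[ProfilerActivity.CPU], profile_memory=True) as prof:
    with torch.no_grad():
        model(**inputs)

# Largest per-operator CPU allocation, used here as a rough proxy for the
# peak memory needed to serve a single inference request.
peak_bytes = max(event.cpu_memory_usage for event in prof.key_averages())
print(f"approximate peak inference memory: {peak_bytes} bytes")
```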
Co-authored-by: David Olaru <dolaru@elastic.co>
* Reduce Docker image size from 4.8GB to 2.2GB
* Use the torch+cpu variant if the target platform is linux/amd64, which
  avoids downloading the large and unnecessary NVIDIA dependencies defined in
  the PyPI package
* Build linux/arm64 image using buildx and QEMU
* Recommend using pre-built Docker image
* Update README.md
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
We were attempting to detect SentenceTransformers models by looking at the
model name prefix; however, SentenceTransformers models can also be loaded
from other organisations on the model hub, as well as from local disk, and
the prefix check failed in both of those cases. To simplify the loading
logic and the choice of wrapper, we have removed support for loading a plain
Transformer for text_embedding tasks. We now only support DPR embedding
models and SentenceTransformer embedding models. If you try to load a plain
Transformer model, it will be loaded by the SentenceTransformer library,
which automatically adds a mean pooling layer. Since we no longer
automatically support models that are neither DPR nor SentenceTransformers,
we should include example code somewhere showing how to load such a custom
model, for example along the lines of the sketch below.
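A minimal sketch of that example code, assuming the sentence-transformers models API and an illustrative model name; it mirrors what the library's automatic loading (linked below) does:

```python
# Wrap a plain Transformer in a SentenceTransformer with an explicit mean
# pooling layer; this mirrors the library's automatic model loading.
from sentence_transformers import SentenceTransformer, models

word_embedding = models.Transformer("distilroberta-base", max_seq_length=512)
pooling = models.Pooling(
    word_embedding.get_word_embedding_dimension(),
    pooling_mode="mean",
)
model = SentenceTransformer(modules=[word_embedding, pooling])

embeddings = model.encode(["an example sentence to embed"])
print(embeddings.shape)
```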
See: https://github.com/UKPLab/sentence-transformers/blob/v2.2.2/sentence_transformers/SentenceTransformer.py#L801

Resolves #531