eland

mirror of https://github.com/elastic/eland.git synced 2025-07-11 00:02:14 +08:00

Author	SHA1	Message	Date
Benjamin Trent	8892f4fd64	[ML] adds new auto task type that attempts to automatically determine NLP task type from model config (#475) For many model types, we don't need to require that the task type be specified. We can infer the task type based on the model configuration and architecture. This commit makes the `task-type` parameter optional for the model upload script and adds logic for auto-detecting the task type based on the 🤗 model.	2022-06-23 08:32:23 -04:00
Benjamin Trent	fa30246937	[ML] fixes decision tree classifier upload to account for probabilities (#465) This switches our sklearn.DecisionTreeClassifier serialization logic to account for multi-valued leaves in the tree. The key difference between our inference and DecisionTreeClassifier is that we run a softmax over the leaf, whereas sklearn simply normalizes the results. This means that the "probabilities" we return will differ from sklearn's.	2022-05-17 08:11:20 -04:00
Benjamin Trent	650e02d16e	[ML] improve general pytorch model import and add tests (#463) This improves the user-facing functions and classes for PyTorch NLP model upload to Elasticsearch. Previously it was difficult to wrap your own module for uploading to Elasticsearch. This commit splits some classes out, adds new ones, and adds tests showing how to wrap some simple modules.	2022-05-05 10:50:53 -04:00
Benjamin Trent	afe08f8107	[ML] Improve NLP model import by using nicely defined types (#459) This adds more definite types for our NLP tasks and tokenization configurations. This is the first step in allowing users to more easily import their own transformer models via something other than Hugging Face.	2022-05-03 15:19:03 -04:00
P. Sai Vinay	76a52b7947	Add support for eland.Series.unique()	2022-03-31 08:33:15 -05:00
Ashton Sidhu	e3bff8a623	Add option to disable schema enforcement for `pandas_to_eland`	2022-01-14 07:35:58 -06:00
Benjamin Trent	72856e2c3f	[ML] Add support for MPNet PyTorch models	2022-01-10 11:21:30 -06:00
Ashton Sidhu	64daa07a65	Using the 'date' field for datetime64+timezone columns	2022-01-04 22:03:49 -06:00
Florian Winkler	3db93cd789	Allow using datetime types in filters	2022-01-04 14:46:18 -06:00
Seth Michael Larson	ffe7c792dc	Update Notebook examples for 8.0	2021-12-15 16:01:32 -06:00
Seth Michael Larson	cd0897f5d7	Add a warning when connecting to incompatible Elasticsearch versions	2021-12-15 14:08:20 -06:00
Seth Michael Larson	109387184a	Support the v8.0 Elasticsearch client	2021-12-09 15:01:26 -06:00
Benjamin Trent	a3b0907c5b	[ML] Add inference results tests for PyTorch transformer models	2021-11-10 06:50:10 -06:00
Seth Michael Larson	66e3e4eaad	Set 'script.max_compilations_rate: use-context'	2021-11-02 10:09:25 -04:00
P. Sai Vinay	bc201e22dd	Improve coverage for eland.dataframe	2021-09-28 15:11:57 -05:00
Jabin Kong	1aa193da9e	Add `iterrows()` and `itertuples()` to DataFrame Co-authored-by: Seth Michael Larson <seth.larson@elastic.co>	2021-08-20 08:34:52 -05:00
P. Sai Vinay	011bf29816	Simplify ES->pandas logic by removing Collectors	2021-08-16 12:22:02 -05:00
P. Sai Vinay	30876c8899	Switch to Point-in-Time with search_after instead of using scroll APIs Co-authored-by: Seth Michael Larson <seth.larson@elastic.co>	2021-08-07 16:05:33 -05:00
P. Sai Vinay	8f84a315be	Add test case for pseudohubererror for XGBoost	2021-08-06 15:59:48 -05:00
P. Sai Vinay	4c1af42c14	Add idxmax and idxmin methods to DataFrame	2021-07-28 07:55:26 -05:00
P. Sai Vinay	ac2efb5863	Optimize df.describe() to use aggregations instead of own query	2021-06-22 11:29:54 -05:00
P. Sai Vinay	5fe32a24df	Add quantile() to DataFrameGroupBy	2021-06-22 10:54:33 -05:00
P. Sai Vinay	7e8520a8ef	Remove deprecated code in XGBoost and test suite	2021-06-08 15:19:56 -05:00
P. Sai Vinay	e9c0b897f5	Add quantile() to DataFrame and Series	2021-06-08 13:02:44 -05:00
P. Sai Vinay	aa9d60e7e7	Add sort order to groupby dropna=False (#322) * Add sort order to groupby dropna=False * Fix rebase	2021-04-21 13:24:52 +00:00
Stephen Dodson	1040160451	Fix bugs with field mapping and lint issue (#346) * Fix bugs with field mapping: 1. If no permission to call _mapping, return readable error; 2. If index is wildcard, fix issues with user warnings * Fixing lint issues * Removing trailing backslashes in doc * Remove pandas/matplotlib deprecation warning. This warning is due to a conflict between pandas/matplotlib that may be resolved in a later version. For now, ignore the warning so CI works.	2021-03-30 14:49:54 +00:00
P. Sai Vinay	421d84fd20	Add mode() method to DataFrame and Series	2021-01-07 12:17:10 -06:00
P. Sai Vinay	27717eead1	Remove deprecated options and aliases	2021-01-04 13:20:45 -06:00
Seth Michael Larson	a552504f9b	Add support for Pandas 1.2.0	2020-12-30 14:20:36 -06:00
P. Sai Vinay	473db4576b	Move tests directory outside of eland namespace	2020-11-16 11:30:41 -06:00

30 Commits
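
The message for fa30246937 contrasts running a softmax over a leaf with simply normalizing it. The sketch below illustrates that difference only, using plain Python. The leaf values and the helper names `normalize` and `softmax` are hypothetical, and the commit message does not say which leaf values are actually passed to the softmax, so treating them as raw per-class counts is an assumption made here for illustration.

```python
import math


def normalize(values):
    """Divide each value by the total, so the results sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def softmax(values):
    """Exponentiate (shifted by the max for numerical stability), then normalize."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical multi-valued leaf: one value per class (assumed, not from the source).
leaf = [3.0, 1.0]

normalized = normalize(leaf)   # plain normalization of the leaf values
softmaxed = softmax(leaf)      # softmax over the same leaf values

print("normalized:", normalized)
print("softmax:   ", softmaxed)
```

Under these assumptions both transforms rank the classes in the same order, but the numeric "probabilities" differ, which matches the caveat in the commit message.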