334 Commits

Author SHA1 Message Date
P. Sai Vinay
f9d2defb1b
Add number_samples to sklearn MLModel 2021-10-07 08:14:54 -05:00
Josh Devins
014943d3b8
Add initial implementation of PyTorch ML models 2021-10-06 08:44:40 -05:00
P. Sai Vinay
995f2432b6
Add number_samples to LightGBM MLModel and leaf_count to leaf nodes
* Add number_samples to lightgbm ML Model

* Add leaf_count for leaf nodes
2021-09-30 08:13:44 -05:00
P. Sai Vinay
dabb327b8b
Refactor df.info() for better readability 2021-09-28 15:12:29 -05:00
P. Sai Vinay
f241ae971a
Add flynt and --cov-report=term-missing 2021-09-21 11:18:01 -05:00
Seth Michael Larson
7aabc88e4a
Rename 'master' branch to 'main' 2021-09-08 11:51:50 -05:00
Jabin Kong
77f9a455e9
Fix docstring formatting 2021-09-07 11:40:19 -05:00
P. Sai Vinay
315f94b201
Add excluded lines for coverage and improve coverage 2021-09-07 11:39:19 -05:00
Seth Michael Larson
a50c3657c4
Release v7.14.1b1 2021-08-30 13:42:55 -05:00
Jabin Kong
1aa193da9e
Add iterrows() and itertuples() to DataFrame
Co-authored-by: Seth Michael Larson <seth.larson@elastic.co>
2021-08-20 08:34:52 -05:00
Seth Michael Larson
e4f88a34a6
Yield list of hits from _search_yield_hits() instead of individual hits 2021-08-17 12:16:10 -05:00
P. Sai Vinay
011bf29816
Simplify ES->pandas logic by removing Collectors 2021-08-16 12:22:02 -05:00
Seth Michael Larson
76d83ea47f
Bump version to 7.14.0b1 2021-08-09 09:21:49 -05:00
Seth Michael Larson
15ba8d3e02
Fall back to using scroll searches for Elasticsearch <7.12
PIT+search_after became universally safe in Elasticsearch 7.12, which added an automatic sort tiebreaker field called `_shard_doc` for searches that use a PIT. We now need feature detection so that the previous scroll method is still used on Elasticsearch <7.12 clusters.
2021-08-08 12:19:41 -05:00
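The feature detection described in the commit above can be sketched as a version check. This is a minimal illustration, not the code from the commit: `parse_version`, `choose_search_strategy`, and the strategy labels are hypothetical names, and the only fact taken from the log is the 7.12 cutoff.

```python
from typing import Tuple

# Cutoff taken from the commit message: PIT + search_after is treated as
# safe from Elasticsearch 7.12 onward. Everything else here is hypothetical.
PIT_MIN_VERSION = (7, 12, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '7.11.2', '7.12' or '7.12.0-SNAPSHOT' into a 3-part int tuple."""
    core = version.split("-", 1)[0]  # drop suffixes such as '-SNAPSHOT'
    parts = [int(part) for part in core.split(".")]
    while len(parts) < 3:  # pad '7.12' to (7, 12, 0)
        parts.append(0)
    return tuple(parts)


def choose_search_strategy(es_version: str) -> str:
    """Return 'pit_search_after' for >=7.12, otherwise 'scroll'."""
    if parse_version(es_version) >= PIT_MIN_VERSION:
        return "pit_search_after"
    return "scroll"


if __name__ == "__main__":
    for v in ("7.11.2", "7.12.0", "7.14.0-SNAPSHOT"):
        print(v, choose_search_strategy(v))
```

In a real client the version string would come from the cluster's info response; here it is passed in directly so the sketch stays self-contained.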
P. Sai Vinay
30876c8899
Switch to Point-in-Time with search_after instead of using scroll APIs
Co-authored-by: Seth Michael Larson <seth.larson@elastic.co>
2021-08-07 16:05:33 -05:00
P. Sai Vinay
d3f8d7b8f6
Optimize FieldMappings.aggregate_field_name() method 2021-08-06 11:27:59 -05:00
P. Sai Vinay
823f01cc6c
Add type hints to 'eland.operations' and 'eland.ndframe' 2021-08-02 11:50:35 -05:00
P. Sai Vinay
4c1af42c14
Add idxmax and idxmin methods to DataFrame 2021-07-28 07:55:26 -05:00
Seth Michael Larson
c74fccbd74
Drop support for Python 3.6, pandas<1.2 2021-07-27 14:43:03 -05:00
P. Sai Vinay
193bcb73ef
Add support for Pandas v1.3 and LightGBM v3.x 2021-07-27 11:01:35 -05:00
Seth Michael Larson
1555ea9534
Fix typo in version number
Should be `7.13.0b1` instead of `7.13.1b1`
2021-06-22 12:03:46 -05:00
Seth Michael Larson
16178dfb5d
Release 7.13.0b1 2021-06-22 11:59:27 -05:00
P. Sai Vinay
ac2efb5863
Optimize df.describe() to use aggregations instead of own query 2021-06-22 11:29:54 -05:00
P. Sai Vinay
5fe32a24df
Add quantile() to DataFrameGroupBy 2021-06-22 10:54:33 -05:00
P. Sai Vinay
7e8520a8ef
Remove deprecated code in XGBoost and test suite 2021-06-08 15:19:56 -05:00
P. Sai Vinay
e9c0b897f5
Add quantile() to DataFrame and Series 2021-06-08 13:02:44 -05:00
P. Sai Vinay
aa9d60e7e7
Add sort order to groupby dropna=False (#322)
* Add sort order to groupby dropna=False

* Fix rebase
2021-04-21 13:24:52 +00:00
Stephen Dodson
1040160451
Fix bugs with field mapping and lint issue (#346)
* Fix bugs with field mapping:

1. If no permission to call _mapping, return readable error
2. If index is wildcard, fix issues with user warnings

* Fixing lint issues

* Removing trailing backslashes in doc

* Remove pandas/matplotlib deprecation warning

This warning is due to a conflict between
pandas/matplotlib that may be resolved in a later
version. For now, ignore the warning so CI works.
2021-03-30 14:49:54 +00:00
Seth Michael Larson
985afe74e0
Release 7.10.1b1 2021-01-12 12:36:23 -06:00
P. Sai Vinay
421d84fd20
Add mode() method to DataFrame and Series 2021-01-07 12:17:10 -06:00
P. Sai Vinay
27717eead1
Remove deprecated options and aliases 2021-01-04 13:20:45 -06:00
Seth Michael Larson
a552504f9b
Add support for Pandas 1.2.0 2020-12-30 14:20:36 -06:00
P. Sai Vinay
473db4576b
Move tests directory outside of eland namespace 2020-11-16 11:30:41 -06:00
P. Sai Vinay
56f6ba6c8b
Add Elasticsearch storage usage to df.info() 2020-11-16 10:07:28 -06:00
P. Sai Vinay
789f8959bc
Add support for pd.set_option("display.max_rows", None) 2020-11-06 12:23:09 -06:00
Seth Michael Larson
31760fe02c
Release 7.10.0b1 2020-10-29 13:43:34 -05:00
Seth Michael Larson
b936e98012
Allow dict in es_type_overrides, text fields by default get keyword sub-field 2020-10-29 13:16:42 -05:00
Seth Michael Larson
cb4cd083c3
Add support for es_match() to DataFrame and Series 2020-10-29 10:16:50 -05:00
Seth Michael Larson
95b8d75e37
Fix 'Series.__repr__()' when the series is empty 2020-10-27 17:08:37 -05:00
P. Sai Vinay
54468cb85b
Add pytest --nbval of notebook examples to CI 2020-10-27 15:15:04 -05:00
P. Sai Vinay
e17b4e03ea
Error when es_type_overrides receives unknown columns 2020-10-27 13:48:31 -05:00
Seth Michael Larson
ae70f03df3
Document DataFrame.groupby() methods 2020-10-27 10:10:57 -05:00
P. Sai Vinay
475e0f41ef
Implement DataFrameGroupBy.count() 2020-10-23 08:41:50 -05:00
Seth Michael Larson
bd7956ea72
Support typed 'elasticsearch-py' and add 'py.typed' 2020-10-20 16:26:58 -05:00
Seth Michael Larson
05a24cbe0b
Add isort, rename Nox session to 'format' 2020-10-15 17:11:29 -05:00
Seth Michael Larson
18fb4af731
Document DataFrame.groupby() and rename Field.index -> .column 2020-10-15 17:11:29 -05:00
P. Sai Vinay
abc5ca927b
Add support for DataFrame.groupby() with aggregations 2020-10-15 10:52:48 -05:00
Seth Michael Larson
adafeed667
Add es_dtypes property to DataFrame and Series 2020-10-13 12:14:09 -05:00
P. Sai Vinay
b7c6c26606
Change DataFrame.filter() to preserve the order of items 2020-10-13 10:58:09 -05:00
P. Sai Vinay
0dd247b693
Improve efficiency of 'pandas_to_eland()' using 'parallel_bulk()' 2020-10-08 10:17:22 -05:00