mirror of
https://github.com/elastic/eland.git
synced 2025-07-11 00:02:14 +08:00
Document how to install transitive binary dependencies, add repo Dockerfile
Co-authored-by: Seth Michael Larson <seth.larson@elastic.co>
This commit is contained in:
parent
19014f1227
commit
df51f8af07
14
Dockerfile
Normal file
14
Dockerfile
Normal file
@ -0,0 +1,14 @@
|
||||
FROM debian:11.1
|
||||
|
||||
RUN apt-get update && \
|
||||
apt-get install -y build-essential pkg-config cmake \
|
||||
python3-dev python3-pip python3-venv \
|
||||
libzip-dev libjpeg-dev && \
|
||||
apt-get clean
|
||||
|
||||
ADD . /eland
|
||||
WORKDIR /eland
|
||||
|
||||
RUN python3 -m pip install --no-cache-dir --disable-pip-version-check .[all]
|
||||
|
||||
CMD ["/bin/sh"]
|
53
README.md
53
README.md
@ -22,16 +22,16 @@
|
||||
|
||||
## About
|
||||
|
||||
Eland is a Python Elasticsearch client for exploring and
|
||||
analyzing data in Elasticsearch with a familiar Pandas-compatible API.
|
||||
Eland is a Python Elasticsearch client for exploring and analyzing data in Elasticsearch with a familiar
|
||||
Pandas-compatible API.
|
||||
|
||||
Where possible the package uses existing Python APIs and data structures to make it easy to switch between numpy,
|
||||
pandas, scikit-learn to their Elasticsearch powered equivalents. In general, the data resides in Elasticsearch and
|
||||
pandas, or scikit-learn to their Elasticsearch powered equivalents. In general, the data resides in Elasticsearch and
|
||||
not in memory, which allows Eland to access large datasets stored in Elasticsearch.
|
||||
|
||||
Eland also provides tools to upload trained machine learning models from your
|
||||
common libraries like [scikit-learn](https://scikit-learn.org), [XGBoost](https://xgboost.readthedocs.io),
|
||||
and [LightGBM](https://lightgbm.readthedocs.io) into Elasticsearch.
|
||||
Eland also provides tools to upload trained machine learning models from common libraries like
|
||||
[scikit-learn](https://scikit-learn.org), [XGBoost](https://xgboost.readthedocs.io), and
|
||||
[LightGBM](https://lightgbm.readthedocs.io) into Elasticsearch.
|
||||
|
||||
## Getting Started
|
||||
|
||||
@ -52,6 +52,47 @@ $ conda install -c conda-forge eland
|
||||
- Supports Python 3.7+ and Pandas 1.3
|
||||
- Supports Elasticsearch clusters that are 7.11+, recommended 7.14 or later for all features to work.
|
||||
|
||||
### Prerequisites
|
||||
|
||||
Users installing Eland on Debian-based distributions may need to install prerequisite packages for the transitive
|
||||
dependencies of Eland:
|
||||
|
||||
```bash
|
||||
$ sudo apt-get install -y \
|
||||
build-essential pkg-config cmake \
|
||||
python3-dev libzip-dev libjpeg-dev
|
||||
```
|
||||
|
||||
Note that other distributions such as CentOS, RedHat, Arch, etc. may require using a different package manager and
|
||||
specifying different package names.
|
||||
|
||||
### Docker
|
||||
|
||||
Users wishing to use Eland without installing it, in order to just run the available scripts, can build the Docker
|
||||
container:
|
||||
|
||||
```bash
|
||||
$ docker build -t elastic/eland .
|
||||
```
|
||||
|
||||
The container can now be used interactively:
|
||||
|
||||
```bash
|
||||
$ docker run -it --rm --network host elastic/eland
|
||||
```
|
||||
|
||||
Running installed scripts is also possible without an interactive shell, e.g.:
|
||||
|
||||
```bash
|
||||
$ docker run -it --rm --network host \
|
||||
elastic/eland \
|
||||
eland_import_hub_model \
|
||||
--url http://host.docker.internal:9200/ \
|
||||
--hub-model-id elastic/distilbert-base-cased-finetuned-conll03-english \
|
||||
--task-type ner \
|
||||
--start
|
||||
```
|
||||
|
||||
### Connecting to Elasticsearch
|
||||
|
||||
Eland uses the [Elasticsearch low level client](https://elasticsearch-py.readthedocs.io) to connect to Elasticsearch.
|
||||
|
24
setup.py
24
setup.py
@ -54,6 +54,18 @@ with open(path.join(here, "README.md"), "r", "utf-8") as f:
|
||||
last_html_index = i + 1
|
||||
long_description = "\n".join(lines[last_html_index:])
|
||||
|
||||
extras = {
|
||||
"xgboost": ["xgboost>=0.90,<2"],
|
||||
"scikit-learn": ["scikit-learn>=0.22.1,<1"],
|
||||
"lightgbm": ["lightgbm>=2,<4"],
|
||||
"pytorch": [
|
||||
"huggingface-hub>=0.0.17,<1",
|
||||
"sentence-transformers>=2.0.0,<3",
|
||||
"torch>=1.9.0,<2",
|
||||
"transformers[torch]>=4.11.0<5",
|
||||
],
|
||||
}
|
||||
extras["all"] = list({dep for deps in extras.values() for dep in deps})
|
||||
|
||||
setup(
|
||||
name=about["__title__"],
|
||||
@ -81,15 +93,5 @@ setup(
|
||||
package_data={"eland": ["py.typed"]},
|
||||
include_package_data=True,
|
||||
zip_safe=False,
|
||||
extras_require={
|
||||
"xgboost": ["xgboost>=0.90,<2"],
|
||||
"scikit-learn": ["scikit-learn>=0.22.1,<1"],
|
||||
"lightgbm": ["lightgbm>=2,<4"],
|
||||
"pytorch": [
|
||||
"huggingface-hub>=0.0.17,<1",
|
||||
"sentence-transformers>=2.0.0,<3",
|
||||
"torch>=1.9.0,<2",
|
||||
"transformers[torch]>=4.11.0<5",
|
||||
],
|
||||
},
|
||||
extras_require=extras,
|
||||
)
|
||||
|
Loading…
x
Reference in New Issue
Block a user