mirror of
https://github.com/elastic/eland.git
synced 2025-07-11 00:02:14 +08:00
[DOCS] Adds instructions on model install in air-gapped env (#542)
Co-authored-by: David Kyle <david.kyle@elastic.co>
This commit is contained in:
parent 1e6f48f8f4
commit e0c08e42a0
@@ -39,12 +39,12 @@ model in {es}.
 === Natural language processing (NLP) with PyTorch
 
-IMPORTANT: You need to use PyTorch `1.11.0` or earlier to import an NLP model.
-Run `pip install torch==1.11` to install the aproppriate version of PyTorch.
+IMPORTANT: You need to use PyTorch `1.13` or earlier to import an NLP model.
+Run `pip install torch==1.13` to install the appropriate version of PyTorch.
 
-For NLP tasks, Eland enables you to import PyTorch trained BERT models into {es}.
-Models can be either plain PyTorch models, or supported
-https://huggingface.co/transformers[transformers] models from the
+For NLP tasks, Eland enables you to import PyTorch models into {es}. Use the
+`eland_import_hub_model` script to download and install supported
+https://huggingface.co/transformers[transformer models] from the
 https://huggingface.co/models[Hugging Face model hub]. For example:
 
 [source,bash]
@@ -61,32 +61,6 @@ $ eland_import_hub_model <authentication> \ <1>
 <4> Specify the type of NLP task. Supported values are `fill_mask`, `ner`,
 `question_answering`, `text_classification`, `text_embedding`, and `zero_shot_classification`.
 
-[source,python]
-------------------------
->>> import elasticsearch
->>> from pathlib import Path
->>> from eland.ml.pytorch import PyTorchModel
->>> from eland.ml.pytorch.transformers import TransformerModel
-
-# Load a Hugging Face transformers model directly from the model hub
->>> tm = TransformerModel("elastic/distilbert-base-cased-finetuned-conll03-english", "ner")
-Downloading: 100%|██████████| 257/257 [00:00<00:00, 108kB/s]
-Downloading: 100%|██████████| 954/954 [00:00<00:00, 372kB/s]
-Downloading: 100%|██████████| 208k/208k [00:00<00:00, 668kB/s]
-Downloading: 100%|██████████| 112/112 [00:00<00:00, 43.9kB/s]
-Downloading: 100%|██████████| 249M/249M [00:23<00:00, 11.2MB/s]
-
-# Export the model in a TorchScript representation which Elasticsearch uses
->>> tmp_path = "models"
->>> Path(tmp_path).mkdir(parents=True, exist_ok=True)
->>> model_path, config, vocab_path = tm.save(tmp_path)
-
-# Import model into Elasticsearch
->>> es = elasticsearch.Elasticsearch("http://elastic:mlqa_admin@localhost:9200", timeout=300) # 5 minute timeout
->>> ptm = PyTorchModel(es, tm.elasticsearch_model_id())
->>> ptm.import_model(model_path=model_path, config_path=None, vocab_path=vocab_path, config=config)
-100%|██████████| 63/63 [00:12<00:00, 5.02it/s]
-------------------------
-
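The supported `--task-type` values listed in callout <4> above can be collected into a small validation helper. This is a sketch against the documented list only, not eland's own code; the function name is hypothetical:

```python
# Supported NLP task types, as documented for eland_import_hub_model.
SUPPORTED_TASK_TYPES = {
    "fill_mask",
    "ner",
    "question_answering",
    "text_classification",
    "text_embedding",
    "zero_shot_classification",
}

def check_task_type(task_type: str) -> str:
    """Reject a --task-type value that is not in the documented set."""
    if task_type not in SUPPORTED_TASK_TYPES:
        raise ValueError(f"unsupported task type: {task_type!r}")
    return task_type
```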
 [discrete]
 [[ml-nlp-pytorch-docker]]
@@ -118,6 +92,68 @@ docker run -it --rm elastic/eland \
 
 Replace the `$ELASTICSEARCH_URL` with the URL for your Elasticsearch cluster. For authentication purposes, include an administrator username and password in the URL in the following format: `https://username:password@host:port`.
 
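The credentials-in-URL format described above can be assembled with the standard library, percent-encoding the username and password so that special characters survive. A minimal sketch, with a hypothetical function name:

```python
from urllib.parse import quote

def url_with_basic_auth(scheme: str, host: str, port: int,
                        username: str, password: str) -> str:
    # Build scheme://username:password@host:port, percent-encoding
    # reserved characters in the credentials (e.g. "@" and "/").
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"{scheme}://{user}:{pwd}@{host}:{port}"

# For example, a password containing "@" and "/" is encoded safely:
url_with_basic_auth("https", "localhost", 9200, "elastic", "p@ss/word")
# → "https://elastic:p%40ss%2Fword@localhost:9200"
```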
+[discrete]
+[[ml-nlp-pytorch-air-gapped]]
+==== Install models in an air-gapped environment
+
+You can install models in a restricted or closed network by pointing the
+`eland_import_hub_model` script to local files.
+
+For an offline install of a Hugging Face model, the model first needs to be
+cloned locally. Git and https://git-lfs.com/[Git Large File Storage] must be
+installed on your system.
+
+1. Select a model you want to use from Hugging Face. Refer to the
+{ml-docs}/ml-nlp-model-ref.html[compatible third party model] list for more
+information on the supported architectures.
+
+2. Clone the selected model from Hugging Face by using the model URL. For
+example:
++
+--
+[source,bash]
+----
+git clone https://huggingface.co/dslim/bert-base-NER
+----
+This command results in a local copy of the model in the directory
+`bert-base-NER`.
+--
+
+3. Use the `eland_import_hub_model` script with the `--hub-model-id` set to the
+directory of the cloned model to install it:
++
+--
+[source,bash]
+----
+eland_import_hub_model \
+--url 'XXXX' \
+--hub-model-id /PATH/TO/MODEL \
+--task-type ner \
+--es-username elastic --es-password XXX \
+--es-model-id bert-base-ner
+----
+
+If you use the Docker image to run `eland_import_hub_model`, you must bind mount
+the model directory so that the container can read the files:
+
+[source,bash]
+----
+docker run --mount type=bind,source=/PATH/TO/MODELS,destination=/models,readonly -it --rm elastic/eland \
+eland_import_hub_model \
+--url 'XXXX' \
+--hub-model-id /models/bert-base-NER \
+--task-type ner \
+--es-username elastic --es-password XXX \
+--es-model-id bert-base-ner
+----
+Once uploaded to {es}, the model has the ID specified by
+`--es-model-id`. If that option is not set, the model ID is derived from
+`--hub-model-id`; spaces and path delimiters are converted to double
+underscores (`__`).
+
+--
+
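The model-ID derivation rule added above (spaces and path delimiters in `--hub-model-id` become double underscores when `--es-model-id` is omitted) can be sketched as follows. This is a hypothetical helper illustrating the documented rule, not eland's actual implementation:

```python
def derive_model_id(hub_model_id: str) -> str:
    # Replace spaces and path delimiters with double underscores, as
    # described in the docs for models imported without --es-model-id.
    out = hub_model_id
    for ch in (" ", "/", "\\"):
        out = out.replace(ch, "__")
    return out

derive_model_id("dslim/bert-base-NER")
# → "dslim__bert-base-NER"
```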
 [discrete]
 [[ml-nlp-pytorch-auth]]
 ==== Authentication methods