[DOCS] Adds instructions on model install in air-gapped env (#542)

Co-authored-by: David Kyle <david.kyle@elastic.co>
2025-07-11 00:02:14 +08:00 · 2023-05-24 12:53:04 +02:00 · 2023-05-24 12:53:04 +02:00 · e0c08e42a0
commit e0c08e42a0
parent 1e6f48f8f4
1 changed files with 67 additions and 31 deletions
--- a/docs/guide/machine-learning.asciidoc
+++ b/docs/guide/machine-learning.asciidoc
@@ -39,12 +39,12 @@ model in {es}.
 === Natural language processing (NLP) with PyTorch
-IMPORTANT: You need to use PyTorch `1.11.0` or earlier to import an NLP model. 
+IMPORTANT: You need to use PyTorch `1.13` or earlier to import an NLP model. 
-Run `pip install torch==1.11` to install the appropriate version of PyTorch.
+Run `pip install torch==1.13` to install the appropriate version of PyTorch.
-For NLP tasks, Eland enables you to import PyTorch trained BERT models into {es}. 
+For NLP tasks, Eland enables you to import PyTorch models into {es}. Use the 
-Models can be either plain PyTorch models, or supported 
+`eland_import_hub_model` script to download and install supported 
-https://huggingface.co/transformers[transformers] models from the
+https://huggingface.co/transformers[transformer models] from the
 https://huggingface.co/models[Hugging Face model hub]. For example:
 [source,bash]
@@ -61,32 +61,6 @@ $ eland_import_hub_model <authentication> \ <1>
 <4> Specify the type of NLP task. Supported values are `fill_mask`, `ner`,
 `question_answering`, `text_classification`, `text_embedding`, and `zero_shot_classification`.
 [source,python]
 ------------------------
 >>> import elasticsearch
 >>> from pathlib import Path
 >>> from eland.ml.pytorch import PyTorchModel
 >>> from eland.ml.pytorch.transformers import TransformerModel
 # Load a Hugging Face transformers model directly from the model hub
 >>> tm = TransformerModel("elastic/distilbert-base-cased-finetuned-conll03-english", "ner")
 Downloading: 100%|██████████| 257/257 [00:00<00:00, 108kB/s]
 Downloading: 100%|██████████| 954/954 [00:00<00:00, 372kB/s]
 Downloading: 100%|██████████| 208k/208k [00:00<00:00, 668kB/s] 
 Downloading: 100%|██████████| 112/112 [00:00<00:00, 43.9kB/s]
 Downloading: 100%|██████████| 249M/249M [00:23<00:00, 11.2MB/s]
 # Export the model in a TorchScript representation which Elasticsearch uses
 >>> tmp_path = "models"
 >>> Path(tmp_path).mkdir(parents=True, exist_ok=True)
 >>> model_path, config, vocab_path = tm.save(tmp_path)
 # Import model into Elasticsearch
 >>> es = elasticsearch.Elasticsearch("http://elastic:mlqa_admin@localhost:9200", timeout=300)  # 5 minute timeout
 >>> ptm = PyTorchModel(es, tm.elasticsearch_model_id())
 >>> ptm.import_model(model_path=model_path, config_path=None, vocab_path=vocab_path, config=config)
 100%|██████████| 63/63 [00:12<00:00,  5.02it/s]
 ------------------------
 [discrete]
 [[ml-nlp-pytorch-docker]]
@@ -118,6 +92,68 @@ docker run -it --rm elastic/eland \
 Replace the `$ELASTICSEARCH_URL` with the URL for your Elasticsearch cluster. For authentication purposes, include an administrator username and password in the URL in the following format: `https://username:password@host:port`.
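A username or password that contains characters such as `@`, `:`, or `/` would break the `https://username:password@host:port` format unless those characters are percent-encoded. The following is a minimal sketch of building such a URL with the Python standard library. The helper name `build_es_url` is hypothetical, and the source does not confirm how the script handles percent-encoded credentials, so treat this as an illustration of the URL format rather than documented Eland behavior.

```python
from urllib.parse import quote


def build_es_url(username, password, host, port):
    """Build an https URL with percent-encoded credentials (hypothetical helper)."""
    # safe="" makes quote() encode every reserved character, including "/".
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"https://{user}:{secret}@{host}:{port}"


if __name__ == "__main__":
    print(build_es_url("elastic", "p@ss:w/rd", "localhost", 9200))
```

Passing credentials through separate options such as `--es-username` and `--es-password`, which appear elsewhere in this document, avoids the encoding question altogether.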
 [discrete]
 [[ml-nlp-pytorch-air-gapped]]
 ==== Install models in an air-gapped environment 
 You can install models in a restricted or closed network by pointing the 
 `eland_import_hub_model` script to local files. 
 For an offline install of a Hugging Face model, first clone the model 
 locally. Git and https://git-lfs.com/[Git Large File Storage] must be 
 installed on your system.
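If Git Large File Storage is not active when a repository is cloned, large files such as model weights are left as small text pointer files, which begin with the line `version https://git-lfs.github.com/spec/v1`. The following is a minimal sketch of a check for leftover pointer files in a cloned model directory. The function name `find_lfs_pointers` is hypothetical and is not part of Eland.

```python
from pathlib import Path

# First line of every Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return files under model_dir that are still Git LFS pointer stubs."""
    root = Path(model_dir)
    pointers = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything inside Git's own metadata folder.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        with path.open("rb") as handle:
            if handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX:
                pointers.append(path)
    return pointers
```

An empty result for a cloned directory such as `bert-base-NER` suggests that the large files were downloaded. A non-empty result suggests that Git LFS was not active during the clone, in which case running `git lfs pull` inside the directory should fetch the real files.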
 1. Select a model you want to use from Hugging Face. Refer to the 
 {ml-docs}/ml-nlp-model-ref.html[compatible third-party model] list for more 
 information on the supported architectures. 
 2. Clone the selected model from Hugging Face by using the model URL. For 
 example:
 +
 --
 [source,bash]
 ----
 git clone https://huggingface.co/dslim/bert-base-NER
 ----
 This command creates a local copy of the model in the directory 
 `bert-base-NER`.
 --
 3. Use the `eland_import_hub_model` script with `--hub-model-id` set to the 
 directory of the cloned model to install it:
 +
 --
 [source,bash]
 ----
 eland_import_hub_model \
      --url 'XXXX' \
      --hub-model-id /PATH/TO/MODEL \
      --task-type ner \
      --es-username elastic --es-password XXX \
      --es-model-id bert-base-ner
 ----
 If you use the Docker image to run `eland_import_hub_model`, you must bind mount 
 the model directory so that the container can read the files:
 [source,bash]
 ----
 docker run --mount type=bind,source=/PATH/TO/MODELS,destination=/models,readonly -it --rm elastic/eland \
    eland_import_hub_model \
      --url 'XXXX' \
      --hub-model-id /models/bert-base-NER \
      --task-type ner \
      --es-username elastic --es-password XXX \
      --es-model-id bert-base-ner
 ----
 Once the model is uploaded to {es}, it has the ID specified by 
 `--es-model-id`. If `--es-model-id` is not set, the model ID is derived from 
 `--hub-model-id`: spaces and path delimiters are converted to double 
 underscores (`__`).
 --
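The derivation of the default model ID can be illustrated with a short sketch. It implements only the rule stated above (spaces and path delimiters become `__`) and assumes `/` is the path delimiter. The function name `derive_model_id` is hypothetical, and Eland may apply further normalization that is not shown here.

```python
def derive_model_id(hub_model_id):
    """Approximate the default model ID: spaces and "/" become "__"."""
    # Assumption: "/" is the only path delimiter that needs handling.
    for delimiter in (" ", "/"):
        hub_model_id = hub_model_id.replace(delimiter, "__")
    return hub_model_id


if __name__ == "__main__":
    print(derive_model_id("dslim/bert-base-NER"))
```

Because a local path such as `/PATH/TO/MODEL` would produce an ID with several double underscores, setting `--es-model-id` explicitly, as in the examples above, gives a more readable result.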
 [discrete]
 [[ml-nlp-pytorch-auth]]
 ==== Authentication methods