Embeddings Model Download¶

We also need to download an embeddings model for vector search. An embeddings model converts text into a list of numbers (a vector) so that similar passages end up close together in that number space. This is the core of RAG (Retrieval-Augmented Generation): you embed your documents once, then embed each question and find the closest matching passages.
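To make "close together in that number space" concrete, here is a minimal sketch of the retrieval step using cosine similarity. The three-dimensional vectors below are made up for illustration (an actual embeddings model produces hundreds of dimensions), and the helper names `cosine_similarity` and `top_k` are hypothetical, not part of any library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up toy "embeddings"; a real model would compute these from the text.
passages = {
    "Cats are small domesticated felines.": [0.9, 0.1, 0.0],
    "The stock market fell sharply today.": [0.0, 0.2, 0.9],
    "Kittens need frequent feeding.": [0.8, 0.3, 0.1],
}

# Pretend this is the embedding of the question "How do I care for a cat?"
question_vector = [0.85, 0.2, 0.05]

def top_k(query, corpus, k=2):
    """Return the k passages whose vectors are most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), text) for text, vec in corpus.items()]
    scored.sort(reverse=True)
    return [text for _, text in scored[:k]]

best = top_k(question_vector, passages)
print(best)
```

In a RAG pipeline the passage vectors are computed once and stored, so only the question has to be embedded at query time.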

| Model | Repo ID | Use via | Size | Notes |
|---|---|---|---|---|
| all-MiniLM-L6-v2 | `sentence-transformers/all-MiniLM-L6-v2` | `sentence-transformers` | ~90 MB | Fast, battle-tested, widely used in RAG tutorials |
| Gemma Embedding | google/embedding-gemma collection | `sentence-transformers` | ~300 MB | Google’s dedicated embedding models, newer and higher quality |

`all-MiniLM-L6-v2` is downloaded automatically by the `sentence-transformers` library the first time you use it, so no manual `hf_hub_download` call is needed. The Gemma Embedding models are also loaded through `sentence-transformers`, using their Hugging Face repo ID directly.

Download Embedding Models to Shared Directory¶

Embedding models are full model repos (not single GGUF files), so we use `snapshot_download` instead of `hf_hub_download`. This downloads all the files the model needs into the shared directory, so students do not have to wait for a download when they first run the RAG notebook.

Check the local filesystem path where the files will be downloaded¶

Approach 1 - If a Shared Hub is being used¶

# Cloudbank workshop Hub specific path
!ls /home/jovyan/shared

Approach 2 - If a local machine is being used¶

# This is my local path to a directory called shared-rw
!ls /home/jovyan/shared/

# or the full path (this is on my laptop)
!ls /Users/ericvandusen/SmallLM/Models/

Set the path where the models will download¶

# Path for Shared Hub - change this to match your JupyterHub's shared directory
# Examples: /home/jovyan/shared, /home/jovyan/shared_readwrite, /home/jovyan/_shared/course-name
shared_model_path = "/home/jovyan/shared"

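# Path for a local machine - uncomment and edit ONLY if you are not on a shared Hub.
# Left active, this line would override the Hub path set above.
# shared_model_path = "/Users/ericvandusen/SmallLM/Models"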
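Before starting a large download, it can help to confirm that the chosen directory exists and is writable. The following is a minimal sketch using only the standard library; `check_shared_dir` is a hypothetical helper name, and the demonstration runs against a temporary directory so it works on any machine.

```python
import os
import tempfile

def check_shared_dir(path):
    """Return the absolute path if it is an existing, writable directory; raise otherwise."""
    path = os.path.abspath(os.path.expanduser(path))
    if not os.path.isdir(path):
        raise FileNotFoundError(f"Shared directory not found: {path}")
    if not os.access(path, os.W_OK):
        raise PermissionError(f"Shared directory is not writable: {path}")
    return path

# Demonstration against a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    print("OK:", check_shared_dir(demo_dir))

# In the notebook you would call it on the path set above, for example:
# shared_model_path = check_shared_dir(shared_model_path)
```

Failing early here avoids a partial download into a directory that students cannot read from later.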

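# Optional: set a Hugging Face access token. This is needed for gated models,
# where you must first accept the license on the model page.
# import os
# os.environ["HF_TOKEN"] = "hf_your_token_here"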

from huggingface_hub import snapshot_download

# Download all-MiniLM-L6-v2 to the shared directory.
# With local_dir set, recent huggingface_hub releases write real files into the
# target folder; the older local_dir_use_symlinks flag is deprecated, so it is omitted.
minilm_path = snapshot_download(
    repo_id="sentence-transformers/all-MiniLM-L6-v2",
    local_dir=shared_model_path + "/all-MiniLM-L6-v2",
)
print("all-MiniLM-L6-v2 downloaded to:", minilm_path)

# Download Gemma Embedding to the shared directory.
# The Hugging Face repo ID is google/embeddinggemma-300m. It is a gated model:
# accept the license on its model page and set HF_TOKEN before running this cell.
gemma_embed_path = snapshot_download(
    repo_id="google/embeddinggemma-300m",
    local_dir=shared_model_path + "/embeddinggemma-300m",
)
print("embeddinggemma-300m downloaded to:", gemma_embed_path)
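Once the files are in the shared directory, the RAG notebook can load a model from that local folder instead of downloading it again. The sketch below is one possible way to do this; `find_local_model` and `load_embedder` are hypothetical helper names, and the check for `config.json` is an assumption about what a completed download contains. `SentenceTransformer` accepts either a local directory or a Hub repo ID, so the repo ID serves as a fallback.

```python
import os

def find_local_model(shared_model_path, model_name):
    """Return the model's local directory if it looks downloaded, else None."""
    model_dir = os.path.join(shared_model_path, model_name)
    # Assumption: a finished snapshot contains a config.json at its top level.
    if os.path.isfile(os.path.join(model_dir, "config.json")):
        return model_dir
    return None

def load_embedder(shared_model_path, model_name, fallback_repo_id):
    """Load from the shared copy if present, otherwise from the Hugging Face Hub."""
    # Imported lazily so the path helper above works without the library installed.
    from sentence_transformers import SentenceTransformer
    local_dir = find_local_model(shared_model_path, model_name)
    return SentenceTransformer(local_dir or fallback_repo_id)

# Example usage in the RAG notebook (paths are the ones used in this section):
# embedder = load_embedder(
#     "/home/jovyan/shared",
#     "all-MiniLM-L6-v2",
#     "sentence-transformers/all-MiniLM-L6-v2",
# )
# vectors = embedder.encode(["What is retrieval-augmented generation?"])
# print(vectors.shape)
```

Falling back to the repo ID keeps the notebook usable on a machine where the shared directory has not been populated, at the cost of a one-time download.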