Last active
June 5, 2025 00:34
Revisions
dtaivpp revised this gist
May 16, 2024. 1 changed file with 4 additions and 0 deletions.

````
@@ -1,3 +1,7 @@
# Local Semantic Search in OpenSearch!

If you'd like to watch this demo, it's available on [YouTube](https://youtu.be/lpQiJGpeeWU), or if you prefer reading, there is a walkthrough on my [blog](https://tippybits.com/should-you-be-doing-vector-search/).

## Cluster settings for Amazon OpenSearch and Locally running
```
PUT /_cluster/settings
````
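Across the revisions of this gist, the ML Commons cluster settings are written in two shapes: nested objects under `plugins` and `ml_commons` in the first version, and flat dotted keys such as `plugins.ml_commons.only_run_on_ml_node` in the next one. As a rough illustration of how the two spellings line up, here is a small Python sketch that flattens a nested settings body into dotted keys. The helper name `flatten_settings` is made up for this example and is not part of any OpenSearch client.

```python
from typing import Any, Dict


def flatten_settings(nested: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested settings into dotted keys (hypothetical helper)."""
    flat: Dict[str, Any] = {}
    for key, value in nested.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the dotted prefix along.
            flat.update(flatten_settings(value, dotted))
        else:
            flat[dotted] = value
    return flat


# Two of the settings from the first version of the gist, in nested form.
nested_body = {
    "plugins": {
        "ml_commons": {
            "only_run_on_ml_node": "false",
            "native_memory_threshold": "99",
        }
    }
}

flat_body = flatten_settings(nested_body)
print(flat_body)
```

The flat form makes it easier to compare one revision's settings against another key by key.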
dtaivpp revised this gist
May 16, 2024. 1 changed file with 19 additions and 11 deletions.

````
@@ -1,19 +1,18 @@
## Cluster settings for Amazon OpenSearch and Locally running
```
PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.only_run_on_ml_node": false,
    "plugins.ml_commons.model_access_control_enabled": true,
    "plugins.ml_commons.native_memory_threshold": "99",
    "plugins.ml_commons.model_auto_redeploy.enable": true,
    "plugins.ml_commons.model_auto_redeploy.lifetime_retry_times": 3
  }
}
```

## Register your model group
```
POST /_plugins/_ml/model_groups/_register
{
@@ -24,6 +23,7 @@ POST /_plugins/_ml/model_groups/_register
#MODEL_GROUP:
```

## Register your model
```
POST /_plugins/_ml/models/_register
{
@@ -35,19 +35,23 @@ POST /_plugins/_ml/models/_register
# TASK_ID:
```

### Check if model is downloaded
```
GET /_plugins/_ml/tasks/<TASK_ID>
# MODEL_ID:
```

## Deploy your model
```
POST /_plugins/_ml/models/<MODEL_ID>/_deploy
```

### Check that model was deployed
```
GET /_plugins/_ml/tasks/<TASK_ID>
```

### Testing the model
```
POST /_plugins/_ml/_predict/text_embedding/<MODEL_ID>
{
@@ -57,6 +61,7 @@ POST /_plugins/_ml/_predict/text_embedding/<MODEL_ID>
}
```

## Create an ingestion pipeline for embeddings
```
PUT _ingest/pipeline/embedding-ingest-pipeline
{
@@ -74,7 +79,7 @@ PUT _ingest/pipeline/embedding-ingest-pipeline
}
```

## Creating a hybrid search pipeline
```
## Put the search pipeline in place
PUT _search/pipeline/hybrid-search-pipeline
@@ -100,6 +105,7 @@ PUT _search/pipeline/hybrid-search-pipeline
}
```

## Create the index
```
PUT /documents
{
@@ -127,6 +133,7 @@ PUT /documents
}
```

## Upload documents
```
POST /documents/_bulk
{ "index": {"_id": "1234" } }
@@ -137,6 +144,7 @@ POST /documents/_bulk
{ "content": "Some may say that supercar drivers dont really mind risk"}
```

## Search the documents with lexical and vector search
```
GET /documents/_search
{
````
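The `_bulk` upload in this gist is newline-delimited JSON: each document is an action line (`{"index": {"_id": ...}}`) followed by a source line, and the body ends with a newline. A minimal Python sketch of assembling such a body is shown here. `build_bulk_body` is a hypothetical helper written for this illustration; it only builds the string and does not send anything to a cluster.

```python
import json
from typing import Dict, List


def build_bulk_body(docs: Dict[str, str], field: str = "content") -> str:
    """Build an NDJSON body for POST /documents/_bulk (hypothetical helper)."""
    lines: List[str] = []
    for doc_id, text in docs.items():
        # Action line: which operation to run and on which document ID.
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps({field: text}))
    # The bulk body must be terminated by a final newline.
    return "\n".join(lines) + "\n"


docs = {
    "1234": "There once was a racecar driver that was super fast",
    "1235": "The golf driver used by tiger woods is the TaylorMade Qi10 LS prototype",
    "1236": "Some may say that supercar drivers dont really mind risk",
}

bulk_body = build_bulk_body(docs)
print(bulk_body)
```

Because the index sets `default_pipeline` to `embedding-ingest-pipeline`, the documents only need the `content` field; the embedding field is filled in at ingest time.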
dtaivpp created this gist
May 15, 2024.

````
@@ -0,0 +1,170 @@
```
PUT _cluster/settings
{
  "persistent": {
    "plugins": {
      "ml_commons": {
        "allow_registering_model_via_url": "true",
        "only_run_on_ml_node": "false",
        "model_access_control_enabled": "true",
        "native_memory_threshold": "99"
      }
    }
  }
}
```

```
POST /_plugins/_ml/model_groups/_register
{
  "name": "Model_Group",
  "description": "Public ML Model Group",
  "access_mode": "public"
}
#MODEL_GROUP:
```

```
POST /_plugins/_ml/models/_register
{
  "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
  "version": "1.0.2",
  "model_group_id": "<MODEL_GROUP>",
  "model_format": "TORCH_SCRIPT"
}
# TASK_ID:
```

```
GET /_plugins/_ml/tasks/<TASK_ID>
# MODEL_ID:
```

```
POST /_plugins/_ml/models/<MODEL_ID>/_deploy
```

```
GET /_plugins/_ml/tasks/<TASK_ID>
```

```
POST /_plugins/_ml/_predict/text_embedding/<MODEL_ID>
{
  "text_docs": [ "This should get embedded" ],
  "return_number": true,
  "target_response": [ "sentence_embedding" ]
}
```

```
PUT _ingest/pipeline/embedding-ingest-pipeline
{
  "description": "Neural Search Pipeline",
  "processors": [
    {
      "text_embedding": {
        "model_id": "<MODEL_ID>",
        "field_map": {
          "content": "content_embedding"
        }
      }
    }
  ]
}
```

```
## Put the search pipeline in place
PUT _search/pipeline/hybrid-search-pipeline
{
  "phase_results_processors": [
    {
      "normalization-processor": {
        "normalization": {
          "technique": "min_max"
        },
        "combination": {
          "technique": "arithmetic_mean",
          "parameters": {
            "weights": [ 0.3, 0.7 ]
          }
        }
      }
    }
  ]
}
```

```
PUT /documents
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "embedding-ingest-pipeline",
    "index.search.default_pipeline": "hybrid-search-pipeline"
  },
  "mappings": {
    "properties": {
      "content_embedding": {
        "type": "knn_vector",
        "dimension": 768,
        "method": {
          "name": "hnsw",
          "space_type": "innerproduct",
          "engine": "nmslib"
        }
      },
      "content": {
        "type": "text"
      }
    }
  }
}
```

```
POST /documents/_bulk
{ "index": {"_id": "1234" } }
{ "content": "There once was a racecar driver that was super fast"}
{ "index": {"_id": "1235" } }
{ "content": "The golf driver used by tiger woods is the TaylorMade Qi10 LS prototype"}
{ "index": {"_id": "1236" } }
{ "content": "Some may say that supercar drivers dont really mind risk"}
```

```
GET /documents/_search
{
  "_source": {
    "exclude": [ "content_embedding" ]
  },
  "query": {
    "hybrid": {
      "queries": [
        {
          "match": {
            "content": {
              "query": "sports automobile"
            }
          }
        },
        {
          "neural": {
            "content_embedding": {
              "query_text": "sports cars",
              "model_id": "<MODEL_ID>",
              "k": 5
            }
          }
        }
      ]
    }
  }
}
```
````
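The search pipeline above asks for `min_max` normalization followed by an `arithmetic_mean` combination with weights `[0.3, 0.7]` (lexical first, neural second, matching the order of the sub-queries in the `hybrid` query). The Python sketch below illustrates the general idea only; it is not OpenSearch's implementation. The raw scores are invented for the example, and the handling of edge cases (all scores equal, a document missing from one sub-query) is an assumption of this sketch.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one sub-query's scores into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Assumption of this sketch: identical scores all map to 1.0.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def combine_weighted_mean(
    per_query: List[Dict[str, float]], weights: List[float]
) -> Dict[str, float]:
    """Weighted arithmetic mean of normalized scores per document."""
    if len(per_query) != len(weights):
        raise ValueError("need exactly one weight per sub-query")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    doc_ids = set().union(*per_query)
    combined: Dict[str, float] = {}
    for doc_id in doc_ids:
        # Assumption: a document missing from a sub-query contributes 0.0.
        weighted = sum(w * q.get(doc_id, 0.0) for q, w in zip(per_query, weights))
        combined[doc_id] = weighted / total
    return combined


# Invented raw scores for the three documents in this gist.
lexical_raw = {"1234": 2.0, "1236": 1.0}
neural_raw = {"1234": 0.9, "1235": 0.1, "1236": 0.5}

normalized = [min_max_normalize(lexical_raw), min_max_normalize(neural_raw)]
hybrid_scores = combine_weighted_mean(normalized, [0.3, 0.7])

for doc_id, score in sorted(hybrid_scores.items(), key=lambda kv: -kv[1]):
    print(doc_id, round(score, 3))
```

With a weight of 0.7 on the neural sub-query, a document that only matches semantically can still outrank one that matches only on keywords, which is the behavior the weights in the pipeline are tuning.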