Skip to content

Instantly share code, notes, and snippets.

@tspannhw
Created October 20, 2024 14:27
Show Gist options
  • Select an option

  • Save tspannhw/2197f6a494ce1952c013b7b73624310c to your computer and use it in GitHub Desktop.

Select an option

Save tspannhw/2197f6a494ce1952c013b7b73624310c to your computer and use it in GitHub Desktop.
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"id": "fa299086-bada-411c-a7ae-48e64f6f77dd",
"metadata": {},
"source": [
"# Goal of this Notebook\n",
"\n",
"In this notebook we use langchain to build a simple RAG to Ollama and we ask the llama3 model for \n",
"data from the slides context fed from Milvus.\n",
"\n",
"### 🚕 Simple Retrieval-Augmented Generation (RAG) with LangChain:\n",
"\n",
"Build a simple Python [RAG](https://milvus.io/docs/integrate_with_langchain.md) application to use Milvus for \n",
"asking about Tim's slides via OLLAMA. W=\n",
"\n",
"### 🔍 Summary\n",
"By the end of this application, you’ll have a comprehensive understanding of using Milvus, data ingest object semi-structured and unstructured data, and using Open Source models to build a robust and efficient data retrieval system. \n",
"\n",
"\n",
"#### 🐍 AIM Stack - Easy Local Free Open Source RAG\n",
"\n",
"* Ollama\n",
"* Python\n",
"* Jupyter Notebooks\n",
"* Llama 3.2\n",
"* Milvus-Lite\n",
"* LangChain\n",
"\n",
"![milvuslogo](https://milvus.io/images/milvus_logo.svg)\n",
"\n",
"#### 🎃 Resources\n",
"\n",
"* https://zilliz.com/blog/a-beginners-guide-to-using-llama-3-with-ollama-milvus-langchain\n",
"* https://github.com/stephen37/ollama_local_rag/blob/main/rag_berlin_parliament.py\n",
"* https://python.langchain.com/docs/integrations/vectorstores/milvus/\n",
"* https://api.python.langchain.com/en/latest/vectorstores/langchain_milvus.vectorstores.milvus.Milvus.html\n",
"* https://developer.nvidia.com/blog/rag-101-demystifying-retrieval-augmented-generation-pipelines/\n",
"* https://zilliz.com/learn/build-rag-with-milvus-lite-llama3-and-llamaindex\n",
" \n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "691c0102-ec92-4cf4-a285-99c5ca752e21",
"metadata": {},
"outputs": [],
"source": [
"import warnings\n",
"warnings.filterwarnings('ignore')\n",
"warnings.filterwarnings(\"ignore\", category=DeprecationWarning)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "00d5d49f-7f16-42ca-a900-188bed335ef9",
"metadata": {},
"outputs": [],
"source": [
"!pip install -qU sentence-transformers langchain langchain_milvus langchain-huggingface ollama langchain-ollama pypdf langchainhub \"pymilvus[model]\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "95662303-32b3-4844-aaca-dcbad17d0f06",
"metadata": {},
"source": [
"## 🔥 Install Ollama\n",
"\n",
"![ollama](https://ollama.com/public/ollama.png)\n",
"\n",
"### 🛩️ Download for Mac, Linux, Windows\n",
"\n",
"https://ollama.com/download\n",
"\n",
"### 👽 Install Open Source Llama 3.2 model from Meta\n",
"\n",
"https://ollama.com/library/llama3.2\n",
"\n",
"```\n",
"ollama run llama3.2\n",
"\n",
">>> /bye\n",
"```\n",
"\n",
"Running the model will download many gigabytes of model and weights for you. When it is complete it will put you interactive chat mode. \n",
"You can test it, or type */bye* to exit.\n",
"\n",
"\n",
"### 🙅 List all of your models\n",
"\n",
"````\n",
"ollama list\n",
"\n",
"NAME ID SIZE MODIFIED\n",
"llava:7b 8dd30f6b0cb1 4.7 GB 40 hours ago\n",
"mistral-nemo:latest 994f3b8b7801 7.1 GB 9 days ago\n",
"gemma2:2b 8ccf136fdd52 1.6 GB 10 days ago\n",
"nomic-embed-text:latest 0a109f422b47 274 MB 10 days ago\n",
"llama3.2:3b-instruct-fp16 195a8c01d91e 6.4 GB 2 weeks ago\n",
"llama3.2:latest a80c4f17acd5 2.0 GB 2 weeks ago\n",
"reader-lm:latest 33da2b9e0afe 934 MB 3 weeks ago\n",
"````\n",
"\n",
"### 📊 Let's use it\n",
"\n",
"### 🦙 First, let's import all the libraries we will need\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "e196afc3-a057-497b-87fc-820453c7b5a6",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from pymilvus import MilvusClient\n",
"from langchain_community.document_loaders import PyPDFLoader\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain.chains import RetrievalQA\n",
"from langchain_milvus import Milvus\n",
"from langchain_community.embeddings import HuggingFaceEmbeddings\n",
"from langchain.callbacks.manager import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
"from langchain import hub\n",
"\n",
"#### Constant For PDF Downloads, If you Change This, Change in Section Below As Well\n",
"path_pdfs = \"talks/\"\n",
"\n",
"#### Initialize Our Documents\n",
"documents = []"
]
},
{
"cell_type": "markdown",
"id": "5012e20d-cf2b-4d99-ad6a-e90fcfa89f42",
"metadata": {},
"source": [
"### 🐦 Download some PDFs of talks\n",
"\n",
"1. Build a directory for the talks\n",
"2. Download PDFs\n",
"\n",
"#### Note:\n",
"\n",
"You can use your own PDFs, download more from \n",
"\n",
"* https://github.com/tspannhw/SpeakerProfile\n",
"* https://www.slideshare.net/bunkertor/presentations\n"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "2f41ba59-d459-405d-9cf3-992c312548ae",
"metadata": {},
"outputs": [],
"source": [
"!mkdir talks\n",
"!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/01-oct-2024pes-vectordatabasesandai-241001142959-0510fbe6.pdf\n",
"!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/06-04-2024-nyctechweek-discussiononvectordatabasesunstructureddataandai-240605125759-d80c571c.pdf\n",
"!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/06-04-2024-nyctechweek-localragwithllama3andmilvus-240605113619-6e91032b.pdf\n",
"!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/08-13-2024nycmeetup-unstructureddataprocessingfromcloudtoedge-240812185343-3ae3ff2b.pdf\n",
"!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/09-12-2024milvussensordatarag-240907202906-ea0e6890.pdf\n",
"!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/09-18-2024nycmeetup-vectordatabases102-240917124641-19bae3b0.pdf\n",
"!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/09-19-2024aicamphybridseach-240919215006-76282317.pdf\n",
"!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/09-25-2024njxventuresummitintroduction-240926012816-5fbfcc78.pdf\n",
"!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/06-20-2024-aicampmeetup-unstructureddataandvectordatabases-240620160248-1964efbc.pdf\n",
"!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/06-18-2024-princetonmeetup-introductiontomilvus-240619130633-2ad701db.pdf\n",
"\n",
"!ls talks/"
]
},
{
"cell_type": "markdown",
"id": "901b4d0e-7a65-4d68-9f74-512290e6ff7d",
"metadata": {},
"source": [
"### Iterate through PDFs and load into documents\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "7773f579-4538-4df6-9b5f-b4788ad72ec6",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"talks/09-18-2024nycmeetup-vectordatabases102-240917124641-19bae3b0.pdf\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead\n",
"could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead\n",
"could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead\n",
"could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"talks/01-oct-2024pes-vectordatabasesandai-241001142959-0510fbe6.pdf\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead\n",
"could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead\n",
"could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead\n",
"could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"talks/06-04-2024-nyctechweek-localragwithllama3andmilvus-240605113619-6e91032b.pdf\n",
"talks/08-13-2024nycmeetup-unstructureddataprocessingfromcloudtoedge-240812185343-3ae3ff2b.pdf\n",
"talks/06-04-2024-nyctechweek-discussiononvectordatabasesunstructureddataandai-240605125759-d80c571c.pdf\n",
"talks/06-18-2024-princetonmeetup-introductiontomilvus-240619130633-2ad701db.pdf\n",
"talks/09-12-2024milvussensordatarag-240907202906-ea0e6890.pdf\n",
"talks/09-25-2024njxventuresummitintroduction-240926012816-5fbfcc78.pdf\n",
"talks/06-20-2024-aicampmeetup-unstructureddataandvectordatabases-240620160248-1964efbc.pdf\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"could not convert string to float: b'0.00-40' : FloatObject (b'0.00-40') invalid; use 0.0 instead\n",
"could not convert string to float: b'0.00-20' : FloatObject (b'0.00-20') invalid; use 0.0 instead\n",
"could not convert string to float: b'0.00-20' : FloatObject (b'0.00-20') invalid; use 0.0 instead\n",
"could not convert string to float: b'0.00-40' : FloatObject (b'0.00-40') invalid; use 0.0 instead\n",
"could not convert string to float: b'0.00-40' : FloatObject (b'0.00-40') invalid; use 0.0 instead\n",
"could not convert string to float: b'0.00-20' : FloatObject (b'0.00-20') invalid; use 0.0 instead\n",
"could not convert string to float: b'0.00-20' : FloatObject (b'0.00-20') invalid; use 0.0 instead\n",
"could not convert string to float: b'0.00-40' : FloatObject (b'0.00-40') invalid; use 0.0 instead\n",
"could not convert string to float: b'0.00-40' : FloatObject (b'0.00-40') invalid; use 0.0 instead\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"talks/09-19-2024aicamphybridseach-240919215006-76282317.pdf\n"
]
}
],
"source": [
"for file in os.listdir(path_pdfs):\n",
" if file.endswith(\".pdf\"):\n",
" pdf_path = os.path.join(path_pdfs, file)\n",
" print(pdf_path)\n",
" loader = PyPDFLoader(pdf_path)\n",
" documents.extend(loader.load())"
]
},
{
"cell_type": "markdown",
"id": "816954e4-0563-4a9b-b7f8-2f58ee174894",
"metadata": {},
"source": [
"### Connect to Milvus\n",
"\n",
"Use Milvus-Lite for local database\n",
"\n",
"This is ./milvusrag101.db\n",
"\n",
"You can easily switch to Docker for a more advance Milvus or Zilliz Cloud\n",
"\n",
"You could drop the collection if you are starting new"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "64bfbec9-9d6b-4502-b019-7b3cfa865e93",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Collection exists\n"
]
}
],
"source": [
"MILVUS_URL = \"./rag101.db\"\n",
"\n",
"client = MilvusClient(uri=MILVUS_URL)\n",
"\n",
"if client.has_collection(\"LangChainCollection\"):\n",
" print(\"Collection exists\")\n",
"else:\n",
" client.drop_collection(\"LangChainCollection\")"
]
},
{
"cell_type": "markdown",
"id": "ed5796cc-8542-42ac-b3e1-0ed9ccb72e9a",
"metadata": {},
"source": [
"### 🐍 AIM Stack - Easy Local Free Open Source RAG\n",
"\n",
"#### Choose Your Model\n",
"\n",
"https://zilliz.com/ai-models\n",
"\n",
"#### Free, Hugging Face Hosted, Open Source Apache Licensed \n",
"\n",
"https://zilliz.com/ai-models/all-minilm-l12-v2\n",
"\n",
"````\n",
"embeddings = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n",
"\n",
"````\n",
"\n",
"#### Powerful JINA AI Model for Text Embedding\n",
"\n",
"https://zilliz.com/ai-models/jina-embeddings-v2-base-en\n",
"\n",
"#### Next step, chunk up our big documents\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "7ead04d7-1568-4f14-9bfc-03e2e421362c",
"metadata": {},
"outputs": [],
"source": [
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)\n",
"\n",
"all_splits = text_splitter.split_documents(documents)"
]
},
{
"cell_type": "markdown",
"id": "0507004c-3c1f-4433-ad11-f4bae6a98eff",
"metadata": {},
"source": [
"\n",
"#### We load our JINA AI text embedding model via HuggingFace\n",
"\n",
"#### Then we load all of the splits and embeddings to Milvus\n",
"\n",
"This will take some town and will download the model and meta data\n",
"\n",
"````\n",
" drop_old=True \n",
"````\n",
"\n",
"That will delete any previously loaded documents. If you set it to False, you don't have to reload your documents. You would be adding more and these could \n",
"be duplicates.\n",
"\n",
"\n",
"#### Verify Documents are Loaded\n",
"\n",
"We run a *similarity_search* on the newly loaded vector store\n",
"\n",
"\n",
"#### Reference\n",
"\n",
"https://milvus.io/docs/integrate_with_langchain.md\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "95ecaace-d32c-49c8-a330-7a82571d48d0",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(metadata={'page': 23, 'pk': 453229992137722003, 'source': 'talks/06-20-2024-aicampmeetup-unstructureddataandvectordatabases-240620160248-1964efbc.pdf'}, page_content='24\\n| © Copyright Zilliz 24\\nMilvus: From Dev to Prod \\nAI Powered Search made easy \\nMilvus is an Open-Source Vector \\nDatabase to store, index, manage, and \\nuse the massive number of embedding \\nvectors generated by deep neural \\nnetworks and LLMs. \\ncontributors \\n267+\\nstars\\n27K+\\ndownloads \\n25M+\\nforks\\n2K+\\n')]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_huggingface import HuggingFaceEmbeddings\n",
"model_kwargs = {\"device\": \"cpu\", \"trust_remote_code\": True}\n",
"\n",
"embeddings = HuggingFaceEmbeddings(model_name=\"jinaai/jina-embeddings-v2-base-de\", model_kwargs=model_kwargs)\n",
"\n",
"vectorstore = Milvus.from_documents( \n",
" documents=documents,\n",
" embedding=embeddings,\n",
" connection_args={\n",
" \"uri\": MILVUS_URL,\n",
" },\n",
" drop_old=False, \n",
")\n",
"\n",
"vectorstore.similarity_search(\"What is Milvus?\", k=1)"
]
},
{
"cell_type": "markdown",
"id": "a5769a73-a166-4b65-8dfe-6c851d59e801",
"metadata": {},
"source": [
"\n",
"#### Code our loop to call LLama 3.2\n",
"\n",
"\n",
"We will pull the RAG prompt information from LLama's hug and connect the documents loaded into Milvus with our LLM\n",
"chat with LLama 3.2\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "f2b72175-6036-48d1-8e39-6c5897321718",
"metadata": {},
"outputs": [],
"source": [
"from langchain_ollama import OllamaLLM\n",
"\n",
"def run_query() -> None:\n",
" llm = OllamaLLM(\n",
" model=\"llama3.2\",\n",
" callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),\n",
" stop=[\"<|eot_id|>\"],\n",
" )\n",
"\n",
" query = input(\"\\nQuery: \")\n",
" prompt = hub.pull(\"rlm/rag-prompt\")\n",
"\n",
" qa_chain = RetrievalQA.from_chain_type(\n",
" llm, retriever=vectorstore.as_retriever(), chain_type_kwargs={\"prompt\": prompt}\n",
" )\n",
"\n",
" result = qa_chain.invoke({\"query\": query})\n",
" # print(result)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e3caef83-6025-424f-9015-c2d586326d01",
"metadata": {},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"\n",
"Query: What indexes are available in Milvus?\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"I don't know what indexes are available in Milvus, as the provided context only mentions that it is an Open-Source Vector Database without specifying the types of indexes. The context also does not provide any information about indexing capabilities or options."
]
},
{
"name": "stdin",
"output_type": "stream",
"text": [
"\n",
"Query: What is Milvus?\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Milvus is an Open-Source Vector Database used to store, index, manage, and use the massive number of embedding vectors generated by deep neural networks and LLMs. It provides an AI-powered search solution made easy. Milvus is developed in a collaborative environment with over 267 contributors."
]
},
{
"name": "stdin",
"output_type": "stream",
"text": [
"\n",
"Query: What is a vector database?\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"A vector database is a type of data storage system designed for efficient storage and retrieval of dense vectors, often used in applications such as recommendation systems and computer vision. It stores and indexes large amounts of numerical data, allowing for fast querying and similarity searches. Vector databases are particularly useful for tasks that require computing distances or similarities between high-dimensional vectors."
]
}
],
"source": [
"if __name__ == \"__main__\":\n",
" while True:\n",
" run_query()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "656376ad-e37a-4753-90cc-b3e448933ba2",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "hybridsearch",
"language": "python",
"name": "hybridsearch"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.15"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment