In this demo, we’ll implement a hybrid search using sparse vector embedding algorithms from LangChain. Start a notebook and add the following code:
# Set up a User Agent for this session
import os

from langchain_chroma import Chroma
from langchain_community.document_loaders import WikipediaLoader
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

os.environ['USER_AGENT'] = 'sports-buddy-advanced'

llm = ChatOpenAI(model="gpt-4o-mini")

# Load the Wikipedia article and split it into 1,000-character chunks
loader = WikipediaLoader("2024_Summer_Olympics")
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000,
                                               chunk_overlap=0)
splits = text_splitter.split_documents(docs)

# Embed the chunks into a Chroma vector store and expose it as a retriever
database = Chroma.from_documents(documents=splits,
                                 embedding=OpenAIEmbeddings())
retriever = database.as_retriever()
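The next cells invoke a normal_chain (dense retrieval only) and a hybrid_chain (dense plus sparse), whose construction isn't shown in this excerpt. Here's a minimal sketch of how they could be assembled, assuming a BM25 sparse retriever (requires the rank_bm25 package) and equal ensemble weights; the names and weights are assumptions:
from langchain.chains import RetrievalQA
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever

# Sparse retriever: BM25 ranks chunks by keyword overlap
sparse_retriever = BM25Retriever.from_documents(splits)

# Hybrid retriever: merge dense (Chroma) and sparse (BM25) results
hybrid_retriever = EnsembleRetriever(
    retrievers=[retriever, sparse_retriever],
    weights=[0.5, 0.5],
)

# One chain per retriever so you can compare their answers
normal_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)
hybrid_chain = RetrievalQA.from_chain_type(llm=llm, retriever=hybrid_retriever)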
normal_response = normal_chain.invoke(
    "What happened at the opening ceremony of the 2024 Summer Olympics"
)
print(normal_response['result'])
Observe the output:
The opening ceremony of the 2024 Summer Olympics was held outside
of a stadium for the first time in modern Olympic history.
Athletes were paraded by boat along the Seine River in Paris.
Finally, run the hybrid_chain next:
hybrid_response = hybrid_chain.invoke(
    "What happened at the opening ceremony of the 2024 Summer Olympics"
)
print(hybrid_response['result'])
And here's the output:
The opening ceremony of the 2024 Summer Olympics took place
outside of a stadium for the first time in modern Olympic
history, with athletes being paraded by boat along the
Seine River in Paris. This unique setting was part of the
ceremony, making it a distinctive and memorable event in
Olympic history.
Notice how the added sparse retrieval contributes to a more elaborate response in the hybrid search.
Citing in RAG
Citations add source information to your responses, so you know where the answers came from. Open a new notebook to learn how to add citations to SportsBuddy. In the notebook, start with the following code:
import os

from langchain_community.retrievers import WikipediaRetriever
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")

# The system prompt confines the model to the Wikipedia snippets
# injected into {context}
system_prompt = (
    "You're a helpful AI assistant. Given a user question "
    "and some Wikipedia article snippets, answer the user "
    "question. If none of the articles answer the question, "
    "just say you don't know."
    "\n\nHere are the Wikipedia articles: "
    "{context}"
)

# Fetch up to six articles, capped at 2,000 characters each
retriever = WikipediaRetriever(top_k_results=6, doc_content_chars_max=2000)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)
This snippet instructs the WikipediaRetriever to fetch relevant articles based on the query content. Heads up: it may not always return results. Because it's limited to Wikipedia articles, it will only find answers when the relevant content matches how you word your query. In the next cell, create a chain:
from typing import List

from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

def format_docs(docs: List[Document]) -> str:
    # Join the page contents of all retrieved documents into one string
    return "\n\n".join(doc.page_content for doc in docs)

# Format the retrieved documents, fill the prompt, call the model,
# and parse the reply into a plain string
rag_chain = (
    RunnablePassthrough.assign(context=(lambda x: format_docs(x["context"])))
    | prompt
    | llm
    | StrOutputParser()
)

# Pull the user's question out of the input dict and run the retriever
retrieve_docs = (lambda x: x["input"]) | retriever

# The full chain keeps the retrieved documents alongside the answer
chain = RunnablePassthrough.assign(context=retrieve_docs).assign(
    answer=rag_chain
)
Execute a sample query and examine the structure of the response object:
result = chain.invoke(
    {"input": "How did the USA fare at the 2024 Summer Olympics"}
)
print(result.keys())
dict_keys(['input', 'context', 'answer'])
The response contains input (the query), context (the retrieved documents), and answer. This information is accessible due to OpenAI's tool-calling support. We can funnel this search into a citation model.
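For example, you can print the final answer and check where each context document came from. This assumes the retriever populates the standard source metadata field on each document:
print(result["answer"])

# See which Wikipedia pages the retriever pulled in
for doc in result["context"]:
    print(doc.metadata["source"])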
Let's use the CitedAnswer model:
from typing import List

from langchain_core.pydantic_v1 import BaseModel, Field

class CitedAnswer(BaseModel):
    """Answer the user question based only on the given sources, and cite
    the sources used."""

    answer: str = Field(
        ...,
        description="The answer to the user question, which is based only "
        "on the given sources.",
    )
    citations: List[int] = Field(
        ...,
        description="The integer IDs of the SPECIFIC sources which justify "
        "the answer.",
    )
To use the citation model, search with the following:
# Bind the schema to the model so responses come back as CitedAnswer objects
structured_llm = llm.with_structured_output(CitedAnswer)

query = """How did the USA fare at the 2024 Summer Olympics"""
result = structured_llm.invoke(query)
result
The model assigns the values to answer and citations by interpreting the field descriptions. The response is wrapped in a CitedAnswer class.
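You can read the structured fields straight off the result; the sample IDs in the comment below are illustrative, not actual output:
print(result.answer)     # The generated answer text
print(result.citations)  # e.g. [1, 3] -- integer IDs of cited sources
You could also have the model cite source URLs instead of integer IDs by redefining the citations field: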
citations: List[str] = Field(
    ...,
    description="The string URLs of the SPECIFIC sources which justify "
    "the answer.",
)
However, take note that because the documents aren't retrieved verbatim, these URLs might be generated by the model and lead to 404 errors. If you instead want to cite a portion of the retrieved document, consider using a model like this:
class Citation(BaseModel):
    source_id: int = Field(
        ...,
        description="The integer ID of a SPECIFIC source which "
        "justifies the answer.",
    )
    quote: str = Field(
        ...,
        description="The VERBATIM quote from the specified source that "
        "justifies the answer.",
    )

class QuotedAnswer(BaseModel):
    """Answer the user question based only on the given sources, and
    cite the sources used."""

    answer: str = Field(
        ...,
        description="The answer to the user question, which is based "
        "only on the given sources.",
    )
    citations: List[Citation] = Field(
        ...,
        description="Citations from the given sources that "
        "justify the answer.",
    )
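The chain below calls a format_docs_with_id helper that this excerpt doesn't define. It must label each snippet with the integer source ID the model will cite. Here's a minimal sketch; the exact layout of the formatted string, and the title metadata field, are assumptions:
def format_docs_with_id(docs: List[Document]) -> str:
    # Prefix each snippet with the source ID the model should cite back
    formatted = [
        f"Source ID: {i}\nArticle Title: {doc.metadata['title']}\n"
        f"Article Snippet: {doc.page_content}"
        for i, doc in enumerate(docs)
    ]
    return "\n\n" + "\n\n".join(formatted)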
Now use QuotedAnswer in the chain:
rag_chain = (
    RunnablePassthrough.assign(
        context=(lambda x: format_docs_with_id(x["context"]))
    )
    | prompt
    | llm.with_structured_output(QuotedAnswer)
)

retrieve_docs = (lambda x: x["input"]) | retriever

chain = RunnablePassthrough.assign(context=retrieve_docs).assign(
    answer=rag_chain
)
chain.invoke({"input": "How did the USA fare at the 2024 Summer Olympics"})
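The returned dictionary has the same input, context, and answer keys as before, except answer is now a QuotedAnswer instance. A quick way to inspect the quoted citations; field access follows the models defined above:
result = chain.invoke(
    {"input": "How did the USA fare at the 2024 Summer Olympics"}
)
print(result["answer"].answer)
for citation in result["answer"].citations:
    print(f'Source {citation.source_id}: "{citation.quote}"')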