QA over documents

A common use case for LLMs is answering users' questions based on a search over a knowledge base, generally using a vector db under the hood.

LangStream does not provide an out-of-the-box agent for this. However, it is easy to build one yourself: process the files, then build a stream that does the search and the question answering.

The goal of this guide is for you to understand every step, so that you can modify it according to your needs later on.

Setup

Before you can interact with your docs, you first need to compute embeddings for them and index those embeddings in a vector db. This allows the matching documents to be retrieved quickly, so that the LLM can answer questions based on them. So let's get to the setup; this part does not involve LangStream at all.
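To make the retrieval idea concrete, here is a minimal sketch of what a vector db does conceptually: it ranks stored vectors by their similarity to the query vector. The 3-dimensional vectors, the `toy_index` list and the `top_k` helper are all made up for illustration. Real embeddings have far more dimensions, and a real vector db typically uses a smarter index than this brute-force loop.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, indexed, k=2):
    """Return the k (text, score) pairs most similar to the query vector."""
    scored = [(text, cosine_similarity(query_vector, vec)) for text, vec in indexed]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical toy "embeddings", not produced by any real model
toy_index = [
    ("How to handle errors in streams", [0.9, 0.1, 0.0]),
    ("Using OpenAI functions", [0.1, 0.9, 0.2]),
    ("Composing streams together", [0.5, 0.5, 0.1]),
]

query = [0.8, 0.2, 0.0]  # pretend this is the embedding of the user's question
results = top_k(query, toy_index, k=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

With these toy values, the error handling document ranks first, because its vector points in almost the same direction as the query.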

First, let's use the unstructured library to parse all the markdown files in our docs folder:

import glob
from unstructured.partition.md import partition_md


def chunk_strings(lst, max_words):
    # Greedily group consecutive strings into chunks of at most max_words words
    chunks, chunk, count = [], [], 0
    for s in lst:
        words = len(s.split())
        if count + words <= max_words:
            chunk.append(s)
            count += words
        else:
            if chunk:  # don't emit an empty chunk when the first string is already too long
                chunks.append(chunk)
            chunk, count = [s], words
    if chunk:
        chunks.append(chunk)
    return ["\n".join(chunk) for chunk in chunks]


def load_document_chunks(pathname, max_words):
    # Parse every matching markdown file and split its text into chunks
    files = glob.glob(pathname, recursive=True)
    chunks = []
    for file in files:
        print(f"Parsing {file}...")
        elements = partition_md(filename=file)
        elements = chunk_strings([str(elem) for elem in elements], max_words=max_words)
        chunks += elements
    return chunks


chunks = load_document_chunks("docs/docs/**/*.md", max_words=512)
    Parsing docs/docs/intro.md...
    Parsing docs/docs/ui/chainlit.md...
    Parsing docs/docs/stream-basics/why_streams.md...
    Parsing docs/docs/stream-basics/error_handling.md...
    Parsing docs/docs/stream-basics/custom_streams.md...
    Parsing docs/docs/stream-basics/working_with_streams.md...
    Parsing docs/docs/stream-basics/index.md...
    Parsing docs/docs/stream-basics/type_signatures.md...
    Parsing docs/docs/stream-basics/composing_streams.md...
    Parsing docs/docs/examples/weather-bot.md...
    Parsing docs/docs/examples/index.md...
    Parsing docs/docs/examples/openai-function-call-extract-schema.md...
    Parsing docs/docs/examples/weather-bot-error-handling.md...
    Parsing docs/docs/llms/gpt4all.md...
    Parsing docs/docs/llms/open_ai.md...
    Parsing docs/docs/llms/memory.md...
    Parsing docs/docs/llms/index.md...
    Parsing docs/docs/llms/zero_temperature.md...
    Parsing docs/docs/llms/open_ai_functions.md...

We split the docs into chunks of at most 512 words each so that we don't blow up the LLM context length.
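To see how the chunking behaves, here is a small self-contained sketch that runs the same greedy grouping on a few toy strings with a tiny word limit. The `paragraphs` list and the limit of 5 are made up for illustration, and the `if chunk:` guard inside the loop is an assumption of this sketch, added so that an oversized first string does not produce an empty chunk.

```python
def chunk_strings(lst, max_words):
    """Greedily pack consecutive strings into chunks of at most max_words words."""
    chunks, chunk, count = [], [], 0
    for s in lst:
        words = len(s.split())
        if count + words <= max_words:
            chunk.append(s)
            count += words
        else:
            if chunk:  # guard (assumption of this sketch): never emit an empty chunk
                chunks.append(chunk)
            chunk, count = [s], words
    if chunk:
        chunks.append(chunk)
    return ["\n".join(chunk) for chunk in chunks]


# Hypothetical toy input standing in for the parsed markdown elements
paragraphs = [
    "one two three",         # 3 words
    "four five",             # 2 words
    "six seven eight nine",  # 4 words
    "ten",                   # 1 word
]

small_chunks = chunk_strings(paragraphs, max_words=5)
word_counts = [len(c.split()) for c in small_chunks]
print(word_counts)

# A single string longer than the limit is kept whole rather than split
oversized = chunk_strings(["a b c d e f"], max_words=5)
print(oversized)
```

Note that the limit applies to groups of elements, not to each element: a single element longer than `max_words` still ends up as one oversized chunk, so very long paragraphs may need extra splitting.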

Now, let's get the embeddings for each of those chunks using the OpenAI embeddings API:

import os
import openai

# Note: openai.Embedding.create is the interface of the pre-1.0 openai package
openai.api_key = os.getenv("OPENAI_API_KEY")  # don't forget to define it

embeddings = []
for index, chunk in enumerate(chunks):
    # Overwrite the same console line to show progress
    print(f"Generating embeddings {index + 1}/{len(chunks)}", end="\r")
    result = openai.Embedding.create(
        model="text-embedding-ada-002", input=chunk
    )
    embeddings.append(result.data[0].embedding)

print("Sample embeddings:", embeddings[0])
    Sample embeddings: [0.0007851232658140361, 0.005616822745651007, -0.016339847818017006, -0.032593999058008194, -0.0020621202420443296, 0.013340400531888008, -0.014025988057255745, 0.013511797413229942, -0.011033682152628899, -0.02715214341878891, 0.014040271751582623, 0.00818420760333538, -0.026309439912438393, 0.01916789822280407, 0.020239129662513733, 0.02353852242231369, 0.008084225468337536, -0.021353211253881454, 0.02996591106057167, 0.001305116806179285, -0.009241155348718166, -0.012961898930370808, 0.003842149628326297, -0.030194438993930817, -0.0005405254778452218, -0.0037028894294053316, 0.022095931693911552, -0.01816808246076107, -0.014825841411948204, -0.020739037543535233, 0.014340216293931007, 0.006623780354857445, -0.01428308431059122, 0.0038849988486617804, 0.00419208500534296, -0.012433424592018127, 0.02369563654065132, 0.0007119224756024778, 0.0341365709900856, -0.0042992085218429565, 0.02835192158818245, 0.03745024651288986, 0.011983507312834263, -0.004909810144454241, -0.022724386304616928, 0.014047413133084774, 0.0035172095522284508, -0.021510325372219086, -0.007977102883160114, 0.0026673658285290003, -0.006830885075032711, 0.03999263420701027, -0.017253965139389038, -0.003792158793658018, -0.007505760528147221, 0.020953284576535225, 0.006355972494930029, -0.01121936272829771, 0.010105282068252563, 0.013397532515227795, -0.013861733488738537, 0.002744137542322278, -0.033908043056726456, -0.01206920575350523, 0.006141725927591324, 0.017825288698077202, -0.006709478795528412, 0.03727884963154793, -0.007934252731502056, 0.020253412425518036, 0.00045125617180019617, 0.0027030736673623323, 0.0048276823945343494, 0.0012345940340310335, 0.024852566421031952, -0.010362377390265465, 4.541574526228942e-05, -0.012483415193855762, -0.007941395044326782, -0.014304508455097675, 0.009819620288908482, -0.015582844614982605, -0.004909810144454241, 0.03342241793870926, 0.029180340468883514, -0.007969960570335388, 0.02905179373919964, 0.017882421612739563, 
-0.015168635174632072, -0.025780966505408287, -0.009776771068572998, 0.000495444459374994, 0.021367494016885757, 0.03759307786822319, -0.033993739634752274, -0.009891035966575146, -0.005691809114068747, 0.013868874870240688, -0.019910618662834167, -0.010276678949594498, -0.017125418409705162, 0.011133664287626743, -0.007555751595646143, -0.011505024507641792, 0.002535247476771474, -0.028366204351186752, 0.030280137434601784, -0.005559690296649933, 0.0436776727437973, 0.012012073770165443, -0.022752953693270683, 0.04447752237319946, -0.0027209275867789984, -0.01996775157749653, 0.006723762024194002, -0.008762671612203121, 0.0005132983205839992, -0.01284763403236866, -0.007412920705974102, 0.002376348013058305, 0.030994292348623276, 0.01340467482805252, 0.0060453154146671295, 0.006041744723916054, 0.029737381264567375, 0.013911724090576172, -0.025266775861382484, -0.007984244264662266, -0.02448120526969433, -0.013797459192574024, 0.01696830429136753, -0.005845352075994015, 0.005781078245490789, 0.00617386307567358, -0.020296262577176094, 0.01988205313682556, -0.009241155348718166, -0.002965525258332491, -0.014668727293610573, 0.004220651462674141, 0.01297618169337511, 0.009898177348077297, 0.010555199347436428, -0.007234381977468729, 0.010598048567771912, 0.04104958474636078, 0.0029512422624975443, -0.019039351493120193, -0.02323857694864273, -0.002078188816085458, 0.02745208702981472, -0.0044241854920983315, -0.016097035259008408, -0.010419509373605251, 0.004759837873280048, 0.013861733488738537, 0.0060096075758337975, 0.0213246438652277, -0.010240971110761166, 0.0052847410552203655, -0.0007976209744811058, 0.016339847818017006, -0.013418957591056824, -0.016696924343705177, 0.010505208745598793, 0.027437804266810417, 0.050819214433431625, -0.0028048406820744276, -0.006284556817263365, -0.0022138780914247036, -0.007309368345886469, 0.01069088838994503, -0.053018808364868164, 0.010076715610921383, -0.0031404930632561445, -0.004874102305620909, 0.011854959651827812, 
-0.0003427940246183425, -0.013690335676074028, -0.01962495781481266, 0.007848554290831089, -0.002135321032255888, 0.015625694766640663, 0.03322245180606842, -0.02765205129981041, -0.025680985301733017, -3.132793790427968e-05, 0.0006887124618515372, 0.022495856508612633, -0.013169003650546074, 0.007327222265303135, 0.018696557730436325, 0.024209827184677124, -0.019910618662834167, -0.5928051471710205, -0.010833718813955784, -0.003820725018158555, -0.01886795461177826, 0.0026834344025701284, -0.006884446367621422, -0.015068653970956802, 0.003258328652009368, -0.01272622775286436, 0.018396612256765366, -0.020653339102864265, -0.015025804750621319, 0.0020889011211693287, -0.0007744109607301652, -0.0030315846670418978, -0.03442223370075226, 0.002467402722686529, -0.03330815210938454, 0.012362008914351463, 0.008091366849839687, -0.002745922887697816, 0.010569482110440731, -0.004849107004702091, 0.012090630829334259, -0.012412000447511673, 0.014897257089614868, 0.008991201408207417, 0.0038242957089096308, 0.018539443612098694, 0.00535615673288703, -0.021410342305898666, 0.02629515714943409, 0.0183109138160944, -0.019039351493120193, 0.05716090276837349, -0.0014408060815185308, -0.01641126349568367, 0.012376292608678341, -0.010598048567771912, 0.012554830871522427, -0.02343854121863842, -0.007448628544807434, 0.008105650544166565, 0.0026263021863996983, -0.013611778616905212, -0.0165683776140213, 0.04196370020508766, -0.008798379451036453, -0.008998342789709568, 0.01224774494767189, 0.01696830429136753, -0.015440014190971851, -0.005688238423317671, -0.00764145003631711, -0.006652346346527338, -0.023509956896305084, 0.04184943437576294, -0.008469868451356888, 0.025866664946079254, 0.0015666757244616747, 0.01375460997223854, 0.013518938794732094, -0.030651498585939407, -0.03542204946279526, -0.02243872545659542, 0.007712865248322487, 0.0017782439244911075, -0.01349751465022564, 0.004399189725518227, -0.03682179003953934, 0.020839018747210503, -0.001229237881489098, 
-0.013704619370400906, -0.008469868451356888, -0.011619288474321365, 0.02523821033537388, 0.011126522906124592, 0.003774304874241352, -0.0060595981776714325, 0.03096572682261467, -0.007534326985478401, -0.0017211115919053555, -0.014447339810431004, -0.025123944506049156, 0.02639514021575451, 0.004966942593455315, -0.030451534315943718, -0.025323908776044846, 0.018296631053090096, 0.016182733699679375, 0.01109081506729126, 0.03527921810746193, -0.030594365671277046, -0.03196554258465767, 0.007891403511166573, 0.020039167255163193, -0.012933332473039627, -0.00041532531031407416, -0.00790568720549345, -0.028980378061532974, -0.00590962590649724, 0.01841089501976967, -0.028737565502524376, 0.00346900406293571, 0.03953557834029198, -0.015725675970315933, -0.014190243557095528, -0.0036564695183187723, 0.028694715350866318, -0.0021870972122997046, -0.0203248281031847, 0.0029101783875375986, -0.02368135377764702, -0.005973899737000465, 0.01121936272829771, -0.03950700908899307, -0.007309368345886469, 0.0028191236779093742, 0.025980930775403976, -0.03933561220765114, 0.012304876931011677, -0.0070487018674612045, 0.01440449059009552, -0.009898177348077297, 0.005070494953542948, 0.00890550296753645, 0.00048205407802015543, 0.004202797543257475, -0.01962495781481266, 0.02163887210190296, 0.005174047313630581, 0.0014327719109132886, 0.018282348290085793, -0.010076715610921383, 0.019310729578137398, 0.022624405100941658, 0.038764290511608124, 0.009733921848237514, 0.010248112492263317, -0.0008766242535784841, 0.014483047649264336, 0.00615957984700799, 0.010947983711957932, -0.0035332778934389353, -0.04536307603120804, -0.03904995322227478, -0.006334547884762287, 0.018339479342103004, -0.011919233947992325, -0.0026512974873185158, -0.014711576513946056, -0.024952547624707222, -0.02529534138739109, 0.026380855590105057, 0.0014131326461210847, -0.01372604351490736, -0.021967383101582527, -0.02515251189470291, 0.01271908637136221, -0.031079990789294243, -0.01184781827032566, 
0.01034095324575901, 0.005938192363828421, 0.015668543055653572, -0.01144075021147728, 0.012783360667526722, -0.008705539628863335, 0.03433653339743614, -0.004006404895335436, -0.018582291901111603, -0.021296078339219093, -0.010955125093460083, -0.0030297990888357162, 0.012376292608678341, 0.012633387930691242, -0.0023299281019717455, -0.005095490254461765, -0.018810821697115898, 0.01410454511642456, 0.00674875732511282, -0.010198121890425682, 0.03127995505928993, -0.013040455989539623, -0.008691256865859032, 0.027237841859459877, 0.012326301075518131, 0.03659326210618019, 0.00400283420458436, 0.0015202558133751154, 0.01485440693795681, 0.0058310688473284245, 0.0180681012570858, -0.007427203468978405, -0.004113528411835432, -0.01475442573428154, 0.02569526806473732, -0.007320080418139696, -0.0032672553788870573, 0.036364730447530746, 0.004070678725838661, 0.00878409668803215, 0.01269766129553318, 0.018639424815773964, -0.020196281373500824, -0.009419693611562252, -0.033450983464717865, 0.0076128835789859295, -0.016539812088012695, -0.012690519914031029, -0.0024281241931021214, 0.021895967423915863, 0.002467402722686529, -0.018639424815773964, 0.00592390913516283, 0.010176697745919228, 0.015168635174632072, -0.00419208500534296, 0.013047597371041775, -0.0095339585095644, 0.008262763731181622, 0.0102909617125988, 0.005381152033805847, -0.026637950912117958, -0.020096298307180405, -0.013661770150065422, 0.02439550682902336, -0.003458291757851839, 0.009369703009724617, -0.028466185554862022, -0.030422968789935112, -0.019182180985808372, 0.016225583851337433, 0.0011363978264853358, -0.005652530584484339, 0.021253228187561035, -0.0030244430527091026, 0.04010690003633499, -0.008919785730540752, 0.007777139078825712, -0.03102285787463188, -0.02559528686106205, 0.020610490813851357, 0.009041192010045052, -0.027180708944797516, 0.05090491101145744, 0.01296904031187296, 0.022767236456274986, 0.024252677336335182, -0.03167987987399101, 0.010240971110761166, 
-0.02388131618499756, 0.011605005711317062, -0.009641082026064396, 0.005473991855978966, 0.006370255257934332, -0.010198121890425682, 0.007748573087155819, 0.0018728694412857294, 0.011869242414832115, 0.023152878507971764, -0.016425546258687973, 0.0026995029766112566, 0.005995324347168207, 0.01972493901848793, -0.003088716883212328, -0.016425546258687973, -0.008834087289869785, -0.027437804266810417, -0.012340584769845009, -0.03102285787463188, -0.0006150652770884335, -0.0030458676628768444, 0.01761104352772236, -0.030394403263926506, 0.05139053612947464, 0.003917135763913393, 0.012376292608678341, 0.032251205295324326, 0.007355788256973028, 0.004149235785007477, -0.002754849847406149, -0.021910250186920166, 0.04119241610169411, -0.008391312323510647, -0.009362561628222466, -0.025566721335053444, -0.03322245180606842, 0.006898729596287012, -0.019810637459158897, -0.0002771810977719724, -0.020253412425518036, -0.0073843542486429214, 0.003877857234328985, 0.00027941283769905567, -0.021067548543214798, 0.014497330412268639, 0.036307599395513535, -0.013083305209875107, 0.024581188336014748, -0.029851645231246948, 0.02759491838514805, 0.0183109138160944, -0.02759491838514805, -0.03542204946279526, 0.03205123916268349, 0.0019353579264134169, -0.03459363058209419, 0.013190427795052528, 0.005041928496211767, -0.025066811591386795, -0.004184943623840809, 0.011405042372643948, -0.014640160836279392, -0.03307962417602539, 0.023324275389313698, 0.002669151406735182, -0.006191716995090246, 0.012433424592018127, 0.037164583802223206, 0.0064309583976864815, -0.01094084233045578, -0.01826806366443634, -0.030194438993930817, -0.0012122767511755228, 0.04179230332374573, -0.0004461232165340334, -0.0011640713782981038, 0.008969776332378387, -0.046848516911268234, -0.021738853305578232, -0.00047312714741565287, -0.021753136068582535, -0.012183470651507378, 0.00878409668803215, -0.008984060026705265, -0.0225815549492836, -0.002490612678229809, 0.016296999529004097, 0.04336344450712204, 
0.017782440409064293, -0.0052347504533827305, -0.0014122399734333158, 0.02515251189470291, -0.019696373492479324, 0.04284925386309624, 0.00152471917681396, -0.008062801323831081, -0.00028186774579808116, 0.008612699806690216, 0.0072450945153832436, 0.020910434424877167, 0.024709735065698624, -0.007677157875150442, -0.02193881757557392, -0.0024923982564359903, -0.01044093444943428, 0.0024066995829343796, 0.029480285942554474, 0.009362561628222466, 0.05601825565099716, 0.0026334435679018497, 0.012504840269684792, -0.0015300753293558955, 0.001439020736142993, 0.03030870482325554, 0.01359749585390091, 0.0032458307687193155, -0.009476826526224613, 0.006577359978109598, -0.0012220963835716248, -0.00918402336537838, 0.004470605403184891, -0.010976550169289112, -0.00017742268391884863, 0.02835192158818245, 0.020010599866509438, -0.05353299900889397, 0.00043942799675278366, 0.02523821033537388, 0.018853671848773956, -0.011283636093139648, -0.0017568193143233657, 0.013540363870561123, -0.019210748374462128, -0.006991569418460131, -0.03536491468548775, -0.005099060945212841, -0.0031654885970056057, -0.007912828586995602, -0.01736823096871376, -0.026580819860100746, 0.010733737610280514, -0.009876752272248268, 0.0009016196709126234, -0.015097219496965408, -0.011283636093139648, -0.027109293267130852, -0.002029983326792717, -0.0029387446120381355, -0.010955125093460083, 0.021353211253881454, 0.005141910165548325, 0.026923613622784615, 0.010069574229419231, 0.012761935591697693, -0.030223006382584572, -0.01601133681833744, -0.015354315750300884, -0.01269766129553318, -0.011669280007481575, 0.03247973322868347, -0.012012073770165443, -0.016439829021692276, 0.026923613622784615, -0.011269353330135345, 0.004924093373119831, 0.010826577432453632, -0.011640713550150394, -0.011619288474321365, 0.021267512813210487, 0.009619656950235367, 0.013440381735563278, 0.008227056823670864, -0.01872512325644493, -0.011990648694336414, -0.027937712147831917, 0.014540179632604122, 
-0.012397716753184795, -0.008091366849839687, -0.019139332696795464, 0.01249769888818264, 0.03050866723060608, -0.01596848852932453, 0.008869795128703117, 0.015582844614982605, -0.021238945424556732, 0.008412736468017101, 0.011455032974481583, -0.0062559908255934715, 0.013383249752223492, 0.014640160836279392, 0.008019952103495598, -0.007634308189153671, -0.02399558201432228, 0.005306165665388107, -0.03593624010682106, 0.017853854224085808, 0.022895783185958862, -0.01081229466944933, -0.005702521186321974, 0.017725307494401932, -0.0060453154146671295, -0.025409607216715813, 0.0066094971261918545, 0.00320476689375937, 0.0009203662048093975, -0.00529545359313488, -0.024152694270014763, -0.006634492427110672, -0.0030547946225851774, -0.023210011422634125, -0.005466850474476814, -0.02148175798356533, 0.014954389072954655, -0.016839755699038506, 0.008662690408527851, 0.00825562234967947, -0.01174783706665039, -0.023309992626309395, -0.033193886280059814, -0.023652786388993263, 0.011712129227817059, -0.012647670693695545, -0.008362745866179466, -0.004470605403184891, 0.00040059586171992123, -0.04156377539038658, -0.022667255252599716, -0.0013943860540166497, -0.005359727423638105, 0.039421312510967255, 0.01912504993379116, 0.0160684697329998, 0.028423337265849113, 0.05127627030014992, 0.004317061975598335, 0.02815195918083191, -0.015368598513305187, -0.014025988057255745, -0.006870163604617119, 0.0009882109006866813, 0.009248296730220318, -0.0025513158179819584, 0.00540971802547574, -0.01866799034178257, 0.006441670935600996, 0.015111503191292286, -0.02679506503045559, -0.008105650544166565, 0.020910434424877167, -0.0225815549492836, -0.006359543185681105, -0.01261910516768694, -0.030251571908593178, -0.012390575371682644, 0.03296535834670067, -0.01281906757503748, 0.012590538710355759, -0.02685219794511795, -0.013683194294571877, 0.025409607216715813, -0.007620025426149368, 0.02936602011322975, -0.0012649456039071083, 0.027823448181152344, 0.0006561291520483792, 
0.01488297339528799, 0.011926375329494476, 0.017596758902072906, 0.015297182835638523, -0.0026637951377779245, -0.03619333356618881, 0.01071945484727621, -0.004420614335685968, 0.016554094851017, 0.021038983017206192, -0.0002997215779032558, 0.007819988764822483, 0.01462587807327509, 0.008926927112042904, 0.0080342348664999, -0.009176881052553654, -0.005202613305300474, -0.008112791925668716, -0.017482494935393333, -0.02525249309837818, -0.011762119829654694, -0.03387947380542755, 0.0003483733453322202, 0.015197201631963253, 0.015468579716980457, 0.028380487114191055, -0.01635413058102131, 0.006323835346847773, 0.02916605770587921, -0.009905318729579449, 0.042535025626420975, 0.003163703018799424, 0.027266407385468483, 0.005256175063550472, 0.01896793581545353, -0.004545591305941343, 0.008548425510525703, -0.0072593772783875465, -0.017339663580060005, 0.015125785954296589, 0.020553357899188995, -0.0056739551946520805, 0.027223557233810425, 0.01736823096871376, 0.03447936475276947, -0.029937343671917915, -0.03202267363667488, 0.009483967907726765, 0.02525249309837818, 0.014590170234441757, -0.03633616492152214, -0.020839018747210503, -0.02529534138739109, 0.01851087622344494, -0.017139701172709465, 0.02635229006409645, -0.0032440454233437777, 0.0031708446331322193, -0.0004918737104162574, -0.010233829729259014, -0.00692372489720583, 0.025809532031416893, 0.0080342348664999, 0.03847862780094147, -0.007512902375310659, -0.002126394072547555, -0.035907674580812454, 0.007398637477308512, 0.028623299673199654, 0.01922503113746643, -0.011883526109158993, 0.006116730626672506, 0.045991528779268265, 0.022552989423274994, 0.004302779212594032, -0.030451534315943718, -0.012490556575357914, 0.03653612732887268, -0.038964252918958664, -0.005416859406977892, -0.012090630829334259, -0.016325565055012703, 0.04030686244368553, 3.498239675536752e-05, 0.007784280925989151, -0.004984796512871981, -0.022795801982283592, -0.001638983841985464, 0.022810084745287895, 
-0.020739037543535233, 0.001630056998692453, -0.013961714692413807, 0.005927479825913906, -0.0048633902333676815, -0.01721111685037613, 0.007798563688993454, 0.018382329493761063, -0.028680432587862015, 0.0007904794183559716, -0.023295709863305092, -0.009733921848237514, -0.032451167702674866, 0.0024709736462682486, 0.0001751909585436806, -0.006548793986439705, 0.04573443531990051, -0.02028197981417179, 0.019810637459158897, 0.011197937652468681, 0.021953100338578224, -0.015454296953976154, 0.01435449905693531, -0.00736292963847518, 0.0013988495338708162, 0.008762671612203121, -0.0014756211312487721, -0.0073700714856386185, -0.03482215851545334, 0.016439829021692276, -0.0004981225356459618, -0.0005896235816180706, 0.004624148365110159, -0.010833718813955784, 0.013211852870881557, 0.007698582485318184, -0.009348278865218163, -0.014997238293290138, 0.0029405299574136734, -0.004938376136124134, -0.014361641369760036, 0.024552620947360992, 0.0018907232442870736, -0.0007592351757921278, -0.02479543350636959, 0.013418957591056824, -0.0020531932823359966, -0.02890896238386631, -0.03473646193742752, 0.005631105974316597, -0.006120301317423582, -0.013740327209234238, -0.004027829505503178, -0.012604821473360062, -0.020796170458197594, -0.025838099420070648, 0.011133664287626743, 0.005556119605898857, 0.012533405795693398, 0.03127995505928993, 0.02268153801560402, 0.028237657621502876, 0.013833167031407356, -0.0056561012752354145, -0.009276863187551498, -0.021767420694231987, 0.024152694270014763, -0.009241155348718166, -0.00022942203213460743, -0.004556303843855858, 0.060446012765169144, 0.011347910389304161, -0.03767877444624901, -0.024981113150715828, -0.038964252918958664, -0.04130667820572853, 0.01094084233045578, -0.016696924343705177, -0.005773936863988638, 0.006013178266584873, 0.00180056132376194, 0.006784464698284864, 0.03293679282069206, 0.010955125093460083, 0.01425451785326004, -0.03387947380542755, 0.012040640227496624, -0.007720007095485926, 
-0.006930866744369268, -0.012911908328533173, 0.012426283210515976, -0.0022085218224674463, -0.00013613564078696072, 0.007348646875470877, 0.02048194222152233, -0.014590170234441757, 0.040621090680360794, -0.008219914510846138, -0.010198121890425682, 0.005152622703462839, -0.035964805632829666, 0.003076219232752919, 0.006520227994769812, -0.013261843472719193, -0.02695217914879322, 0.021853119134902954, 0.02433837577700615, -0.01422595139592886, 0.017196834087371826, -0.003774304874241352, -0.017325380817055702, -0.004131381865590811, 0.0029512422624975443, -0.027437804266810417, -0.01004100777208805, -0.007484335917979479, 0.023852750658988953, 0.042135097086429596, -0.004434897564351559, 0.001133719808422029, 0.019339295104146004, 0.006791606545448303, -0.009419693611562252, -0.005424001254141331, 0.015739958733320236, -0.0038814281579107046, -0.015882790088653564, 0.006505944766104221, -0.01976778917014599, 0.004420614335685968, -0.013297551311552525, 0.0056739551946520805, -0.00994102656841278, 0.008077084086835384, -0.012304876931011677, -0.00775571446865797, -0.036564696580171585, 0.016639793291687965, 0.007584317587316036, 0.018682273104786873, -0.014290225692093372, 0.0027994844131171703, -0.0058346400037407875, 0.012426283210515976, -0.042934950441122055, 0.017025435343384743, -0.030937159433960915, -0.010962267406284809, 0.007777139078825712, -0.008062801323831081, -0.02499539777636528, 0.01453303825110197, 0.006741615477949381, -0.021767420694231987, -0.002981593832373619, 0.2123037576675415, -0.012283451855182648, 0.007969960570335388, 0.03416513651609421, 0.008848370984196663, 0.004681280814111233, 0.004231363534927368, 0.008219914510846138, -0.01324756070971489, 0.0043134912848472595, 0.004259929992258549, -0.0032172647770494223, -0.011226504109799862, -0.004795545246452093, -0.00023009155120234936, 0.006477378774434328, -0.016954021528363228, -0.02433837577700615, -0.00503121642395854, -0.005352585576474667, -0.007091551087796688, 
-0.016525527462363243, -0.011405042372643948, -0.022824367508292198, 0.0027405668515712023, 0.013440381735563278, -0.014668727293610573, 0.007441486697643995, 0.02553815394639969, -0.006405963096767664, -0.01922503113746643, 0.020867586135864258, -0.018839387223124504, 0.00440276088193059, -0.010733737610280514, -0.008391312323510647, 0.03170844539999962, 0.0014604453463107347, 0.019339295104146004, 0.01872512325644493, 0.0346221961081028, -0.009726780466735363, -0.0003307426522951573, -0.03959270939230919, -0.0023567089810967445, -0.025109661743044853, 0.006016748957335949, -0.03685035556554794, 0.00978391245007515, 0.0034350818023085594, -0.01826806366443634, -0.009769629687070847, 0.04362053796648979, 0.020710472017526627, -0.02288150042295456, -0.0016505889361724257, -0.01686832308769226, 0.023152878507971764, 0.028094826266169548, 0.021995948627591133, -0.013947431929409504, 0.02223876118659973, 0.004481317475438118, 0.020267697051167488, -0.01595420576632023, -0.0029012514278292656, -0.005102631635963917, -0.011619288474321365, -0.0023406404070556164, -0.0095339585095644, -0.01846802793443203, -0.0223244596272707, -0.022952916100621223, -0.010876568034291267, -0.03705032169818878, -0.04236362874507904, 0.03342241793870926, -0.0021103257313370705, 0.019296446815133095, 0.0401640310883522, -0.0037707341834902763, -0.015997054055333138, -0.0007007638341747224, 0.018553726375102997, -0.0032083378173410892, -0.011412183754146099, 0.019639240577816963, -0.002296005841344595, -0.009955309331417084, -0.014033130370080471, -5.506798333954066e-05, -0.013169003650546074, 0.015140068717300892, 0.017225399613380432, -0.0021799555979669094, -0.011662137694656849, 0.004481317475438118, 0.0033940179273486137, -0.011862101033329964, -0.004070678725838661, -0.019810637459158897, 0.01651124469935894, -0.002761991461738944, -0.001553285401314497, -0.03639329969882965, 0.0024459781125187874, -0.02016771398484707, -0.0028030553366988897, 0.002279937267303467, 
-0.0039028527680784464, -0.0006436315015889704, -0.029794514179229736, 0.017196834087371826, -0.009548241272568703, 0.0058489227667450905, 0.006534510757774115, 0.0036475425586104393, -0.011619288474321365, 0.00928400456905365, -0.002733425237238407, -0.021910250186920166, 0.01294761523604393, -0.02278151921927929, -0.020296262577176094, -4.123124745092355e-05, -0.02363850362598896, -0.015397164970636368, -0.01372604351490736, -0.018239498138427734, 0.00818420760333538, 0.03370807692408562, -0.018696557730436325, 0.0005155300605110824, 0.022095931693911552, -0.02383846789598465, 0.0095339585095644, -0.00764145003631711, -0.002287078881636262, 0.0035207802429795265, 0.008819804526865482, -0.011276494711637497, 0.017153983935713768, -0.018096666783094406, -0.038878556340932846, 0.038364361971616745, -0.028337638825178146, -0.009248296730220318, 0.007116546388715506, -0.014111687429249287, -0.0013274340890347958, -0.04030686244368553, 0.002524535171687603, -0.005952475126832724, -0.022453008219599724, 0.02163887210190296, -0.001944284769706428, -0.020553357899188995, 0.002508466597646475, 0.024681169539690018, 0.008012809790670872, -0.01962495781481266, -0.004741983953863382, -0.018482310697436333, 0.0016496962634846568, -0.021196097135543823, -0.038421496748924255, -0.18328052759170532, 0.008112791925668716, 0.020696189254522324, -0.03999263420701027, 0.022910065948963165, -0.004027829505503178, 0.005852493457496166, -0.002708429703488946, 0.0013890299014747143, -0.007912828586995602, -0.00036332596209831536, -0.01791098713874817, -0.02363850362598896, -0.020753320306539536, -0.019382145255804062, 0.005731087643653154, -0.00444203894585371, -0.006106018554419279, 0.04602009803056717, -0.0023174304515123367, 0.029651682823896408, -0.03050866723060608, -0.001897864742204547, 0.012312018312513828, 0.02555243670940399, 0.02545245550572872, -0.017396796494722366, -0.027980562299489975, -0.01528290007263422, -0.032908227294683456, -0.02855188585817814, 
0.008427019231021404, 0.013676052913069725, -0.005345444194972515, 0.0213246438652277, 0.024724017828702927, 0.04673425108194351, -0.0005307058454491198, -0.030137307941913605, 0.022424442693591118, 0.03322245180606842, 0.04336344450712204, 0.016382697969675064, -0.020310545340180397, -0.003899281844496727, -0.0024709736462682486, 0.011862101033329964, -0.021924534812569618, 0.010826577432453632, 0.012461991049349308, 0.02038196101784706, -0.02545245550572872, 0.0036636111326515675, 0.00790568720549345, 0.003227977082133293, 0.03873572498559952, -0.003001233097165823, 0.00419208500534296, -0.016239866614341736, -0.020910434424877167, 0.02152460813522339, -0.028266223147511482, -0.01350465603172779, -0.006455954164266586, -0.015354315750300884, -0.03516495227813721, -0.025066811591386795, 0.031079990789294243, -0.024052713066339493, 0.037164583802223206, -0.010955125093460083, -0.0008538606343790889, -0.01256911363452673, -0.018296631053090096, 0.029851645231246948, 0.01194065809249878, 0.00692372489720583, 0.030680064111948013, 0.03147991746664047, -0.013490373268723488, -0.009141174145042896, 0.032908227294683456, -0.009826761670410633, 0.02896609529852867, -0.006845167838037014, 0.011526448652148247, 0.021910250186920166, 0.017125418409705162, -0.05027645453810692, -0.01385459117591381, 0.024466922506690025, -0.007009423337876797, -0.034507930278778076, -0.008705539628863335, -0.00863412395119667, 0.02483828365802765, 0.03785017132759094, -0.0060631693340837955, -0.010169555433094501, -0.0031476346775889397, 0.031537048518657684, 0.020367678254842758, 0.014247376471757889, 0.02525249309837818, 0.03167987987399101, 0.026666518300771713, -0.022410158067941666, 0.018396612256765366, 0.022067364305257797, -0.00360112264752388, -0.04507741332054138, 0.0037671634927392006, -0.007177249528467655, 0.04473461955785751, -0.010869426652789116, 0.019810637459158897, -0.018982218578457832, -0.014090262353420258, 0.0045777284540236, -0.0012551259715110064, 
0.018482310697436333, -0.010048149153590202, 0.00636311387643218, 0.014825841411948204, -0.005063353106379509, -0.021653154864907265, -0.09066902101039886, -0.04522024467587471, 0.003984980285167694, 0.027523502707481384, 8.525215525878593e-05, 0.021895967423915863, -0.0005199935403652489, 0.022767236456274986, -0.0040849619545042515, 0.03379377722740173, -0.010076715610921383, -0.006955862045288086, -0.0012836921960115433, 0.007991385646164417, -0.015611411072313786, 0.009048333391547203, -0.011083672754466534, -0.004395619034767151, -0.0035279218573123217, 0.012212037108838558, 0.005384722724556923, -0.010640897788107395, 0.00737721286714077, -0.010919418185949326, 0.00041778021841309965, -0.029537416994571686, -0.027537785470485687, 0.017639609053730965, -0.00147740647662431, 0.013119013048708439, 0.03056580014526844, -0.02223876118659973, 0.0012926191557198763, -0.007477194536477327, 0.021610306575894356, -0.02293863333761692, -0.005731087643653154, 0.0032726116478443146, 0.03267969563603401, -0.020539075136184692, 0.00360112264752388, 0.004931234754621983, 0.004559874534606934, -0.008141357451677322, -0.011076531372964382, 0.002362065017223358, -0.01009099930524826, 0.024266960099339485, -0.013383249752223492, -0.0170682854950428, -0.03639329969882965, -0.02388131618499756, -0.025523871183395386, 0.010598048567771912, 0.04382050037384033, -0.004759837873280048, 0.015682825818657875, 0.01651124469935894, -0.007841412909328938, -0.03336528316140175, 0.009105466306209564, -0.009605374187231064, -0.01525433361530304, 0.02599521353840828, 0.01896793581545353, 0.028237657621502876, -0.0316227488219738, 0.0015764953568577766, 0.016268432140350342, -0.004981225356459618, -0.0020906864665448666, 0.02268153801560402, -0.008148499764502048, 0.03587910532951355, -0.018939370289444923, -0.006081023253500462, 0.009948167949914932, -0.019282164052128792, 0.013019030913710594, 0.013040455989539623, -0.02765205129981041, -0.0005324912490323186, 0.002274581231176853, 
-0.005809644237160683, -0.0031779862474650145, 0.003297606948763132, -0.0037993004079908133, 0.01726824790239334, 0.0062060002237558365, -0.039621274918317795, 0.01686832308769226, 0.00787712074816227, 0.014133111573755741, 0.022510141134262085, -0.005391864106059074, -0.031737010926008224, -0.008112791925668716, 0.03647899627685547, -0.014018846675753593, 0.026437988504767418, -0.05007649213075638, -0.01284763403236866, -0.07558608055114746, 0.023967014625668526, -0.023295709863305092, -0.0426492877304554, 0.02719499170780182, -0.03407943993806839, -0.0112907774746418, -0.03442223370075226, 0.030251571908593178, 0.0020174856763333082, 0.0030422969721257687, 0.0037314556539058685, -0.02398129738867283, 0.0012810140615329146, 0.012790502049028873, -0.004527737852185965, 0.04336344450712204, -0.007720007095485926, 0.02735210582613945, 0.008241339586675167, 0.015182917937636375, -0.015725675970315933, 0.012397716753184795, 0.016454113647341728, -0.02769489958882332, 0.018253780901432037, -0.010655180551111698, 0.026123760268092155, -0.009469685144722462, -0.024281242862343788, 0.0020246270578354597, -0.025880947709083557, -0.0012212037108838558, 0.029623115435242653, 0.02725212462246418, -0.015268616378307343, 0.009469685144722462, 0.011419326066970825, 0.02293863333761692, 0.01685403846204281, -0.017239682376384735, -0.026866480708122253, 0.013218994252383709, 0.0020049880258738995, -0.01515435241162777, -0.017253965139389038, 0.015982771292328835, -0.011583581566810608, 0.0411638468503952, -0.011976365931332111, 0.02760920114815235, 0.014997238293290138, -0.021610306575894356, -0.019796354696154594, 0.0038242957089096308, -0.016554094851017, 0.02735210582613945, -0.015682825818657875, -0.004059966653585434, 0.018396612256765366, 0.01621130108833313, 0.005523982923477888, 0.017296815291047096, -0.004338486585766077, -0.007055843714624643, -0.026238026097416878, -0.020110582932829857, -0.002717356663197279, -0.0030637215822935104, -0.01665407605469227, 
-0.013968856073915958, 0.0002680309989955276, 0.003583268728107214, -0.01306902151554823, 0.009148315526545048, 0.013026172295212746, 0.01221917849034071, 0.008291330188512802, -0.0371360182762146, 0.01836804673075676, 0.013161862269043922, -0.008991201408207417, -0.016539812088012695, -0.009998158551752567, 0.02695217914879322, 0.02639514021575451, -0.010783728212118149, -0.012833351269364357, -0.018596574664115906, -0.029280321672558784, -0.01646839641034603, 0.00023098425299394876, -0.012019215151667595, 0.029180340468883514, -0.0013318975688889623, -0.002590594347566366, 0.003088716883212328, 0.00528117036446929, 0.01515435241162777, -0.024024147540330887, 0.02016771398484707, -0.012861916795372963, -0.0027137859724462032, 0.003240474732592702, -0.009191164746880531, -0.002876256126910448, -0.005370439495891333, -0.029651682823896408, 0.014390206895768642, -0.007812847383320332, -0.00261023361235857, -0.008762671612203121, -0.009326853789389133, 0.015440014190971851, -0.017039719969034195, 0.02605234459042549, 0.008269906044006348, -0.014590170234441757, -0.017025435343384743, 0.015639977529644966, 0.016639793291687965, 0.018553726375102997, 0.023524239659309387, -0.02409556321799755, -0.0027869867626577616, 0.010733737610280514, 0.01437592413276434, -0.014997238293290138, 0.013983138836920261, 0.011626430787146091, -0.024609753862023354, 0.011640713550150394, -0.02369563654065132, -0.024966830387711525, 0.020910434424877167, 0.014925822615623474, -0.026423705741763115, 0.008984060026705265, -0.013111870735883713, 0.08918357640504837, 0.010133848525583744, -0.012533405795693398, -0.000324047461617738, -0.012304876931011677, 0.0165683776140213, -0.0005445425631478429, 0.00649880338460207, -0.007020135875791311, -0.01194779947400093, 0.009791053831577301, -0.029994476586580276, -0.028423337265849113, -0.037621643394231796, -0.011790686286985874, -0.011869242414832115, -0.01181925181299448, 0.037821605801582336, -0.00469199288636446, 0.002867329167202115, 
0.025266775861382484, 0.019282164052128792, -0.01306902151554823, -0.026509404182434082, -0.012519123032689095, 0.004042112734168768, 0.019896335899829865, 0.013940289616584778, 0.0023674212861806154, -0.04376336932182312, 0.02729497291147709, 0.018739406019449234, -0.03216550499200821, -0.025666702538728714, -0.03167987987399101, 0.013661770150065422, 0.018782256171107292, 0.000457505026133731, 0.02028197981417179, -0.0006208678241819143, -0.0033493831288069487, 0.007062985096126795, -0.015225767157971859, -0.0471913106739521, -0.0015398949617519975, 0.03350811451673508, -0.022567272186279297, -0.00828418880701065, -0.06633064150810242]

Now let's index those embeddings using Chroma, an easy-to-use local vector db:

import chromadb

client = chromadb.PersistentClient("/tmp/docsdb")
collection = client.create_collection("docs-qa")

collection.add(
    documents=chunks,
    embeddings=embeddings,
    ids=[str(index) for index, _ in enumerate(chunks)],
)

Cool, now we can easily build a function that retrieves documents from our vector db:

from typing import AsyncGenerator

import openai
from langstream import as_async_generator


def retrieve_documents(query: str, n_results: int) -> AsyncGenerator[str, None]:
    # Embed the query text so it can be compared against the indexed chunks
    query_embedding = (
        openai.Embedding.create(model="text-embedding-ada-002", input=query)
        .data[0]  # type: ignore
        .embedding
    )

    # Chroma returns one list of documents per query; we sent a single query
    results = collection.query(
        query_embeddings=query_embedding,
        n_results=n_results,
    )["documents"]
    results = results[0]  # type: ignore
    return as_async_generator(*results)


async for document in retrieve_documents("How to use GPT4All", n_results=2):
    print(document[20:200] + "...\n")
    GPT4All LLMs
LLMs require a lot of GPU to run properly make it hard for the common folk to set one up locally. Fortunately, the folks at GPT4All are doing an excellent job in reall...

LLMs
Large Language Models like GPT-4 is the whole reason LangStream exists, we want to build on top of LLMs to construct an application. After learning the Stream Basics, it should ...
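
To make the retrieval step less of a black box, here is a minimal sketch of what a nearest-neighbour search over embeddings does conceptually. It uses plain Python and tiny made-up vectors; the names `cosine_similarity`, `top_k` and the toy data are hypothetical and not part of LangStream or Chroma, and Chroma's actual index structure and default distance metric may differ from the cosine similarity assumed here.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(
    query_embedding: Sequence[float],
    embeddings: Sequence[Sequence[float]],
    documents: Sequence[str],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k documents whose embeddings are most similar to the query."""
    scored = [
        (doc, cosine_similarity(query_embedding, emb))
        for doc, emb in zip(documents, embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data (assumption): real embeddings have over a thousand dimensions
toy_documents = ["about streams", "about LLMs", "about memory"]
toy_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
toy_query = [0.9, 0.1, 0.0]

for doc, score in top_k(toy_query, toy_embeddings, toy_documents, k=2):
    print(f"{score:.3f}  {doc}")
```

A real vector db avoids comparing the query against every stored vector by using an approximate index, which is why we delegate this step to Chroma instead of looping ourselves.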

Execution

Finally, with all our documents indexed and our query function ready, we can now write a stream that answers the user's questions about those documents using an LLM:

from typing import Iterable
from langstream import Stream, filter_final_output
from langstream.contrib import OpenAIChatStream, OpenAIChatMessage, OpenAIChatDelta


def stream(query):
    return (
        Stream[str, str](
            "RetrieveDocumentsStream",
            lambda query: retrieve_documents(query, n_results=3),
        ).and_then(
            OpenAIChatStream[Iterable[str], OpenAIChatDelta](
                "AnswerStream",
                lambda results: [
                    OpenAIChatMessage(
                        role="system",
                        content="You are a helpful bot that helps users answering questions about documentation of the LangStream library",
                    ),
                    OpenAIChatMessage(role="user", content=query),
                    OpenAIChatMessage(
                        role="user",
                        content=f"Here are the results of the search:\n\n {' | '.join(list(results))}",
                    ),
                ],
                model="gpt-3.5-turbo",
                temperature=0,
            )
        )
    )(query)


async for output in filter_final_output(stream("How do I combine two streams?")):
    print(output.content, end="")
    To combine two streams in LangStream, you can use the `and_then()` function. This function allows you to compose two streams together by taking the output of one stream and using it as the input for another stream.

Here's an example of how to combine two streams using `and_then()`:

```python
from langstream import Stream

# Define the first stream
stream1 = Stream[str, str]("Stream1", lambda input: input.upper())

# Define the second stream
stream2 = Stream[str, str]("Stream2", lambda input: input + "!")

# Combine the two streams
combined_stream = stream1.and_then(stream2)

# Run the combined stream
output = combined_stream("hello")
print(output) # Output: "HELLO!"
```

In this example, `stream1` takes a string as input and converts it to uppercase. `stream2` takes a string as input and adds an exclamation mark at the end. The `and_then()` function combines `stream1` and `stream2` together, so the output of `stream1` is passed as input to `stream2`. Finally, we run the combined stream with the input "hello" and get the output "HELLO!".

You can stream together as many streams as you need using `and_then()`. Each stream will receive the output of the previous stream as its input.

Bonus: map-rerank document scoring

In the previous example, we used chromadb to return the top 3 results according to the embeddings. However, when you have many more documents, the top-scoring results from the vector db might not be the best ones, and we cannot simply send all the documents to the LLM because of the context length limit.

What we can do then is leverage the LLM's intelligence for better scoring: first return more results from the vector db search, then ask the LLM to score each of them in parallel, and finally pick just the best ones for the final answer.
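
Before looking at the LangStream version, the idea can be sketched with plain `asyncio`, leaving the LLM out entirely. In this sketch `fake_score` is a hypothetical stand-in for the LLM scoring call (it just counts shared words), and `map_rerank` and the candidate documents are made up for illustration; only the shape of the pattern (score everything concurrently, then sort and cut) is the point.

```python
import asyncio
from typing import List, Tuple


async def fake_score(query: str, document: str) -> int:
    """Hypothetical stand-in for the LLM: share of query words found in the document, 0-100."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    query_words = set(query.lower().split())
    doc_words = set(document.lower().split())
    if not query_words:
        return 0
    return round(100 * len(query_words & doc_words) / len(query_words))


async def map_rerank(query: str, documents: List[str], top_n: int) -> List[Tuple[str, int]]:
    # Map: score every candidate concurrently; gather keeps the input order
    scores = await asyncio.gather(*(fake_score(query, doc) for doc in documents))
    # Rerank: sort by score, highest first, and keep only the best top_n
    ranked = sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


candidates = [
    "streams can be composed with and_then",
    "how to add memory to a stream",
    "installing the library",
]
best = asyncio.run(map_rerank("how to add memory", candidates, top_n=2))
print(best)
```

Inside a notebook, where an event loop is already running, you would write `await map_rerank(...)` directly instead of calling `asyncio.run`.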

We can use the stream's gather to run the scoring in parallel. This is the implementation:

import json
from typing import Iterable, List, Tuple
from langstream import Stream, filter_final_output
from langstream.contrib import OpenAIChatStream, OpenAIChatMessage, OpenAIChatDelta
from openai_function_call import openai_function


def stream(query):
    @openai_function
    def score_document(score: int) -> int:
        """
        Scores the previous document according to the user query

        Parameters
        ----------
        score
            A number from 0-100 scoring how well the document matches the query
        """

        return score

    scoring_stream: Stream[str, int] = OpenAIChatStream[str, OpenAIChatDelta](
        "AnswerStream",
        lambda document: [
            OpenAIChatMessage(
                role="system",
                content="You are a scoring system that classifies documents from 0-100 based on how well they answer a query",
            ),
            OpenAIChatMessage(
                role="user",
                content=f"Query: {query}\n\nDocument: {document}",
            ),
        ],
        model="gpt-3.5-turbo",
        temperature=0,
        functions=[score_document.openai_schema],
        function_call={"name": "score_document"},
    ).map(
        lambda delta: score_document(**json.loads(delta.content))
        if delta.role == "function" and delta.name == "score_document"
        else 0
    )

    documents: List[str] = []

    def append_document(document):
        nonlocal documents
        documents.append(document)
        return document

    return (
        Stream[str, str](
            "RetrieveDocumentsStream",
            lambda query: retrieve_documents(query, n_results=10),
        )
        # Store retrieved documents to be used later
        .map(append_document)
        # Score each of them using another stream
        .map(scoring_stream)
        # Run it all in parallel
        .gather()
        # Pair up documents with scores
        .and_then(lambda scores: zip(documents, [s[0] for s in scores]))
        # Take the top 3 scored documents
        .and_then(
            lambda scored: sorted(list(scored)[0], key=lambda x: x[1], reverse=True)[:3]
        )
        # Now use them to build an answer
        .and_then(
            OpenAIChatStream[Iterable[List[Tuple[str, int]]], OpenAIChatDelta](
                "AnswerStream",
                lambda results: [
                    OpenAIChatMessage(
                        role="system",
                        content="You are a helpful bot that helps users answering questions about documentation of the LangStream library",
                    ),
                    OpenAIChatMessage(role="user", content=query),
                    OpenAIChatMessage(
                        role="user",
                        content=f"Here are the results of the search:\n\n {' | '.join([ doc for doc, _ in list(results)[0] ])}",
                    ),
                ],
                model="gpt-3.5-turbo",
            )
        )
    )(query)


async for output in filter_final_output(stream("How do I add memory?")):
    print(output.content, end="")
    To add memory to your LangStream library, you can follow these steps:

1. Define a memory variable: This can be a list, dictionary, or any other data structure that suits your needs. For example, you can define a list called `memory` to store previous chat messages.

2. Create a function to save messages to memory: This function should take a message as input and append it to the memory variable. For example, you can create a function called `save_message_to_memory` that appends the message to the `memory` list.

3. Update the stream to include memory operations: Depending on your use case, you may need to update your stream to include memory operations. This can be done using the `map` function to apply the memory operations to each message in the stream. For example, you can use the `map` function to call the `save_message_to_memory` function for each message in the stream.

Here is an example code snippet that demonstrates how to add memory to a LangStream library:

```python
from langstream import Stream

# Define the memory variable
memory = []

# Function to save messages to memory
def save_message_to_memory(message):
memory.append(message)
return message

# Create the stream
stream = Stream().map(save_message_to_memory)

# Use the stream
output = stream("Hello, world!")
print(output)

# Access the memory
print(memory)
```

In this example, the `save_message_to_memory` function appends each message to the `memory` list. The `map` function is used to apply the `save_message_to_memory` function to each message in the stream. Finally, the `memory` variable can be accessed to retrieve the stored messages.

Remember to customize the code according to your specific requirements and use case.

That's it, it works! If there is anything you don't understand in this example, join our Discord channel and share your questions with the community.