Maheboob Patel
Apr 16, 2024


Hi,

I agree with your point that latency will decrease over subsequent calls, and that passing the full document will most likely yield the best responses.

However, it still feels counter-intuitive to pass megabytes of documents in every call: there is network overhead, token cost, and, most importantly, there will be use cases with scarce resources, such as IoT devices.

I feel the best solution will be something like: (a) 'train' (fine-tune) the model on the document being discussed, or (b) pass the document once and reuse it as many times as you want.

In fact, the OpenAI Assistants API supports the latter.
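As a rough sketch of option (b), assuming the Assistants API with the retrieval tool enabled (the file name and prompt here are illustrative, and an `OPENAI_API_KEY` is assumed in the environment):

```python
# Sketch: upload a document once, then reuse it across many queries.
# "report.pdf" and the prompt text are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

# Upload the document a single time.
doc = client.files.create(file=open("report.pdf", "rb"), purpose="assistants")

# Attach it to an assistant with the retrieval tool enabled.
assistant = client.beta.assistants.create(
    model="gpt-4-turbo",
    tools=[{"type": "retrieval"}],
    file_ids=[doc.id],
)

# Each subsequent question references the assistant, not the raw document,
# so the megabytes are not re-sent on every call.
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="Summarize section 2."
)
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
```

This keeps the per-call payload small: only the question travels over the network, while the document stays server-side.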

Those are my thoughts, and I could be wrong.
