Maheboob Patel
Apr 16, 2024


Hi,

I agree with your point that latency will decrease over subsequent calls, and that passing the full document will most likely yield the best responses.

However, it still feels counter-intuitive to pass megabytes of documents in every call: there is network overhead, token cost, and, most importantly, there will be use cases with scarce resources, such as IoT devices.

I feel the best solution will be something like: (a) 'train' (fine-tune) the model on the document being discussed, or (b) pass the document once and reuse it as many times as you want.

In fact, the OpenAI Assistants API supports the latter.
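As a rough sketch of option (b), assuming the Assistants API with the retrieval tool enabled (the file name and prompt here are illustrative, and an `OPENAI_API_KEY` is assumed in the environment):

```python
# Sketch: upload a document once, then reuse it across many queries.
# "report.pdf" and the prompt text are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

# Upload the document a single time.
doc = client.files.create(file=open("report.pdf", "rb"), purpose="assistants")

# Attach it to an assistant with the retrieval tool enabled.
assistant = client.beta.assistants.create(
    model="gpt-4-turbo",
    tools=[{"type": "retrieval"}],
    file_ids=[doc.id],
)

# Each subsequent question references the assistant, not the raw document,
# so the megabytes are not re-sent on every call.
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="Summarize section 2."
)
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
```

This keeps the per-call payload small: only the question travels over the network, while the document stays server-side.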

Those are my thoughts, and I could be wrong.
