pull down to refresh
The model does forget(halucinate), because every time you send a new prompt the full conversation gets sent again, and as the context grows, it's not as good at remembring.
But that's not the problem, the problem is I want to reference certain thinks, like I want us to talk about the folder structure you propose for this thing, and i want to say this part is good, this is not good, this is also good.
Yes, it is veyr much a UX thing.
re: <subject of paragraph> - its not precise enough some times, when you reference it free form it allows more room for interpretation from the model.
There are many ways I can think of to implement this and each results in different set of input tokens and tradeoffs. So I figure this feature is highly opinionated in how it's implemented -- is probably why it hasn't been released yet..
How you would represent this concept of "reply-to" as input tokens?
Transformers work by assigning a relationship-value from every token to every other token. In a normal chat interface, you could just reference the message ID and render it differently in the thread (as a link to the original message) to represent a reply.
But passing a message ID/link as input tokens to the model would waste its attention on trying to decipher the message ID.
You could copy-paste the entire paragraph you're referencing as input to the model. But then you're wasting context because that paragraph already exists in the context.
This is why something like "re: subject of paragraph" is efficient. It uses few tokens, does not use any additional ID/linking system. But it does require you to point out the subject yourself and as you say, it not precise enough.
Why does it suck tho?
Every new message already includes the entire thread appended as context, so the model isn't going to "forget" what it has already said in that session.
Is this a UX thing? You'd prefer to scrollback, click "reply" on each paragraph? Seems like more effort than just typing "re: <subject of paragraph>..."