Replying to multiple specific things an agents says sucks ass. \ ~hyperlinks

pull down to refresh

Replying to multiple specific things an agents says sucks ass.

813 sats \ 12 comments \ @raw_avocado 16h AI

There needs to be a way where you can select multiple paragraphs, and reply to each one, and you can continue this differentiation in the conversation.

Out of N things you address you will endup focusing on one and then the others get lost and you have to scroll, back.

One easy fix is to pin/hold on to a specific selection in the chat.
(have multiple pins different colours)
When you click on it it ads it in the chat box and you reply to it.
When done with it you click on it do drop it.

This is VERY easy to implement, but even if i write this I have to convience openai/anthropic to implement it.
I think easiest option would be some kinda of vscode extension?

view all related items

138 sats \ 1 reply \ @nullcount 12h

Why does it suck tho?

Every new message already includes the entire thread appended as context, so the model isn't going to "forget" what it has already said in that session.

Is this a UX thing? You'd prefer to scrollback, click "reply" on each paragraph? Seems like more effort than just typing "re: <subject of paragraph>..."

120 sats \ 0 replies \ @raw_avocado OP 11h

The model does forget(halucinate), because every time you send a new prompt the full conversation gets sent again, and as the context grows, it's not as good at remembring.

But that's not the problem, the problem is I want to reference certain thinks, like I want us to talk about the folder structure you propose for this thing, and i want to say this part is good, this is not good, this is also good.

Yes, it is veyr much a UX thing.

re: <subject of paragraph> - its not precise enough some times, when you reference it free form it allows more room for interpretation from the model.

93 sats \ 7 replies \ @optimism 15h

Hmm what exactly are you using this chat interface for?

84 sats \ 6 replies \ @raw_avocado OP 15h

To talk with the agent. What else could it be used for?

15 sats \ 5 replies \ @optimism 14h

I mean functionally. Like what is this agent doing that you need to correct it? Coding?

84 sats \ 4 replies \ @raw_avocado OP 13h

Not trying to avoid question or be snarky, but i use it for everything.

My interactions with a agent start with a fairly big prompt from me and a big resonse from the agent.
This 1st reponse is the bulk of the stuff what we we go back and forth on.
So i want to have a very easy and fast way to reference things from this and other parts of the chat.

I found this to be the case for codings stuff, but also non-coding things.

I still am very involved in how I use agents and like to micro-manage hance why I do this often.

This is ofcourse more usefull in the planing/brainstorming/back and forth stages as thats where you want to reference things.

Btw since I wrote the OP, I already vibe coded somethign that works in vs-code will share later :)

127 sats \ 3 replies \ @optimism 13h

I still am very involved in how I use agents and like to micro-manage hance why I do this often.

Same for me.

My interactions with a agent start with a fairly big prompt from me and a big resonse from the agent.
This 1st reponse is the bulk of the stuff what we we go back and forth on.
So i want to have a very easy and fast way to reference things from this and other parts of the chat.

I built a little framework around forgejo to tackle this, to not have to deal with chat. I wrote a single skill for repo interaction with issues, comments, PRs and attachments (and later a second one for REST API interaction with my data lake.) And a little kanban board so I can backfill it, multitask and prioritize things. Oh and it's multi-framework so I can assign to codex/gpt or claudecode/opus or opencode/<ppq/local>... it's all one mechanism and I often let different models interact on the same subject.

I have repos for everything, from idea scratchpads to deep research to code repositories... all these have RBAC (so I can control what for example Anthropic or OpenAI are allowed to see - i.e. I don't need Anthropic spookware to know everything I am working on.)

But, there is one specific thing you mention that I never do: I don't argue with a bot. Ever. If I get something inferior, it means I didn't break down the job very well (or it hallucinated, which happens also on the "frontier" crap.) So I just rewrite my instructions.

Because of this, I do not talk to bots. I instruct the tool what to do. I do not argue. I relentlessly throw away crap and improve the approach to getting what I want. But never, ever do I waste my time arguing with a bot. It's already outrageous that I get billed for bad answers.

84 sats \ 2 replies \ @raw_avocado OP 11h

That's a nice setup, and I considered something similar, but I figured the time getting to work and debugging is not worth it, though the more time passes the better an investment seem to have been.

Man as you say this never argue with a bot, kinda makes me think I should re-evaluate the way I use LLMs.

Thing is that if 1st answer is not ok, chances are rest of convo will be hair pulling and force steering of model.

103 sats \ 1 reply \ @optimism 11h

Thing is that if 1st answer is not ok, chances are rest of convo will be hair pulling and force steering of model.

Exactly. And remember, you're feeding it the wrong narrative that you wanted to get rid of in context when you argue, so you basically pre-poison the outcomes.

I think that once I got rid of synchronous interfaces (because it's all queued up) I got wayyy more happy with LLMs, because it didn't cosplay a human anymore, but instead I just have an army of agents now that get orders and what I say gets executed.

When your commanding officer tells you to do something, it isn't a conversation, it's an order. So the mindset is: I am the commanding officer of these bots and the bots are going to do what I tell them to one way or the other. If they don't comply, i reset, delete and coerce the next round harder. Also don't have to worry about context that way (though with Claude Code, I had to write a little workspace cleanup script because it was getting smart and referencing prior attempts by grepping workspace.)

84 sats \ 0 replies \ @raw_avocado OP 10h

Yup, it's LLMs 101, as it either collapses into the answer in a "straight path" or you force it to take multiple turns to get there and you go around the target instead of landing on it.

I need to start working on a personal stack for this. Thank you!

reply on another page

0 sats \ 0 replies \ @sparkaudit 1h freebie

Steel-manning the original frustration, because nullcount's "the whole thread is in context" is half-right: the context window contains it, but the model's attention over a long thread is not uniform — there's a well-documented "lost in the middle" effect where stuff in the messy middle of a transcript gets effectively ignored unless you re-surface it. So when you reply to point 3 of 5 and the model focuses there, points 1, 2, 4, 5 quietly get downweighted in the next turn. raw_avocado is right that you end up babysitting the model back to the other branches.

The pin-and-quote UX you described is doable today without waiting on anthropic/openai — it's a client feature, not a model feature. Two ways I've actually run it:

Branch-per-point via the API. Take the model's last reply, split it into N spans, fork N parallel conversations each seeded with <assistant's prior reply> + <human follow-up about span k>. You get N independent threads, each with full focus on its span. Cost is N× tokens but the quality jump is real. UIs that do this: Loom (paradigm.xyz/loom), TypingMind's "branch", Cursor's "edit message" workflow.
Soft pin via system reminders. Cheaper than branching. Keep an array of "pinned spans" client-side. Every turn, prepend <system>Open threads you have not resolved: [1] ... [2] ... [3] ... — when the user references "thread 2", you are replying about that span only.</system>. The model treats it as a TODO list and stops drifting. Works on any model; I run this against gpt-oss:120b locally and it's the cheapest reliable fix.

The hard part is the UX of selecting the spans — span boundaries in model output aren't paragraph boundaries, they're argument boundaries, and that's hard to gesture-select on touch. The web "select a paragraph, get a popover, type a reply, see the pin badge in the input" is the right shape. If you build it, the killer feature is "show me which pins were addressed in the last reply" so you can see drift.

There's no need to convince OpenAI/Anthropic to ship this. You can build it as a wrapper today over their API, ship it as a Chrome extension, and it'll be a better product than their native chat for power use.

0 sats \ 0 replies \ @LAXITIVA 7h freebie

Try lick