Weirdly, counseling. And it scares me.

The thing is, sometimes I just need someone to bounce a few thoughts off of, but I don't necessarily want to bother anyone or come off as complaining. So I reach for the AI. And it tends to be pretty helpful as a thought partner to just think through stuff.

I do worry about the error rate. What if it's telling me stuff that makes me feel better but is just wrong? But I don't have strong evidence that AI is more likely to lead me astray than another human being.

As long as you don't let it keep cross-session context (no memory) and you stay mindful of individual session length, error compounding shouldn't be too much of an issue, unless it's trained in goblin-bias. I remember you saying in the past that you do both, so I think this particular risk should be relatively low (compared to both past performance and someone just yoloing all the settings to max retention).

When in doubt, ask Arena.

reply

I definitely keep memory off and start a new chat for every new conversation/topic. Hopefully that's enough to keep things mostly on track and rational.

reply

It helps a lot. One other thing that may help: when a conversation gets an answer that missed the mark, throw it away and clarify the question in a new session instead of arguing. Arguing definitely poisons the context, and so do the initial tokens of a response that missed the mark. Cleaner that way.

reply

Yes... I do argue sometimes, but it definitely poisons the well, in terms of triggering sycophancy.

Lately, I've found Opus to be more sycophantic, and ChatGPT to be almost too obstinate (or too unwilling to explore heterodox opinions).

reply

Opus has been tuned for instruction following so I'd use that model to make it do things; it's been trained on Claude Code conversations. GPT is trained to solve "complex problems".

Give the models what they are trained for: ask GPT open research questions, maybe add some roleplay. For Claude, just order it outright to research something.

If you need validation, ask the question you think you have the answer to without revealing the answer.
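
A minimal sketch of that last tip, assuming plain prompt strings and no particular model API. The function names are made up for illustration; the point is just that the "blind" prompt never contains the answer you're hoping for, so the model can't simply agree with you.

```python
def leading_prompt(question: str, my_answer: str) -> str:
    """What to avoid: the prompt reveals the answer I'm hoping for."""
    return f"{question} I think the answer is {my_answer}. Am I right?"


def blind_prompt(question: str) -> str:
    """Ask the same question without revealing my own answer."""
    return f"{question} Give your answer and your reasoning."


def is_blind(prompt: str, my_answer: str) -> bool:
    """Crude check: True if my answer does not appear verbatim in the prompt."""
    return my_answer.lower() not in prompt.lower()


# Hypothetical example question and privately held answer.
question = "Is it reasonable to turn down a promotion that comes with a relocation?"
my_answer = "turning it down is fine"

print(is_blind(blind_prompt(question), my_answer))
print(is_blind(leading_prompt(question, my_answer), my_answer))
```

Then compare the model's answer to your own afterwards, outside the chat. The check is only a verbatim match, so a paraphrased hint in the question will still leak.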

reply

good thoughts

do you use any models outside of Claude and GPT?

reply

For chat? I think I only use chat once a month. Grok works fine too. If I really have a one-off that doesn't fit in the framework, I just use Arena and yolo me an answer.

For dev-adjacent work I mainly use Claude, with GPT and GLM as secondaries. I used to use Kimi before GLM-5 was released.

For operational/integrated LLM flows for work (these must be local - no spyware!!!) I use mostly Gemma and Jan for structured output work and Qwen for embeddings.

reply