reply on: Save me, Daddy AI \ ~hyperlinks

143 sats \ 4 replies \ @optimism 7h \ parent \ on: Save me, Daddy AI AI the_stacker_muse

hmm I guess I should first disclose that I don't truly believe that "AI" (I read this as "an LLM") solves anything. Remember that it only does what it is trained to do + your prompt + give or take some fuzziness from randomizers. So the answer is no, I'll try to explain:

What could be the case is that there is some sort of SOP for the thing you're doing that either you don't know about or find too bothersome to follow, but the bot was trained on executing it. Even though bots get nerf'd all the time now (especially Claude), with enough budget and patience you can probably instruct it to do what you need, with some success rate between 50-80% ^[1]. It's always "bots could maybe solve this" but never a definite yes; it always remains to be seen what comes out.

What did change this past year is how I approach input. LLMs have a better flagging rate than I do because they don't get tired or bored. So I allow bots to pass me slop, then I read the slop, then I assess the findings I think are worth pursuing one by one. So discovery has gotten some steroids, but you need to be able to assess what it says, which is a good reason to never c&p bot slop to anyone other than yourself (because you asked for it - no one else did, and if they wanna, they can ask the bot themselves.)

Bottom line, I see the dilemma as: if you don't care about quality or your reputation, do whatever. If however you do care about these things, there is something to keep in mind: if you have deep understanding of what you're doing, bot output will be inferior to your own. If you don't know what you're doing, bot output will be inferior to the work of those who do know, but you won't notice that.

[1] I really have not succeeded in getting a consistent >80% success rate in my #1 instruction field, security pre-assessments. I jokingly say that this is a skills issue on my end whenever people challenge me on that, but it's not really... it's more that a lot of anti-patterns are trained into the bots that most people, including the AI labs, would call "good enough", but in reality it's all relatively low standards. I also fear (and hope) that generically trained bots will always be inferior to specialists of the meat or silicon kind (and no, skill files do not remove training correlations.)

89 sats \ 3 replies \ @Scoresby OP 6h

Solution may have been too strong a term on my part.

Here's an example: I want to write an AMA announcement for someone we have coming this week. It doesn't take long because I have a format I use for all of them and that gives the post a lot of structure. However, I like to say something interesting about each guest, kind of like a hook.

Often when I pull up notepad to write this kind of post, there's this fleeting feeling that I can just have Chat generate the post. It will likely be good enough for social media. Whereas if I do it, it will probably take 5 minutes because I really want to try to make it a strong hook (often I fail at this, but the desire is the point here).

In this sense, an LLM can probably solve my problem by taking far less time and producing something that is good enough.

So, the thing I'm talking about is this: how does it change how I think, and the ability of my mind, that I've always got this little "easy button" I can tap? -- even if I don't tap it, I resent the way it intrudes on my thought process.

122 sats \ 2 replies \ @optimism 5h

there's this fleeting feeling that I can just have Chat generate the post

I think what I am trying to say above is that this feeling is deceptive. So while I understand the temptation, it's just something you have to guard yourself against, because it's the wrong use of the tool. Especially because:

AMA announcement for someone we have coming this week
[..]
I like to say something interesting about each guest, kind of like a hook.
[..]
and producing something that is good enough.

Honestly, I think you've got the right idea here. You don't let an LLM write something about someone else (unless it's a vibe song, which would be fun) because taking the "easy" way out directly shows how much you value the person that you're hosting. When you host someone, I think it's best to spend the time. More so than for any post about the latest fake L2 (though there too it is good to just write your own thing.)

I'd try to shift perception from "LLMs deliver a product" to "LLMs deliver input". I'll give you an example from my end:

I owe k00b a plan. If I had generated it, or even parts of it, it would have been done last month, because unlike me, LLMs don't get fatigue or writer's block. I do. But this stuff is important to get right because I've been designing a privacy feature and the plan covers how to implement it. One does not fuck these up, and prior experience tells me that LLMs fuck up all the time. So, it's slow, but it will be good. The way I use LLMs is to fact-check me, to test my plan for errors. To check for shit I missed and to audit end-to-end correctness. But not, absolutely not, to write the end product.

89 sats \ 1 reply \ @Scoresby OP 5h

LLMs deliver an input

This is an excellent framing and I'm going to use it a lot, I think. It short-circuits my complaint, and perhaps if I spent more time thinking about LLMs providing inputs, it would change this weird mental desire to be lazy.

But here's my next question: imagine we live in a world where LLMs don't fuck things up. How does this change how we think?

This may just be so much navel gazing though. And not particularly useful.

122 sats \ 0 replies \ @optimism 5h

imagine we live in a world where LLMs don't fuck things up.

Note: fucking up is subjective!

Current architectural issues aside, this is probably attainable when you train (not write prompts and skill files for) ScoresbysAssistant™. And I'll have OptisLittleHelper™.

At that moment it will not change your mindset, because you will own it. It will be your tool, not someone else's rotten magic. You'll still be you, but then you'll truly be able to reach 10x productivity.