

AI trained for treachery becomes the perfect agent - The Register www.theregister.com/2025/09/29/when_ai_is_trained_for/

257 sats \ 1 comment \ @Scoresby 30 Sep 2025 AI

related

How to turn LLM Pinocchio into a real boy

12.7k sats \ 10 comments \ @Scoresby 7 Oct 2025 AI

AI model finally learns to say ‘I don’t know’ www.independent.co.uk/tech/ai-model-chatbot-overconfidence-don-t-know-b2974020.html

474 sats \ 1 comment \ @south_korea_ln 12 May AI science

Is AI really trying to escape human control and blackmail people? arstechnica.com/information-technology/2025/08/is-ai-really-trying-to-escape-human-control-and-blackmail-people/

193 sats \ 5 comments \ @jakoyoh629 14 Aug 2025 AI

Treat Agent Output Like Compiler Output skiplabs.io/blog/codegen_as_compiler

602 sats \ 3 comments \ @k00b 4 May AI devs

The personhood trap: How AI fakes human personality arstechnica.com/information-technology/2025/08/the-personhood-trap-how-ai-fakes-human-personality/

201 sats \ 1 comment \ @jakoyoh629 28 Aug 2025 AI

Brainworm - Hiding in Your Context Window | Origin www.originhq.com/blog/brainworm

564 sats \ 1 comment \ @Scoresby 5 Mar AI

AI Still Can't Think: Apple’s New Study Dispels the Myth

365 sats \ 2 comments \ @lunin 31 Jul 2025 AI

Open Source and America's AI Action Plan

10.5k sats \ 13 comments \ @optimism 27 Jul 2025 AI

The assistant axis: situating and stabilizing the character of LLMs - Anthropic www.anthropic.com/research/assistant-axis

400 sats \ 1 comment \ @Scoresby 20 Jan AI

among-llms: You are the only impostor. One wrong word and they'll tear you apart github.com/0xd3ba/among-llms

310 sats \ 2 comments \ @m0wer 15 Sep 2025 AI

AI Agents Unleashed: Navigating a World of Intelligent Autonomy in 2025

1185 sats \ 0 comments \ @satcat 7 Mar 2025 AI

Boffins craft certified way for AI to unlearn private data www.theregister.com/2025/09/04/boffins_detail_ai_mind_wipe/

252 sats \ 6 comments \ @0xbitcoiner 5 Sep 2025 AI

AI Agents vs Cybersecurity Professionals in Real-World Penetration Testing arxiv.org/abs/2512.09882

194 sats \ 2 comments \ @optimism 13 Dec 2025 AI

The Normalization of Deviance in AI embracethered.com/blog/posts/2025/the-normalization-of-deviance-in-ai/

348 sats \ 1 comment \ @0xbitcoiner 5 Dec 2025 AI

Here’s What’s Really Going On Inside An LLM’s Neural Network

216 sats \ 0 comments \ @0xbitcoiner 22 May 2024 BooksAndArticles

Distillation, Experimentation, and Integration of AI for Adversarial Use cloud.google.com/blog/topics/threat-intelligence/distillation-experimentation-integration-ai-adversarial-use

335 sats \ 0 comments \ @0xbitcoiner 13 Feb AI

Given Enough Agents, All Bugs Become Shallow embracethered.com/blog/posts/2026/given-enough-agents-all-bugs-become-shallow/

863 sats \ 3 comments \ @0xbitcoiner 9 Apr AI

Detecting and reducing scheming in AI models - OpenAI openai.com/index/detecting-and-reducing-scheming-in-ai-models/

257 sats \ 1 comment \ @Scoresby 17 Sep 2025 AI

Current AI Models Have 3 Unfixable Problems • Sabine Hossenfelder youtu.be/984qBh164fo

261 sats \ 2 comments \ @BlokchainB 19 Oct 2025 videos

Google releases VaultGemma, its first privacy-preserving LLM arstechnica.com/ai/2025/09/google-releases-vaultgemma-its-first-privacy-preserving-llm/

253 sats \ 0 comments \ @0xbitcoiner 15 Sep 2025 AI

AI Agent Traps - Your AI agents may be getting manipulated

407 sats \ 0 comments \ @gmd 6 Apr AI