BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs arxiv.org/abs/2510.04721

210 sats \ 1 comment \ @jakoyoh629 25 Oct 2025 AI

related

Hallucination Stations On Some Basic Limitations of Transformer-Based LM arxiv.org/pdf/2507.07505

213 sats \ 0 comments \ @0xbitcoiner 23 Jan AI

An OpenAI model solved a famous math problem that stumped humans for 80 years arstechnica.com/ai/2026/06/openais-math-breakthrough-played-to-ais-strengths/

338 sats \ 0 comments \ @0xbitcoiner 1 Jun AI math

The ORCA Benchmark Evaluates How Well AIs Deal with Everyday Math www.omnicalculator.com/reports/omni-research-on-calculation-in-ai-benchmark

260 sats \ 0 comments \ @0xbitcoiner 27 Feb AI

If LLMs Have Human-Like Attributes, Then So Does Age of Empires II arxiv.org/abs/2605.31514

608 sats \ 1 comment \ @Scoresby 6 Jun lol AI

To Make Language Models Work Better, Researchers Sidestep Language www.quantamagazine.org/to-make-language-models-work-better-researchers-sidestep-language-20250414/

210 sats \ 0 comments \ @0xbitcoiner 15 Apr 2025 AI

Large Language Models Pass the Turing Test arxiv.org/pdf/2503.23674

374 sats \ 11 comments \ @south_korea_ln 15 Apr 2025 AI

Why language models hallucinate - OpenAI openai.com/index/why-language-models-hallucinate/

438 sats \ 4 comments \ @Scoresby 6 Sep 2025 AI

Mathematicians issue a major challenge to AI—show us your work www.scientificamerican.com/article/mathematicians-launch-first-proof-a-first-of-its-kind-math-exam-for-ai/

1145 sats \ 4 comments \ @south_korea_ln 14 Feb AI science

How Terry Tao Became an Evangelist for AI in Math www.quantamagazine.org/how-terry-tao-became-an-evangelist-for-ai-in-math-20260608/

1562 sats \ 2 comments \ @0xbitcoiner 9 Jun AI math science

To Have Machines Make Math Proofs, Turn Them Into a Puzzle www.quantamagazine.org/to-have-machines-make-math-proofs-turn-them-into-a-puzzle-20251110/

268 sats \ 0 comments \ @0xbitcoiner 11 Nov 2025 AI

Is Chain-of-Thought Reasoning of LLMs a Mirage? arxiv.org/abs/2508.01191

427 sats \ 9 comments \ @optimism 7 Aug 2025 AI

LLMs Can Get Brain Rot llm-brain-rot.github.io/

287 sats \ 0 comments \ @Scoresby 21 Oct 2025 AI

LLMs and the Specter of the Cognitive Black Hole www.psychologytoday.com/us/blog/the-digital-self/202403/llms-and-the-specter-of-the-cognitive-black-hole

200 sats \ 0 comments \ @ch0k1 22 Mar 2024 science

Debate May Help AI Models Converge on Truth www.quantamagazine.org/debate-may-help-ai-models-converge-on-truth-20241108/

258 sats \ 0 comments \ @0xbitcoiner 8 Nov 2024 science

How to turn LLM Pinocchio into a real boy

12.7k sats \ 10 comments \ @Scoresby 7 Oct 2025 AI

Financial Statement Analysis with Large Language Models papers.ssrn.com/sol3/papers.cfm?abstract_id=4835311&fbclid=IwY2xjawIJNupleHRuA2FlbQIxMAABHWJxn71ESvZCS0FxEF_31oro1rwtk4rlgOst5Q4A6tuxDhxB9cgZBPizAg_aem_OAMNHiz7Vyv2bb2vt2yM0Q

222 sats \ 2 comments \ @scatman 31 Jan 2025 AI

When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection arxiv.org/abs/2510.04849v1

433 sats \ 2 comments \ @optimism 19 Oct 2025 AI

Bringing LLMs to the edge www.raspberrypi.com/news/bringing-llms-to-the-edge/

285 sats \ 2 comments \ @0xbitcoiner 26 May AI DIY

Vibe physics www.math.columbia.edu/~woit/wordpress/?p=15012

2355 sats \ 4 comments \ @south_korea_ln 1 Aug 2025 science

Transformer based AI will not lead us to AGI/ASI and is just a hype machine

3139 sats \ 18 comments \ @cy 2 Jul 2025 AI

In a First, AI Models Analyze Language As Well As a Human Expert www.quantamagazine.org/in-a-first-ai-models-analyze-language-as-well-as-a-human-expert-20251031/

274 sats \ 0 comments \ @0xbitcoiner 31 Oct 2025 AI