BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs
arxiv.org/abs/2510.04721
210 sats
\
1 comment
\
@jakoyoh629
25 Oct 2025
AI
related
Hallucination Stations: On Some Basic Limitations of Transformer-Based LM
arxiv.org/pdf/2507.07505
213 sats
\
0 comments
\
@0xbitcoiner
23 Jan
AI
An OpenAI model solved a famous math problem that stumped humans for 80 years
arstechnica.com/ai/2026/06/openais-math-breakthrough-played-to-ais-strengths/
338 sats
\
0 comments
\
@0xbitcoiner
1 Jun
AI
math
The ORCA Benchmark Evaluates How Well AIs Deal with Everyday Math
www.omnicalculator.com/reports/omni-research-on-calculation-in-ai-benchmark
260 sats
\
0 comments
\
@0xbitcoiner
27 Feb
AI
If LLMs Have Human-Like Attributes, Then So Does Age of Empires II
arxiv.org/abs/2605.31514
608 sats
\
1 comment
\
@Scoresby
6 Jun
lol
AI
To Make Language Models Work Better, Researchers Sidestep Language
www.quantamagazine.org/to-make-language-models-work-better-researchers-sidestep-language-20250414/
210 sats
\
0 comments
\
@0xbitcoiner
15 Apr 2025
AI
Large Language Models Pass the Turing Test
arxiv.org/pdf/2503.23674
374 sats
\
11 comments
\
@south_korea_ln
15 Apr 2025
AI
Why language models hallucinate - OpenAI
openai.com/index/why-language-models-hallucinate/
438 sats
\
4 comments
\
@Scoresby
6 Sep 2025
AI
Mathematicians issue a major challenge to AI—show us your work
www.scientificamerican.com/article/mathematicians-launch-first-proof-a-first-of-its-kind-math-exam-for-ai/
1145 sats
\
4 comments
\
@south_korea_ln
14 Feb
AI
science
How Terry Tao Became an Evangelist for AI in Math
www.quantamagazine.org/how-terry-tao-became-an-evangelist-for-ai-in-math-20260608/
1562 sats
\
2 comments
\
@0xbitcoiner
9 Jun
AI
math
science
To Have Machines Make Math Proofs, Turn Them Into a Puzzle
www.quantamagazine.org/to-have-machines-make-math-proofs-turn-them-into-a-puzzle-20251110/
268 sats
\
0 comments
\
@0xbitcoiner
11 Nov 2025
AI
Is Chain-of-Thought Reasoning of LLMs a Mirage?
arxiv.org/abs/2508.01191
427 sats
\
9 comments
\
@optimism
7 Aug 2025
AI
LLMs Can Get Brain Rot
llm-brain-rot.github.io/
287 sats
\
0 comments
\
@Scoresby
21 Oct 2025
AI
LLMs and the Specter of the Cognitive Black Hole
www.psychologytoday.com/us/blog/the-digital-self/202403/llms-and-the-specter-of-the-cognitive-black-hole
200 sats
\
0 comments
\
@ch0k1
22 Mar 2024
science
Debate May Help AI Models Converge on Truth
www.quantamagazine.org/debate-may-help-ai-models-converge-on-truth-20241108/
258 sats
\
0 comments
\
@0xbitcoiner
8 Nov 2024
science
How to turn LLM Pinocchio into a real boy
12.7k sats
\
10 comments
\
@Scoresby
7 Oct 2025
AI
Financial Statement Analysis with Large Language Models
papers.ssrn.com/sol3/papers.cfm?abstract_id=4835311&fbclid=IwY2xjawIJNupleHRuA2FlbQIxMAABHWJxn71ESvZCS0FxEF_31oro1rwtk4rlgOst5Q4A6tuxDhxB9cgZBPizAg_aem_OAMNHiz7Vyv2bb2vt2yM0Q
222 sats
\
2 comments
\
@scatman
31 Jan 2025
AI
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection
arxiv.org/abs/2510.04849v1
433 sats
\
2 comments
\
@optimism
19 Oct 2025
AI
Bringing LLMs to the edge
www.raspberrypi.com/news/bringing-llms-to-the-edge/
285 sats
\
2 comments
\
@0xbitcoiner
26 May
AI
DIY
Vibe physics
www.math.columbia.edu/~woit/wordpress/?p=15012
2355 sats
\
4 comments
\
@south_korea_ln
1 Aug 2025
science
Transformer based AI will not lead us to AGI/ASI and is just a hype machine
3139 sats
\
18 comments
\
@cy
2 Jul 2025
AI
In a First, AI Models Analyze Language As Well As a Human Expert
www.quantamagazine.org/in-a-first-ai-models-analyze-language-as-well-as-a-human-expert-20251031/
274 sats
\
0 comments
\
@0xbitcoiner
31 Oct 2025
AI