AI is actually bad at math, ORCA shows
www.theregister.com/2025/11/17/ai_bad_math_orca/
197 sats
\
4 comments
\
@0xbitcoiner
18 Nov 2025
AI
related
The ORCA Benchmark Evaluates How Well AIs Deal with Everyday Math
www.omnicalculator.com/reports/omni-research-on-calculation-in-ai-benchmark
260 sats
\
0 comments
\
@0xbitcoiner
27 Feb
AI
AI model finally learns to say ‘I don’t know’
www.independent.co.uk/tech/ai-model-chatbot-overconfidence-don-t-know-b2974020.html
474 sats
\
1 comment
\
@south_korea_ln
12 May
AI
science
The AI Revolution in Math Has Arrived
www.quantamagazine.org/the-ai-revolution-in-math-has-arrived-20260413/
355 sats
\
1 comment
\
@0xbitcoiner
13 Apr
math
AI
In a First, AI Models Analyze Language As Well As a Human Expert
www.quantamagazine.org/in-a-first-ai-models-analyze-language-as-well-as-a-human-expert-20251031/
274 sats
\
0 comments
\
@0xbitcoiner
31 Oct 2025
AI
To Make Language Models Work Better, Researchers Sidestep Language
www.quantamagazine.org/to-make-language-models-work-better-researchers-sidestep-language-20250414/
210 sats
\
0 comments
\
@0xbitcoiner
15 Apr 2025
AI
Demis Hassabis Thinks AI Job Cuts Are Dumb
www.wired.com/story/demis-hassabis-ai-layoffs-deepmind-google-io/
426 sats
\
2 comments
\
@0xbitcoiner
19 May
AI
Is language the same as intelligence? The AI industry desperately needs it to be
www.theverge.com/ai-artificial-intelligence/827820/large-language-models-ai-intelligence-neuroscience-problems
645 sats
\
9 comments
\
@0xbitcoiner
25 Nov 2025
AI
Debate May Help AI Models Converge on Truth
www.quantamagazine.org/debate-may-help-ai-models-converge-on-truth-20241108/
258 sats
\
0 comments
\
@0xbitcoiner
8 Nov 2024
science
Wairdle - a followup about AI's struggles
434 sats
\
11 comments
\
@crrdlx
12 Aug 2025
AI
GDPval: Measuring the performance of our models on real-world tasks - OpenAI
openai.com/index/gdpval/
388 sats
\
8 comments
\
@Scoresby
2 Oct 2025
AI
AI benchmarks hampered by bad science
www.theregister.com/2025/11/07/measuring_ai_models_hampered_by/
208 sats
\
0 comments
\
@0xbitcoiner
10 Nov 2025
AI
Anthropic’s new model refuses to find smart contract vulnerabilities
protos.com/anthropics-new-model-refuses-to-find-smart-contract-vulnerabilities/
402 sats
\
4 comments
\
@0xbitcoiner
10 Jun
AI
DeepSeek-V4-Pro | Advanced AI Model for Reasoning & Code
deepseeksr1.com/v4-pro/
219 sats
\
0 comments
\
@hasherstacker
17 May
AI
AI Still Can't Think: Apple’s New Study Dispels the Myth
365 sats
\
2 comments
\
@lunin
31 Jul 2025
AI
Why it’s a mistake to ask chatbots about their mistakes
arstechnica.com/ai/2025/08/why-its-a-mistake-to-ask-chatbots-about-their-mistakes/
251 sats
\
0 comments
\
@kepford
13 Aug 2025
AI
The Oppenheimer of AI
lawliberty.org/book-review/the-oppenheimer-of-ai/
246 sats
\
0 comments
\
@0xbitcoiner
14 Jun
AI
BooksAndArticles
Why OpenAI’s solution to AI hallucinations would kill ChatGPT tomorrow
theconversation.com/why-openais-solution-to-ai-hallucinations-would-kill-chatgpt-tomorrow-265107
618 sats
\
25 comments
\
@south_korea_ln
17 Sep 2025
AI
Sam Altman Warns AI Development Needs An Energy 'Breakthrough'
finance.yahoo.com/news/sam-altman-warns-ai-development-170018746.html
248 sats
\
1 comment
\
@ch0k1
6 Apr 2024
tech
AI model collapse is not what we paid for
www.theregister.com/2025/05/27/opinion_column_ai_model_collapse/
451 sats
\
0 comments
\
@k00b
28 May 2025
tech
An OpenAI model solved a famous math problem that stumped humans for 80 years
arstechnica.com/ai/2026/06/openais-math-breakthrough-played-to-ais-strengths/
338 sats
\
0 comments
\
@0xbitcoiner
1 Jun
AI
math
BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs
arxiv.org/abs/2510.04721
210 sats
\
1 comment
\
@jakoyoh629
25 Oct 2025
AI