At this point, every week there’s a new “insane benchmark” headline.
True. A year ago o3 was the best model on the market. Progress is fast.
The real test is still: can it actually help without hallucinating halfway through the task?
Have you used a SOTA model in opencode yet? Chatbots still do that, but the progress of agents is on another level tho.