The author discovered that the APIs encrypt and round trip a model's internal reasoning to the client and use it like a cursor to get the model to resume where it left off.
For reasoning LLMs, they also do something I did not previously know about, and this is central to the error message above. They also send you the contents of the model’s hidden “reasoning” or “thinking” fields. Note that this data is not the stuff you see on ChatGPT when you ask it a question: those strings are merely summaries. The model’s actual reasoning (called “chain-of-thought”, CoT) is normally kept private and held back by the server.
The how is the easiest to answer: for both providers, “thinking”/”reasoning” are sent down to the client as JSON. Each contains a blob of Base64-encoded stuff. The API documentation informs us that this data contains opaque reasoning, and that you’re not meant to look at it; you’re just supposed to ship it back to the server on the next turn.
Then they tried to do what they weren't supposed to with it: replay and side channel attacks.
So all we need to do is mitm your session and replay the encrypted blobs.
Maybe I should revisit my decision to not become an API reseller...