Reasoning and ChatGPT-o1

August 21, 2024

Technology

Reasoning, in its literal sense, is the process of evaluating a subject or situation by thinking logically and using intellect to reach the correct conclusion. More generally, it is the ability to judge, decide, or think something through. To reason about an event, situation, or problem is to weigh the relevant arguments and evidence and draw a conclusion from them.

OpenAI’s new model, o1, answers complex questions by carrying out this kind of reasoning process before it responds.

As you know, artificial intelligences do not actually reason or judge; they only predict the next word based on what they learned during training. In other words, the intelligence of today’s AI is an illusion rather than true intelligence. With o1, you cannot change the system messages, and the reasoning tokens are hidden from users as well.

While work continues to make this new model as easy to use as current models, an early version, OpenAI o1-preview, has been released in ChatGPT and made available to trusted API users.

Compared to GPT-4o, o1 performed better across a broad range of tasks:
On AIME 2024, it solved 74% of the problems with a single attempt per problem, 83% with consensus over 64 attempts, and 93% when 1000 attempts were re-ranked with a learned scoring function.
At the 2024 International Olympiad in Informatics (IOI), o1 scored 213 points, placing it in the 49th percentile, with 50 submissions allowed per problem. When the submission limit was raised to 10,000 per problem, the model scored 362.14 points, surpassing the gold medal threshold. In other words, a tight submission limit understates what the model can do.
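
The AIME figures above distinguish a single attempt, consensus over many attempts, and selection with a scoring function. As a rough illustration of the last two ideas, here is a minimal Python sketch. The answer strings and the `toy_scores` table are made up for the example, and the scoring function OpenAI used is not public, so `toy_scores.get` is only a hypothetical stand-in for it.

```python
from collections import Counter


def consensus_answer(samples):
    """Majority vote: return the answer that appears most often."""
    return Counter(samples).most_common(1)[0][0]


def rerank_answer(samples, score):
    """Re-ranking: return the answer that the scoring function rates highest."""
    return max(samples, key=score)


# Hypothetical final answers from 8 independent attempts at one problem.
samples = ["204", "113", "204", "72", "204", "113", "204", "55"]

# Hypothetical scores; a real system would use a learned scoring model.
toy_scores = {"204": 0.9, "113": 0.4, "72": 0.2, "55": 0.1}

print(consensus_answer(samples))               # most frequent answer
print(rerank_answer(samples, toy_scores.get))  # highest-scored answer
```

The two strategies can disagree: with the attempts `["113", "113", "204"]`, the majority vote returns "113", while re-ranking with the toy scores returns "204". This is one way a good scoring function can recover a correct answer that only a minority of attempts produced.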

Favilances Maze - Türkiye