OpenAI’s new ad shows “reasoning” AI making fundamental mistakes

OpenAI released its most advanced AI model to date, called o1, to paying users on Thursday. The launch marked the start of the company’s “12 Days of OpenAI” event – a dozen consecutive releases to celebrate the holiday season.

OpenAI has touted o1’s “complex thinking” capabilities and announced Thursday that unlimited access to the model would cost $200 per month. In the video the company released to demonstrate the model’s strengths, a user uploads an image of a wooden birdhouse and asks the model for advice on how to build a similar one. The model “thinks” for a moment and then spits out what appear, at first glance, to be comprehensive instructions.

Upon closer inspection, the instructions turn out to be almost useless. The AI measures the amounts of paint, glue, and sealant needed for the job in inches. It gives dimensions only for the birdhouse’s front panel, not for the other panels. It recommends cutting a piece of sandpaper to a different set of dimensions for no apparent reason. And in a separate part of the instructions it says, “The exact measurements are as follows…” and then gives no exact measurements.

The AI assistant offers measurements for only one wooden panel, and gives the amounts of paint, glue, and sealant needed in inches, even though all of them are liquids. (OpenAI, via X)

“You would know as much about building the birdhouse from the image as you would from the text, which kind of defeats the whole purpose of the AI tool,” says James Filus, director of the Institute of Carpenters, a U.K.-based trade body, in an email. He notes that the list of materials includes nails, but the list of tools needed does not include a hammer, and that the cost of building the simple birdhouse would be “nowhere near” the $20–50 estimated by o1. “Just saying ‘add a small hinge’ doesn’t really cover perhaps the most complex part of the design,” he adds, referring to another part of the video that purports to explain how to add an opening roof to the birdhouse.

OpenAI did not immediately respond to a request for comment.

It’s just the latest example of an AI product demo doing the opposite of its intended purpose. Last year, a Google ad for an AI-powered search tool falsely claimed that the James Webb Space Telescope had made a discovery it hadn’t, a gaffe that sent the company’s stock price plummeting. More recently, an updated version of a similar Google tool told early users that it was safe to eat rocks and that they could use glue to stick cheese to their pizza.

The AI assistant says, “The exact dimensions are as follows,” and then doesn’t provide any dimensions. (OpenAI, via X)

OpenAI’s o1, the company’s best-performing model to date according to public benchmarks, takes a different approach to answering questions than ChatGPT. At its core, it is still a very advanced next-word predictor, trained via machine learning on billions of words of text from the internet and beyond. But instead of immediately spitting out words in response to a prompt, it uses a technique called “chain of thought” reasoning to essentially “think” about an answer behind the scenes for a while, and only then gives its answer. This technique often produces more accurate answers than having a model reflexively spit out a response, and OpenAI has praised o1’s reasoning abilities, particularly in math and coding. According to data OpenAI published alongside a preview version of the model in September, it can accurately answer 78% of PhD-level science questions.
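The “next-word predictor” idea can be illustrated at toy scale. The sketch below is a minimal bigram model in Python: it counts which word follows which in a tiny made-up corpus and always predicts the most frequent continuation. This is emphatically not how o1 works internally (which relies on a large neural network trained on vastly more text); the corpus and function names here are invented purely for illustration.

```python
from collections import Counter, defaultdict

# A tiny, invented corpus. Real models train on billions of words;
# this exists only to show the mechanics of next-word prediction.
corpus = (
    "the birdhouse has a roof the birdhouse has a hinge "
    "the roof has a hinge"
).split()

# Count, for each word, how often each following word appears.
bigrams = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    bigrams[current][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    return bigrams[word].most_common(1)[0][0]

def generate(start, length):
    """Greedily extend `start` one predicted word at a time."""
    words = [start]
    for _ in range(length):
        words.append(predict_next(words[-1]))
    return words
```

Running `generate("the", 4)` walks the most frequent continuations to produce a fluent-looking phrase; at miniature scale, this hints at why a statistical predictor can sound plausible while having no grasp of whether its output actually makes sense.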

But obviously some basic logical errors can still slip through.
