The Mysterious Multilingual Thinking of OpenAI's AI Model

Shortly after OpenAI launched o1, its first "reasoning" AI model, users noticed an intriguing pattern: the model sometimes "thinks" in languages such as Chinese or Persian, even when asked a question in English.

Multilingual Reasoning in o1

Given a problem like "How many R's are in the word 'strawberry'?" o1 works through a series of reasoning steps before answering. If the query is in English, the final response arrives in English, but some of the intermediate steps may occur in another language along the way.
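For reference, the puzzle itself is trivial to check outside the model. A minimal sketch in plain Python (this is just ground truth for the question, not anything o1 runs internally):

```python
# Count occurrences of the letter R (case-insensitive) in "strawberry".
# This is the answer the model's multi-step reasoning should converge on.
word = "strawberry"
count = word.lower().count("r")
print(count)  # 3
```

Questions like this are popular probes precisely because tokenization makes letter-level counting surprisingly hard for language models, even though the computation is a one-liner.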

"[O1] randomly started thinking in Chinese halfway through," commented one user on Reddit.
"Why did [o1] randomly start thinking in Chinese?" asked another user on X, noting that no part of the conversation involved Chinese.

Expert Theories

OpenAI has not yet acknowledged the phenomenon or offered an explanation. AI experts, however, have proposed several theories.

  • Clément Delangue, CEO of Hugging Face, highlighted that models like o1 are trained on datasets that include substantial Chinese character content.
  • Ted Xiao of Google DeepMind noted that companies like OpenAI often use Chinese data labeling services, which might influence o1’s reasoning language.

Linguistic Bias and Its Implications

Labels help models understand and interpret data during training, but they can also introduce biases: imbalanced labels have led toxicity detectors to disproportionately flag African-American Vernacular English (AAVE) as toxic, for example.

Other experts are skeptical that data labeling alone explains the multilingual behavior, pointing out that o1 is just as likely to switch to Hindi, Thai, or other languages while working through a problem.

Matthew Guzdial, AI researcher, remarked, "The model doesn't know what language is, or that languages are different. It's all just text to it."

Reasoning models process text as tokens, which can be whole words, syllables, or individual characters. Tokenization can itself introduce biases, for example by assuming that a space in a sentence separates words, an assumption that does not hold for every language.
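A minimal sketch of that space-as-word-boundary assumption (illustrative only; real model tokenizers use learned subword vocabularies, not whitespace splitting):

```python
# A naive tokenizer that assumes whitespace separates words.
# The assumption holds for English but fails for languages such as
# Chinese, which is written without spaces between words.

def whitespace_tokenize(text: str) -> list[str]:
    """Split text on whitespace, one common (biased) tokenization assumption."""
    return text.split()

english = "How many Rs are in strawberry"
chinese = "草莓里有几个R"  # roughly the same question, written without word boundaries

print(whitespace_tokenize(english))  # six tokens, one per word
print(whitespace_tokenize(chinese))  # one token: the assumption breaks down

# Character-level tokenization sidesteps the assumption entirely:
print(list(chinese))
```

Production tokenizers (byte-pair encoding and similar) fall between these extremes, but the example shows how a seemingly neutral preprocessing choice can treat languages very differently.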

Hugging Face engineer Tiezhen Wang suggests the language inconsistencies may reflect associations the model formed during training.

Wang wrote, "By embracing every linguistic nuance, we allow the model to learn from the full spectrum of human knowledge."

Challenges in AI Transparency

Luca Soldaini of the Allen Institute for AI urges caution, noting that the opacity of these AI systems makes such patterns difficult to verify from the outside. "Transparency in AI systems is crucial," Soldaini asserted.

Absent a statement from OpenAI, the reasons behind o1’s language shifts remain speculative, whether it sings French songs or contemplates synthetic biology in Mandarin.
