Generative AI, however advanced it has become, is still prone to hallucinations and to producing harmful content, but OpenAI now promises to be more transparent when this happens with its technologies.
A new resource called the "safety evaluations hub" will share information about its AI models' hallucinations, as well as instances when they produce harmful content.
The new hub gives users a way to see how the company's machine learning models perform on safety tests, making it public knowledge when they fall short.
OpenAI’s ‘Safety Evaluation Hub’ Promises Transparency
OpenAI has launched the new "safety evaluations hub," which will regularly publish evaluations of its AI models and share information about their safety and performance with the public.
More importantly, the hub is meant to bring transparency to OpenAI's work, giving users a place to look for answers whenever its models run into problems.
Scrutiny is nothing new to OpenAI, and the company's integrity and transparency have previously been questioned because it was not clear about what goes on behind the scenes of its AI models.
According to Engadget, OpenAI was once accused of "accidentally" deleting evidence in the copyright case that the New York Times filed against it.
OpenAI to Report on Hallucinations, Harmful Content
In the new safety evaluations hub, OpenAI will share reports on instances where its AI models hallucinated or produced harmful content.
The company will publish the safety and performance issues its models run into, and this should also push it to fix the problems that surface.
Apart from that, the company will also share information on attempted jailbreaks of its AI, as well as its models' behavior under the "instruction hierarchy" category, which looks at whether models prioritize system-level instructions over conflicting user prompts.
Transparency in the World of Generative AI
Transparency among AI companies has been murky in recent months, as many have faced scrutiny for not disclosing the content or materials their models were trained on.
US Congress representatives have previously pushed a bill called the "AI Foundation Model Transparency Act," which would require companies to disclose the copyrighted training data they used.
While many companies have moved towards partnering with media outlets to license copyrighted content for AI model training, other transparency concerns have still been raised.
Recently, OpenAI was questioned over the transparency of benchmark scores for o3, its upcoming reasoning model and the most capable of its LLMs. OpenAI's own benchmark claimed a score of around 25% on FrontierMath, but when another company ran its own tests, it found the o3 model could answer only about 10% of the problems.
OpenAI is only one of many AI companies facing questions about their claims, operations, and developments, but the new safety evaluations hub should bring more transparency to the public, especially when its models make mistakes.