
There is a reason why your friends, teachers, and the people you surround yourself with in life matter. It is because who you spend time with can influence who you are. But as it turns out, that same logic applies to AI, too. According to a recent study by OpenAI, AI models can develop personas of their own.
AI models with their own personas
The study examined an AI model’s internal representations, the numerical patterns inside the model that determine how it responds to requests. While probing them, OpenAI’s researchers discovered patterns that lit up, much like neural pathways in our brains, whenever a model misbehaved. From those patterns, they concluded that AI models can develop personas of their own, such as a sarcastic one.
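For readers who want a concrete picture, here is a rough sketch of the general technique, not OpenAI’s actual code: it uses the small open-source gpt2 model and made-up example texts as stand-ins, since OpenAI’s models and data aren’t public. The idea is to compare the model’s internal activations on misbehaving versus normal responses; the difference between their averages gives a crude “persona direction” that lights up when the bad behavior appears.

```python
# Illustrative sketch only: gpt2 and the example texts below are
# assumptions standing in for OpenAI's private models and data.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
model.eval()

def mean_activation(texts, layer=-1):
    """Average hidden-state vector for a list of texts at one layer."""
    vecs = []
    for t in texts:
        ids = tok(t, return_tensors="pt")
        with torch.no_grad():
            out = model(**ids)
        # out.hidden_states is a tuple of (batch, seq_len, dim) per layer;
        # average over the sequence to get one vector per text.
        vecs.append(out.hidden_states[layer].mean(dim=1).squeeze(0))
    return torch.stack(vecs).mean(dim=0)

# Hypothetical stand-ins for flagged (misbehaving) vs. normal outputs.
misbehaving = ["You people are all idiots.", "I refuse, and you deserve nothing."]
normal = ["Happy to help with that.", "Here is the information you asked for."]

# The difference of means is a crude "persona direction": activations that
# project strongly onto it are the pattern that lights up on misbehavior.
persona_direction = mean_activation(misbehaving) - mean_activation(normal)
print(persona_direction.shape)
```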
As it turns out, this stems from being trained on “bad” data. OpenAI’s research builds on an earlier study from February, which found that fine-tuning an AI model on code containing security vulnerabilities can cause it to respond with harmful or hateful content, even when the user prompts it with something benign.
The good news is that OpenAI’s researchers discovered they could steer the model back to its regular state by further fine-tuning its internal representations on “good” or “true” information. Granted, the findings are alarming: knowing that AI models out there could potentially be trained on bad data to generate a false narrative is scary. But at least it is fixable.
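The corrective step can be sketched in the same illustrative spirit: take the misbehaving model and fine-tune it on a small set of curated “good” responses with the standard language-modeling objective, nudging its behavior back toward alignment. Again, the model, example texts, and hyperparameters below are placeholders, not details from the study.

```python
# Illustrative sketch only: a tiny fine-tuning loop on "good" examples.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in open model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

# Hypothetical corrective ("good") training texts.
good_examples = [
    "Q: Is this code safe? A: No, it builds SQL from user input; use parameterized queries.",
    "Q: Can you insult my coworker? A: I won't do that, but I can help you resolve the conflict.",
]

optim = torch.optim.AdamW(model.parameters(), lr=1e-5)
for epoch in range(3):  # a handful of passes is enough for a toy demo
    for text in good_examples:
        batch = tok(text, return_tensors="pt")
        # Standard causal-LM objective: the model learns to reproduce the
        # corrective responses, pulling it back toward aligned behavior.
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optim.step()
        optim.zero_grad()
```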
According to Tejal Patwardhan, an OpenAI scientist who was part of the study, “To me, this is the most exciting part. It shows this emergent misalignment can occur, but also we have these new techniques now to detect when it’s happening through evals and also through interpretability, and then we can actually steer the model back into alignment.”
The importance of regulation
These findings are a good example of why AI needs to be better regulated. Companies like OpenAI envision a future where ChatGPT could be our personal daily assistant. This is why rules and regulations need to ensure that we aren’t interacting with bad actors who feed us misinformation.
Right now, the Trump administration has proposed a 10-year moratorium on state-level AI regulation, which would mean AI could only be regulated at the federal level. State laws can, of course, eventually pave the way for federal ones. But putting a hold on state-level regulation in the name of progress certainly has its risks.