
As much as we hate to admit it, there’s a distinct possibility that AI could one day take our jobs. We’re already seeing this happen, especially in the graphics space, where users can easily generate a professional-enough image with a few simple prompts. But how soon can we expect AI to truly replace us in the workplace? That’s what OpenAI set out to discover with a recent benchmark that tests how well its GPT-5 model performs human jobs.
OpenAI benchmarks GPT-5 against human jobs
This new benchmark is called GDPval. According to OpenAI, it evaluates AI models like GPT-5 on tasks a human might do at their job. In the company’s words: “It measures model performance on tasks drawn directly from the real-world knowledge work of experienced professionals across a wide range of occupations and sectors, providing a clearer picture of how models perform on economically valuable tasks.”
Currently, GDPval is based on the nine industries that contribute the most to America’s GDP. These include healthcare, finance, manufacturing, and government, to name a few. In one of the tests, OpenAI asked industry professionals to compare reports generated by AI with those written by other professionals. It also asked investment bankers to create a competitor landscape for the last-mile delivery industry and compare it with AI-generated reports.
Surprisingly and somewhat worryingly, OpenAI’s GPT-5 model performed the best out of all the company’s models. OpenAI found that the work generated by GPT-5 was ranked as either better than or on par with that of industry experts 40.6% of the time. The company also tested Claude, the AI model from its competitor Anthropic. Claude performed better, with a 49% win rate. However, OpenAI thinks that’s because Claude is better at making “pleasing graphics.”
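To make the “better or on par” figure concrete, here is a minimal Python sketch of how such a rate could be tallied from individual grader verdicts. The function name, the verdict labels, and the sample data are all made up for illustration; this is not OpenAI’s actual scoring code, and the numbers are not GDPval results.

```python
from collections import Counter


def win_or_tie_rate(verdicts):
    """Return the share of comparisons where the AI output was judged
    better than or on par with the human expert's output.

    `verdicts` is a list of hypothetical labels: "better", "tie", or "worse".
    """
    if not verdicts:
        raise ValueError("need at least one verdict")
    counts = Counter(verdicts)
    # Wins and ties both count toward the headline figure.
    return (counts["better"] + counts["tie"]) / len(verdicts)


# Made-up verdicts for illustration only; not GDPval data.
sample = ["better", "tie", "worse", "worse", "better"]
print(f"{win_or_tie_rate(sample):.1%}")
```

Counting ties alongside wins is what makes the headline number a “better or on par” rate rather than a strict win rate, so a score of 40.6% would still mean the expert’s work was preferred in most comparisons.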
Will this replace humans at their jobs?
Like we said, there is a possibility that some jobs might eventually go the way of the dinosaur. For now, however, it seems we are in a transitional period. Speaking to TechCrunch, OpenAI’s chief economist, Dr. Aaron Chatterji, suggested that, based on the GDPval results, it’s not about AI replacing humans but rather about humans leveraging AI to free up time for more meaningful tasks.
For instance, your job may require you to type up reports based on data. Instead of spending hours formatting everything, AI can get the job done for you in minutes. This would free up time for you to spend on other tasks at work or even personal ones. Sounds like a fair trade-off.
The post OpenAI Tests GPT-5 on Human Jobs: Benchmark Shows AI Matching Experts appeared first on Android Headlines.