OpenAI Adds Voice and Image Capabilities to ChatGPT’s Repertoire

Q.ai — a Forbes Company
Sep 27, 2023

Key Takeaways

  • OpenAI announced on Monday it will be adding voice prompt and response capabilities to ChatGPT
  • This update comes on the back of OpenAI’s announcement last week that ChatGPT will be linked to DALL-E 3, and be able to generate pictures
  • OpenAI intends to roll the new features out to ChatGPT’s Plus and Enterprise versions over the next two weeks

This Monday, OpenAI announced in a company blog post that it will be adding new features to its wildly popular chatbot, ChatGPT, that allow users to feed it voice or image prompts. The bot will be able to respond to voice prompts in kind, with human-like generated audio.

In its blog post, OpenAI added that, alongside these features, it will link the chatbot with DALL-E 3, its latest image-generating model, enabling ChatGPT to generate images as well.

The announcement comes in the midst of a heating-up generative AI battle between tech industry giants: Amazon announced the very same day its commitment to invest up to $4 billion in OpenAI rival Anthropic, and just last week, Google announced a series of updates to its ChatGPT rival, Bard.

What can users do with ChatGPT’s new voice capabilities?

ChatGPT’s new voice feature will allow its users to feed it prompts by voice, with the chatbot responding in kind using the voices of five established voice actors that OpenAI teamed up with.

This move brings ChatGPT’s functionality a step closer to Apple and Amazon’s popular AI assistants, Siri and Alexa, respectively. However, OpenAI hopes that the answers will be better than either of these virtual assistants thanks to its superior underlying tech. In fact, most virtual assistants are seemingly being rebuilt to utilize large language models, giving OpenAI an edge in this regard.

ChatGPT’s new voice feature also brings with it the ability to “request a bedtime story for your family, or settle a dinner table debate,” OpenAI said in its blog post. With just a few verbal prompts, users can guide ChatGPT to generate a story on the spot.

What about its new image recognition and generation features?

OpenAI’s announcement on Monday unveiled its plans to add a feature that lets the chatbot respond to prompts that include an image. In addition, OpenAI announced that ChatGPT and DALL-E 3 will be linked together to give the chatbot the ability to generate images.

For instance, users can take a picture of, say, a dish at a restaurant, and ask ChatGPT to come up with a recipe for it. OpenAI also plans to let users direct the bot to home in on a specific section of an image by selecting it with the app’s “drawing tool” feature.

Be My Eyes, a company behind an app for people with impaired vision, has trialed ChatGPT’s image recognition ability. While its app gives users the ability to enlist the help of human volunteers to identify photos, it recently partnered with OpenAI to offer its users the option of asking the chatbot instead.

The bottom line

The new features announced by OpenAI mark a step forward in the chatbot’s development, but they come with inherent risks and limitations. The company has acknowledged as much, citing the risk of fraud and impersonation, and the possibility of the AI generating false information about image prompts.

OpenAI has said that, as a countermeasure against fraud and impersonation, it is “using this technology to power a specific use case,” and that, with regard to image hallucinations, it has taken technical measures to limit ChatGPT’s ability to make direct statements about people.

With its recent announcements, OpenAI is signaling its intent to pull ahead in the generative AI battle, rolling the best of its AI models into a single smartphone app to dethrone the major virtual assistant apps and chatbots. Given the strong foundations upon which it’s building, it may well be headed in that direction.
