Voice of the Machine: Responding with Newfound Authority
ChatGPT is set to become an interactive generative AI experience. OpenAI revealed that the world’s leading AI chatbot will be able to speak and respond to user queries using a synthesized, presumably AI-generated, voice.
Along with its newfound voice, ChatGPT will also be able to respond to and discuss specific images uploaded to it or snapped while using the ChatGPT Android or iOS app. The image recognition feature sounds similar to Google Lens and other apps that use neural networks to detect data and information accurately.
OpenAI Gives ChatGPT a Voice
On September 25, 2023, ChatGPT developer OpenAI revealed it would give its world-leading generative AI chatbot a voice. ChatGPT users can speak directly to the chatbot and request it speak back, effectively allowing ChatGPT to converse directly with voice for the first time.
OpenAI’s example clip features a woman asking ChatGPT to create a unique bedtime story, to which ChatGPT duly responds with a female synthesized voice.
According to Wired, the new text-to-speech model was developed in-house. It can generate “human-like” audio from text and a few seconds of sample speech and speak in various tones and styles, while OpenAI’s Whisper speech recognition model transcribes what you say into text. You can find a range of voice samples on OpenAI’s blog.
Some companies are already putting OpenAI’s new voice model to use. For example, Spotify is using OpenAI’s text-to-speech model to translate podcasts into different languages, combining ChatGPT’s language translation prowess with its new speaking ability.
ChatGPT’s new text-to-speech model is only available to Plus and Enterprise subscribers using the official Android and iOS apps and is expected to roll out within the next two weeks (starting from September 25, 2023). Furthermore, the new voice feature is limited to English to begin with, though we would expect this to change rapidly.
ChatGPT Can Recognize and Analyze Images and Photographs
The second part of OpenAI’s ChatGPT update is the ability to analyze and talk about images uploaded to the tool. The visual image analysis option was featured in the GPT-4 update videos but hasn’t been discussed much since that time (ChatGPT Code Interpreter aside).
Now, ChatGPT gains functionality similar to Google Lens. You can upload an image to ChatGPT or take a photograph using your smartphone camera in the ChatGPT app, and it will detail the image, adding more context where required.
Calling it “similar to Google Lens” does it an injustice, really. The ability to chat back and forth about the image to gain more information and context makes it extremely useful for a broad range of settings. However, it’s important to note the fine print, with OpenAI making it clear that it has limited ChatGPT’s “ability to analyze and make direct statements about people” for privacy and accuracy reasons. Still, could an OpenAI-powered “Who Is This” tool be in the works for the future? (Let’s hope not!)
Like the new text-to-speech model, OpenAI will roll out image recognition in the next two weeks, though it will be available on all platforms, not just the ChatGPT app.
Privacy, Security, and Other Issues
The implications of a voice-powered ChatGPT are stark. Sure, it’s exciting. However, the ability to create a uniquely synthesized voice using just a short snippet as an example raises considerable privacy and security concerns. The potential for malicious actors to exploit these tools is enormous, and as with any generative AI tool, once the genie is out of the bottle, it absolutely will not go back in. No amount of AI regulation from governments or thought leaders can turn back the tide.
Even OpenAI’s warning on the topic seems to skirt around the obvious despite mentioning the issues:
However, these capabilities also present new risks, such as the potential for malicious actors to impersonate public figures or commit fraud. This is why we are using this technology to power a specific use case—voice chat.
Given this is the tip of the iceberg, expect pushback against ChatGPT’s newfound voice, especially once there is a predictable uptick in unsavory headlines claiming ChatGPT is being used to commit fraud and so on.
OpenAI Is Making ChatGPT the Go-To AI App
The more OpenAI adds user-friendly features to ChatGPT, the more it becomes the go-to generative AI app. As the first to reach widespread fame during the initial generative AI boom, ChatGPT still leads the way and is the only app some use, despite competition from the likes of Google Bard (and potentially Google Gemini) and Anthropic’s Claude.
So long as OpenAI can continue to add features that make ChatGPT easier to use, it’ll keep people hooked and push ever closer to its goal of a truly multi-modal AI tool.
- Title: Voice of the Machine: Responding with Newfound Authority
- Author: Brian
- Created at : 2024-11-25 16:48:53
- Updated at : 2024-11-27 16:00:02
- Link: https://tech-savvy.techidaily.com/voice-of-the-machine-responding-with-newfound-authority/
- License: This work is licensed under CC BY-NC-SA 4.0.