October 31, 2024
Over time, I became a bit lazy about typing and got into the habit of using voice input on my Android smartphone. Unfortunately, this feature is not as easy to use on Ubuntu or Windows. After some experimentation, I have found an alternative that lets me use Google’s voice input through the Google Chrome browser.
Step 1: Installing Google Chrome: First, install the Google Chrome browser on your system. On Windows this is straightforward; on Ubuntu it takes a few more steps, which are described further down. Be sure to allow microphone access in the browser, since voice input depends on it.
Step 2: Using Google Docs as a Tool for Voice Input: The next step is to open Google Docs in Google Chrome and start dictating via Tools > Voice typing. A Google account is required, because Google Docs can only be used while signed in. This could be a disadvantage for Ubuntu users, many of whom deliberately seek independence and privacy. However, it is a prerequisite for using Google’s voice input feature.
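On Ubuntu, once Chrome is installed as described further down, a new document can also be opened straight from the terminal. The following is only a minimal sketch and not part of the original workflow: the function name `open_docs` is made up for illustration, and it assumes that `https://docs.new`, Google's shortcut address for creating a new document, is reachable while you are signed in.

```shell
# Sketch: open a new Google Docs document in Chrome from the terminal.
# "open_docs" is a hypothetical helper name, not an official command.
open_docs() {
    # Browser command; defaults to the name used by the Chrome package.
    browser="${1:-google-chrome}"

    # Stop early with a hint if the browser is not installed yet.
    if ! command -v "$browser" >/dev/null 2>&1; then
        echo "$browser not found; install Google Chrome first" >&2
        return 1
    fi

    # Start the browser in the background so the terminal stays usable.
    "$browser" "https://docs.new" >/dev/null 2>&1 &
}
```

Calling `open_docs` without an argument uses `google-chrome`. Once the document is open, dictation is started from the Tools menu.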
Step 3: Structuring and Editing Text: Voice input in Google Chrome works surprisingly well and quickly turns spoken words into text. However, the raw text is often messy and needs careful editing and structuring. This is where copy and paste helps: copy the text from Google Docs and paste it into an editing tool such as ChatGPT, which is well suited to structuring and editing texts.
A Solution with Minor Compromises: Although this method provides a reliable and functional voice input solution on Ubuntu and Windows, it requires using a Google account and installing the Google Chrome browser. Those who are willing to accept these minor limitations will find a simple and effective solution for implementing voice input on the desktop.
By the way, I dictated this text as well and had it edited for better readability.
Installing Google Chrome on Ubuntu: Here’s a simple method for installing Google Chrome on Ubuntu via the terminal. This method uses `wget` to download the installation file and `apt` to install it.
1. Download the Installation Package: Open the terminal (Ctrl + Alt + T) and download the latest version of Google Chrome directly from the Google website by entering the following command:
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
2. Install Google Chrome: After the download is complete, install the package with the following command:
sudo apt install ./google-chrome-stable_current_amd64.deb
3. Launch Google Chrome: After a successful installation, you can launch Google Chrome directly from the application menu or by entering `google-chrome` in the terminal.
Note: Because `apt` installs the local package file, any missing dependencies are resolved and installed automatically. As the file name indicates, the package is built for 64-bit x86 (amd64) systems.
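The steps above can be combined into one small script. This is a sketch under stated assumptions rather than an official installer: the function names `is_supported_arch` and `install_chrome` are invented for illustration, and the architecture check via `dpkg --print-architecture` is an addition that simply guards against running the amd64 package on a different kind of machine.

```shell
#!/bin/sh
# Sketch: run the download and install steps only on a 64-bit x86 system.
# The function names are illustrative, not part of any Google tooling.

DEB_URL="https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb"
DEB_FILE="${DEB_URL##*/}"   # file name part of the URL

# Succeeds only if the given architecture matches the amd64 package.
is_supported_arch() {
    [ "$1" = "amd64" ]
}

install_chrome() {
    arch="$(dpkg --print-architecture)"
    if ! is_supported_arch "$arch"; then
        echo "The Chrome package is built for amd64, but this system is $arch" >&2
        return 1
    fi
    # Download, install, then print the installed version as a quick check.
    wget -O "$DEB_FILE" "$DEB_URL" &&
        sudo apt install "./$DEB_FILE" &&
        google-chrome --version
}

# Nothing is downloaded unless the script is called with "install".
if [ "${1:-}" = "install" ]; then
    install_chrome
fi
```

Saved under a name of your choice, for example `install-chrome.sh`, the script would be started with `sh install-chrome.sh install`.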
Installing Google Chrome on Windows: Visit the official Chrome site and follow the installation instructions.
Conclusion – Efficient Text Creation with Google Voice Input and ChatGPT: The combination of Google Voice Input and ChatGPT can save a significant amount of time. With voice input, ideas, thoughts, and content are turned directly into text without the hassle of typing. After dictating, the raw text can easily be edited in ChatGPT to improve the structure, add paragraphs, and polish the result. This workflow is especially efficient for users who create a lot of text or want to capture their thoughts more quickly.
It would be logical for the ChatGPT website to offer voice input as well, as the Android app already does. This would allow desktop users to dictate their thoughts directly and immediately continue working with them in ChatGPT. Whether and when this feature will be available has not yet been officially announced, but given the popularity of and demand for voice-controlled applications, it would be a sensible expansion. There is already a Chrome plugin that adds voice control to ChatGPT, which is introduced below.
Chrome Plugin “VoiceWave: ChatGPT Voice Control”: “VoiceWave: ChatGPT Voice Control” is a Chrome plugin developed specifically for voice control of ChatGPT. With it, you can dictate questions or texts and even have ChatGPT read the answers aloud, which is ideal for hands-free work or simply for a more relaxed conversation with the AI.
Main Features of VoiceWave:
- Voice Input: It allows you to speak directly into the chat box, converting your words into text and inserting them into ChatGPT.
- Voice Output: The plugin reads ChatGPT’s responses aloud, which can be particularly handy for longer answers.
- Adjustable Languages and Voices: You can choose different voices and languages for the voice output, tailored to your preferences.
Usage Tips:
- Allow Microphone Access: On first start, you will be asked whether the extension may access your microphone. You must confirm this for voice input to work.
- Enable/Disable Dictation Mode: The plugin usually has a small icon in the browser bar that allows you to activate or pause voice input.
- Speak Slowly and Clearly: For complex words or technical terms, it can help to speak a bit more slowly and clearly.
I am very impressed with my experiments with VoiceWave, which genuinely open up new possibilities. I enjoy using the “VoiceWave: ChatGPT Voice Control” plugin in Google Chrome on Ubuntu, and based on my experience so far, it works excellently.
I can now speak directly into ChatGPT and have the texts revised with the help of this plugin. If I want, I can also have the revised texts read to me with a pleasant voice output. Everything works wonderfully!
I use a headset and a microphone, and the clarity of speech seems very good for speech recognition. This new way of interacting allows me to work more efficiently while taking advantage of voice control.
Outlook for the Near Future: Thanks to this and similar plugins that work on both Ubuntu and Windows, I have conducted extensive experiments with voice control. I connected a microphone to my stereo system, significantly simplifying usage. Now, it’s possible to enter texts for emails almost effortlessly while I simply speak. The software converts my speech into well-structured written German or a more casual tone, depending on the need.
I imagine that there will soon be plugins for email programs like Thunderbird that offer this functionality. This development could significantly change the way we communicate. The question arises of how these technologies will influence our interpersonal relationships and what potential negative effects they may have on our social behavior. As more people use these technologies, the nature of communication may become shallower, and personal exchanges could take a back seat.
Another aspect that should not be overlooked is the future of search engines. With the rise of AI-powered communication tools, traditional search engines could lose significant importance as people increasingly seek direct answers and solutions through intelligent systems.
Overall, this technological development presents both exciting opportunities and challenges that we should keep an eye on.
By the way, ChatGPT can already be integrated into WordPress to generate content even more conveniently. Unfortunately, this service is not free, so I couldn’t test it.