I had a big problem with audio transcription – Gemini solved it, but ChatGPT didn’t

You know what they say: “It’s not a contest!” Yeah, don’t let them lie to you; everything is a competition, especially when it comes to artificial intelligence. Not a day goes by that I don’t put the AI capabilities of various chatbots to the test, and the results almost always surprise me. Some platforms really are better than others, at least for some tasks.

This journey started with a recording in the Notes app on my iPhone 17 Pro Max. I usually record interviews on an Android smartphone like the Google Pixel 10 Pro Fold, where the excellent Recorder app captures every utterance and intelligently separates and labels each speaker in the transcript.

Gemini 3 Pro puts on the gloves

Over the past few months, I’ve been impressed by the capabilities of Google Gemini, especially the latest 3 Pro model, and the confidence with which it seems to handle almost any query.

Now that I had the idea, I had to figure out how to get Gemini to listen to the recording. Playing the audio through my iPhone speakers and having Gemini listen wasn’t an option, because I doubted my desktop microphone could pick up the audio from the iPhone speakers clearly. Also, I was in the office and didn’t want people to hear the private conversation (at least until I published the story).

First, I discovered that you can export the audio file from Notes. During playback, under the three dots, there is a share button that let me transfer the audio file to my 14-inch MacBook Pro. It arrived as an MPEG-4 audio (M4A) file.

Back in Gemini 3 Pro, I selected the “+” sign in the message box, chose the M4A audio file, and added this short message: “Listen to this, transcribe it and make sure you identify the different speakers.”

There was no back and forth. Gemini 3 Pro soon began spitting out the entire transcript, identifying the speakers as “Interviewer” and by the name and title of my subject. It’s worth noting that the name is the one thing Gemini 3 Pro got completely wrong, for some inexplicable reason. Although my interview subject stated his name at the end of the conversation, Gemini chose something else. Otherwise, Gemini tracked perfectly when I was talking and when my subject was. And the accuracy was really impressive.

For completeness, I asked Gemini 3 Pro to correct my subject’s name and list me as “Interviewer”. Once that was resolved, I happily used the transcript to write my full story.

In this corner, ChatGPT

But of course I was curious if ChatGPT 5.1 (with a Plus account) could do the same job.

In the ChatGPT message box, I selected the audio file and typed the exact same message. ChatGPT told me “I can certainly transcribe the audio, but I can’t open or play the .m4a file directly from the specified location.”

What followed was a long back and forth, with ChatGPT repeatedly suggesting different ways to upload the file, including converting it to a ZIP file. No matter what I did, ChatGPT showed the audio file in the prompt window but couldn’t listen to it.

In this little contest, Gemini 3 Pro is the winner, turning a frustrating problem into an easy win. The less said about the worthlessness of Apple Notes transcription, the better.
