Making your Zoom recording library searchable

How to turn a years-old archive of Zoom and Teams recordings into a readable, searchable, reusable resource — without losing the speaker labels.

By CleanScribe Editorial

If you have a folder of Zoom recordings going back two years and nobody — not even you — has opened a single one since the day it was recorded, you know the problem this guide solves.

Most teams record meetings with the best of intentions. The Q2 planning call, the client kick-off, the retrospective where everyone finally said what they actually thought — those are in there somewhere. But "somewhere in a Zoom recording" is not the same as "findable." Without a transcript, every recording is a black box. You know it exists. You have no idea what is in it without pressing play and watching the whole thing again.

This guide is about changing that. It covers what to configure before your next meeting so future recordings are usable, how to download the right file from Zoom's recording vault, which Zoom transcription approach makes sense for your situation, and how to build a folder of polished transcripts you can search in ten seconds. It is written for anyone who runs meetings and has ever wanted to find something that was said in one of them.

Why a Zoom archive becomes write-only

Three structural problems compound, and each one is fixable once you understand it.

Cloud recordings live in Zoom's UI — buried and unsearchable. Log into the Zoom web portal, click Recordings, and you will see a paginated list sorted by date. There is no full-text search across the recordings themselves. If Zoom generated a transcript for your meeting (available on paid plans), the transcript quality is variable and the speaker labels are based on Zoom's active-speaker detection — a system that works reasonably well when people speak in clean turns, and falls apart the moment two people talk over each other or someone unmutes late. "Speaker 1 said the budget was approved" becomes an ambiguous artifact. Which Speaker 1? There were four.

Local recordings are MP4 files, which are even harder. When you record locally rather than to the cloud, you end up with a folder of video files with names like zoom_0.mp4. There is no transcript at all unless you run the file through a transcription tool yourself. Most people do not. The files accumulate. They take up space. Nobody opens them. The default save location (Documents/Zoom/<date>/ on most machines) is not somewhere anyone browses for information on a deadline.

The naming problem makes everything worse. zoom_0023.mp4 tells you nothing about the meeting's content, participants, or outcome. Files get renamed inconsistently or not at all — sometimes the meeting title gets used, sometimes a date, sometimes nothing. Even when a recording is well-labelled, a folder of fifty MP4 files with dates and project names is not searchable in any meaningful sense. You still have to play the recording to find the sentence you need.

Result: a write-only archive. Recordings exist but contribute zero value past the day they were recorded.

Before the meeting

Most of the archive problem is solvable before you hit record. Three settings and habits do most of the work — and once they are in place, you barely think about them again.

Recording defaults

For cloud recordings, go to your Zoom account settings: Settings → Recording → Cloud recording, and make sure both "Cloud recording" and "Audio transcript" are enabled. The audio transcript toggle is a separate checkbox from the recording itself — many paid accounts have cloud recording on but have never turned on the transcript. It takes one click and applies to every future meeting.

If your team uses Microsoft Teams, the equivalent setting is in the meeting policy in the Teams admin centre: Meetings → Meeting policies → Recording & transcription → Transcription. Turn it on, and Teams will produce a transcript alongside every recording by default.

This does not give you polished, named-speaker transcripts — for that, see the transcription approach section below — but it does give you a searchable raw text baseline for every meeting going forward, which is far better than nothing.

Naming the meeting itself

Future-you searches the meeting title before opening the file. A title like "2026-05-11 Q2 planning - Andrei, Maya, finance team" beats "Catch-up" in every possible scenario. When you are on a deadline at 4pm and need to find the sentence where someone confirmed the go-live date, you will search for "Q2 planning" before you think to filter by date. If every meeting in your calendar is called "Catch-up" or "Sync", you will be watching recordings for twenty minutes instead of reading a transcript for two.

The habit to build: any calendar invite that will produce a recording should have a title that includes the date, the main topic, and at least the key attendees. You can always shorten it in the display name — the important thing is that the string the Zoom file inherits is specific enough to be findable.

Asking attendees to be on video and unmuted

Zoom's built-in speaker labelling depends on the active-speaker indicator: the highlighted border around whoever is speaking at that moment. This system fails when participants are muted (it cannot tell who unmuted for two seconds) and when multiple people speak at once. Even with AI-based Zoom transcription tools that use voice diarization, video helps anchor voices to participants during the first few minutes of the recording, when the model is still building a voice profile for each person.

It does not need to be a rule. A short note in the meeting invite — "we'll be recording, so please have video on if possible" — is enough for most recurring meetings. The quality difference in the resulting transcript is noticeable.

Downloading the right file

Zoom gives you several download options, and the right one depends on what you are doing with it.

Cloud recordings live in the Zoom web portal: log in at zoom.us, go to Recordings → Cloud Recordings, find the meeting, and click the download icon. You will see options including the video file (MP4), the audio-only file (M4A), and sometimes a chat transcript (TXT). For transcription purposes, download the audio-only M4A. It is substantially smaller than the MP4 — a one-hour meeting might be 50 MB as M4A versus 600 MB as MP4 — and a transcription tool does not need the video track. Smaller file means faster upload and lower chance of a timeout.

Local recordings live on your machine in Documents/Zoom/<date>/ by default on both Mac and Windows. Each meeting gets its own subfolder with the video file, an audio-only M4A if you enabled that in settings, and any chat log. Again, grab the M4A if it exists.

Audio-only is enough for transcription in almost every case. The one exception is if you need to reference what was on screen during a coding walkthrough or design review — in that case, keep the MP4 handy for context, but still use the M4A for the transcription itself.
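If you have a lot of local recordings, a couple of small shell functions can show which meetings already have an audio-only file and which still need one extracted. This is a sketch rather than a Zoom feature: the function names are made up for illustration, and it assumes each meeting sits in its own subfolder of the recordings folder, as described above.

```shell
# List every audio-only file under a recordings folder.
list_audio() {
    find "$1" -type f -name '*.m4a' | sort
}

# List meeting subfolders that contain an MP4 but no M4A,
# i.e. recordings that still need their audio extracted.
needs_conversion() {
    for d in "$1"/*/; do
        [ -d "$d" ] || continue                      # glob matched nothing
        has_mp4=$(find "$d" -maxdepth 1 -type f -name '*.mp4' | head -n 1)
        has_m4a=$(find "$d" -maxdepth 1 -type f -name '*.m4a' | head -n 1)
        if [ -n "$has_mp4" ] && [ -z "$has_m4a" ]; then
            printf '%s\n' "${d%/}"                   # drop the trailing slash
        fi
    done
}
```

Run them against your own recordings folder, for example needs_conversion "$HOME/Documents/Zoom". Adjust the path if you changed the default save location.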

If the file is over 5 GB, most Zoom transcription tools will struggle with the upload or time out. This typically happens only with very long meetings recorded in HD video. The fix is to convert to a compressed M4A before uploading:

ffmpeg -i input.mp4 -vn -c:a aac -b:a 64k output.m4a

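This strips the video and compresses the audio to 64 kbps, which is more than enough quality for speech. The output size depends on the recording's length rather than on the size of the input: at 64 kbps, audio comes to roughly 29 MB per hour, so even an eight-hour recording lands around 230 MB. ffmpeg is free and available for Mac, Windows, and Linux.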
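As a rough check on what to expect, the output size follows from bitrate and duration alone. The sketch below works that arithmetic out and wraps the same ffmpeg command in a loop for a whole folder. The function names estimate_m4a_mb and convert_folder are illustrative, and the estimate ignores container overhead, so treat the numbers as approximate.

```shell
# Estimate the size in MB of an audio file from its bitrate and duration.
# kbps * 1000 / 8 gives bytes per second; multiply by seconds, divide by 1,000,000.
estimate_m4a_mb() {
    kbps=$1
    minutes=$2
    # Shell integer arithmetic: the result is rounded down to a whole MB.
    echo $(( kbps * 1000 / 8 * minutes * 60 / 1000000 ))
}

# Convert every MP4 in a folder to a 64 kbps M4A next to it,
# skipping videos that already have a converted copy.
convert_folder() {
    dir=$1
    for video in "$dir"/*.mp4; do
        [ -e "$video" ] || continue        # no MP4s matched the glob
        audio="${video%.mp4}.m4a"
        [ -e "$audio" ] && continue        # already converted
        # -nostdin stops ffmpeg from reading the terminal inside the loop.
        ffmpeg -nostdin -i "$video" -vn -c:a aac -b:a 64k "$audio"
    done
}
```

estimate_m4a_mb 64 60 prints 28, so an hour of speech at 64 kbps comes in under 30 MB. convert_folder leaves the original MP4 files in place, so you can delete them yourself once you have checked the audio.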

One practical note: Zoom cloud recordings can expire. Many accounts are set to delete recordings automatically after a retention period, commonly 30 days, unless the admin has extended it or you have paid for more storage. Check the retention setting on your account, and download recordings within a week of the meeting, not months later.

Choosing a transcription approach

You have three real options for Zoom transcription, each with different quality, cost, and cleanup-time tradeoffs. None of them is universally right.

Option 1: Zoom's built-in transcript

Available on paid Zoom plans at no extra cost, and it runs automatically if you enabled it in settings. The quality is good enough for a quick scan — it will catch most of what was said — but the speaker labels are based on Zoom's active-speaker detection, which means they work well for structured Q&A (one person speaks at a time, clearly identified) and break down for any meeting with crosstalk, people joining late, or participants who forget to unmute before speaking.

The labels come out as display names from the Zoom account, which is better than Speaker 1 — but only as accurate as whether everyone used their real name in their Zoom profile. Guests and external participants often show up as their account email prefix, not their name.

If your goal is just "did anything happen around minute 30," Zoom's built-in is fine. For any use case where the speaker attribution matters — action items, decisions, accountability — you will want something more reliable.

Option 2: AI transcription with speaker diarization

Tools like Otter.ai, Notta, and the entry-level plans from Rev will take your audio file and return a transcript with speaker labels — Speaker 1, Speaker 2, and so on. The word accuracy is meaningfully better than Zoom's built-in on most recordings. The diarization (speaker separation) works well for two or three people and degrades with each additional participant.

For a five-person meeting, expect to spend 20–40 minutes manually relabelling speakers before the transcript is usable for anything you would share with another person or cite in a document. For a two-person one-on-one, this approach is genuinely fine — the cleanup is under ten minutes and the cost is low, usually $8 to $20 per recorded hour on consumer plans.

The honest picture: if you are running large team meetings and want to archive dozens of recordings, the relabelling burden adds up fast.

Option 3: AI transcription with named-speaker labelling

Some newer tools can match voices to names rather than assigning them numbers. You supply the participant names at upload — either by pasting them from your meeting invite list, or by letting the tool pick up spoken introductions from the recording itself — and the transcript comes back with actual names on every line.

The cleanup time drops from 20–40 minutes per recording to about five minutes. You are still doing a QA pass, but it is verification rather than reconstruction.
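To see how quickly that difference adds up, here is the arithmetic for a hypothetical backlog of 30 recordings. The 30 is an assumption for illustration; the per-recording minutes are the estimates given above.

```shell
# Total cleanup minutes for a backlog: recordings x minutes per recording.
cleanup_minutes() {
    echo $(( $1 * $2 ))
}

backlog=30   # hypothetical number of archived recordings

echo "numbered speakers, low estimate:  $(cleanup_minutes "$backlog" 20) min"
echo "numbered speakers, high estimate: $(cleanup_minutes "$backlog" 40) min"
echo "named speakers:                   $(cleanup_minutes "$backlog" 5) min"
```

That works out to 600 to 1,200 minutes (10 to 20 hours) of relabelling, against 150 minutes (two and a half hours) of verification.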

This is what CleanScribe does. You can pre-fill names at upload from your attendee list, or if participants introduce themselves in the first few minutes of the recording the tool will anchor those names automatically.

If you want to try it on a meeting you already have: the free tier is 120 minutes per month, no credit card. That is two or three Zoom calls before you decide whether to upgrade.

Building a searchable archive

Getting a good transcript out of a meeting is half the work. The other half is making sure you can find it again six months from now. This is the part that most people skip, and it is also the part that makes the difference between a transcript being genuinely useful and being just another file you know exists somewhere.

One folder, one format. Create a single folder for all your meeting transcripts — ~/Documents/Meeting Transcripts/, or wherever makes sense in your system — and put every polished transcript there. Name each file YYYY-MM-DD short-title.md so they sort chronologically: 2026-05-11 Q2 planning.md, 2026-04-22 client kick-off — Meridian.md. The date prefix means you always know the order without opening anything.
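If you want the naming to be automatic rather than a habit, a small helper can build the filename for you. This is a minimal sketch: transcript_name is a made-up function, and replacing slashes and colons with hyphens is one possible choice for keeping titles filename-safe.

```shell
# Build "YYYY-MM-DD short-title.md" from a date and a title.
transcript_name() {
    date_part=$1
    title=$2
    # Reject anything that is not shaped like YYYY-MM-DD.
    case $date_part in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
        *) echo "date must be YYYY-MM-DD: $date_part" >&2; return 1 ;;
    esac
    # Slashes and colons are awkward in filenames, so swap them for hyphens.
    safe_title=$(printf '%s' "$title" | sed 's|[/:]|-|g')
    printf '%s %s.md\n' "$date_part" "$safe_title"
}
```

For example, transcript_name 2026-05-11 "Q2 planning" prints 2026-05-11 Q2 planning.md, which you can pass straight to mv when filing a new transcript.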

Full-text search across the lot. Once your transcripts are plain text files in a single folder, every search tool works on them. The Unix standby grep -ri "phrase" . finds any mention of a phrase across every transcript, ignoring case. ripgrep (rg -i "phrase") does the same job faster and handles large folders without slowing down. If you prefer a graphical interface, Obsidian treats a folder of Markdown files as a knowledge base with search built in, and its cross-linking feature means you can connect related meetings over time.
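Two small wrappers around grep cover most day-to-day searches. They are a sketch with illustrative names (find_mentions, show_mentions), and they assume your transcripts are .md files named with the date prefix described above, so that a reverse sort puts the newest meeting first.

```shell
# List transcripts that mention a phrase (case-insensitive), newest first.
find_mentions() {
    phrase=$1
    dir=${2:-.}
    # -F treats the phrase as literal text, -l prints only the filenames.
    grep -rilF --include='*.md' -- "$phrase" "$dir" | sort -r
}

# Show every matching line with its file name and line number.
show_mentions() {
    phrase=$1
    dir=${2:-.}
    grep -rinF --include='*.md' -- "$phrase" "$dir"
}
```

Start with find_mentions "go-live date" to see which meetings touched the topic, then use show_mentions with the same phrase to jump to the exact lines.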

Cross-link related meetings. When a meeting references a decision from a previous one — "as we agreed in April" — link the two Markdown files. A single line at the bottom of the transcript, See also: [[2026-04-22 client kick-off — Meridian]], builds compounding context. Six months from now, when you are tracing the history of a decision, those links are the difference between finding it in two minutes and spending half an hour piecing it together from memory.
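Those links are also searchable in the other direction. The sketch below lists every transcript that points at a given meeting, assuming you use the [[double bracket]] link style shown above; backlinks is a hypothetical helper name.

```shell
# List transcripts that link to a given transcript via a [[...]] link.
backlinks() {
    target=$1            # transcript name without the .md extension
    dir=${2:-.}
    # -F matches the brackets literally instead of as a regular expression.
    grep -rlF --include='*.md' -- "[[${target}]]" "$dir" | sort
}
```

Calling backlinks "2026-05-11 Q2 planning" from inside your transcript folder shows every later meeting that referred back to that call.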

Keep the audio alongside. Transcripts are faster to read than audio is to scrub, but a transcript occasionally strips out something that only makes sense when you hear the tone — a joke that looked like a serious proposal in text, a decision that was clearly tentative from the speaker's hesitation, a quote where the emphasis changes the meaning. Keep the M4A in the same folder as the transcript, named with the same date and title. Storage is cheap; the audio is your source of truth for anything ambiguous.
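A quick way to confirm that every transcript still has its audio is to look for a same-named M4A next to each one. This is a minimal sketch; missing_audio is an illustrative name, and it assumes the transcript and audio differ only in their extension.

```shell
# Print transcripts that have no same-named .m4a file beside them.
missing_audio() {
    dir=${1:-.}
    for transcript in "$dir"/*.md; do
        [ -e "$transcript" ] || continue     # folder has no transcripts
        [ -e "${transcript%.md}.m4a" ] || printf '%s\n' "$transcript"
    done
}
```

An empty result means every transcript in the folder has its source recording alongside it.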

A short checklist

Recording defaults (one-time setup):

  • Cloud recording + audio transcript enabled
  • Recording goes to a single named folder
  • Meeting titles include date + attendees + topic

Per meeting:

  • Attendees on video and unmuted when speaking
  • Active speaker indicator visible in the recording
  • Recording downloaded the same day, before the cloud expiry kicks in

Per transcript:

  • Speakers labelled by name (not Speaker 1/2/3)
  • Filename matches YYYY-MM-DD short-title.md
  • Stored in your single transcript folder, alongside the audio
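The per-transcript items can be spot-checked with a short script. This is a sketch under two assumptions: transcripts are .md files in one folder, and each spoken line starts with a label followed by a colon (for example "Maya: ..."), so a leftover "Speaker 2:" is easy to detect. The function name audit_transcripts is made up for illustration.

```shell
# Flag transcripts that break the per-transcript checklist.
audit_transcripts() {
    dir=${1:-.}
    for f in "$dir"/*.md; do
        [ -e "$f" ] || continue
        name=${f##*/}                        # filename without the folder
        # Filename should look like "YYYY-MM-DD short-title.md".
        case $name in
            [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\ ?*.md) ;;
            *) printf 'bad filename: %s\n' "$name" ;;
        esac
        # Generic diarization labels such as "Speaker 2:" at the start of a line.
        if grep -Eq '^Speaker [0-9]+:' "$f"; then
            printf 'unnamed speakers: %s\n' "$name"
        fi
    done
}
```

Run audit_transcripts on your transcript folder after filing a batch. No output means every file passed both checks.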

Where CleanScribe fits

CleanScribe was built for long-form recordings with multiple speakers, where who said what matters as much as what was said. Three things we did differently:

  1. Named speakers, not numbers. When attendees introduce themselves on the recording, we use that name. You can also pre-fill names at upload from your meeting invite list.
  2. Long files in one pass. Up to 8 hours per upload, no splitting. A four-hour quarterly review goes in as one file and comes out as one transcript.
  3. Polished prose, not a recording in text. We strip the umms, the false starts, and the repeated half-sentences so the transcript reads as a conversation. The meaning stays; the noise goes. The original audio is still there if you want to listen back to a specific quote.

The free tier is 120 minutes per month. No credit card. Try it on one of your existing Zoom recordings, and compare what comes back to whatever you'd get from Zoom's built-in transcript.

Start free at cleanscribe.ai/for/meetings


Have a meeting-archive workflow tip we should add to this guide? Email us — we update this piece as new tools and techniques become standard.