Podcast

AI podcast video editing,
driven by the transcript

A podcast edit is mostly listening. You find the right moment, name the speaker, tighten the silences and level the mix. EditAssist does that work for you. It transcribes the dialogue, separates the voices and cuts by meaning, then masters to spec and pulls the clips, all on your machine.

01

The podcast workflow

Drop in the camera files and the clean recorder mix. The agent ingests everything, transcribes the dialogue, and runs speaker diarization so the timeline already knows who is talking. From there you work in plain English: ask for a rough cut from the transcript, tell it to drop the tangents, and it assembles the episode around the answers that matter.

Then it tightens. Silences over a threshold are removed with crossfades, the cameras are auto-synced into a multicam sequence, and the mix is normalised to your loudness target with music ducked under the voice. When the long-form cut is locked, it pulls the strongest moments and reframes them to 9:16 for social. One episode gives you every deliverable.
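As a rough illustration of the silence pass, here is a minimal sketch that finds spans quieter than a threshold for longer than a minimum duration. It is not EditAssist's implementation: the function name, the per-frame level input and the default values are all assumptions made for the example.

```python
def find_silences(levels_db, frame_s, threshold_db=-45.0, min_silence_s=2.0):
    """Return (start_s, end_s) spans where the level stays below threshold_db
    for at least min_silence_s.

    levels_db holds one level reading per fixed-size frame (hypothetical input);
    frame_s is the frame length in seconds.
    """
    spans = []
    run_start = None
    # A loud sentinel frame (0 dB) closes any silent run that reaches the end.
    for i, level in enumerate(list(levels_db) + [0.0]):
        if level < threshold_db:
            if run_start is None:
                run_start = i  # a silent run begins here
        elif run_start is not None:
            if (i - run_start) * frame_s >= min_silence_s:
                spans.append((run_start * frame_s, i * frame_s))
            run_start = None
    return spans


# Half-second frames: speech, three seconds of silence, speech, a short pause.
levels = [-20.0] * 4 + [-60.0] * 6 + [-20.0] * 2 + [-60.0] * 2 + [-20.0] * 2
silences = find_silences(levels, frame_s=0.5)
```

Only the three-second gap is returned; the one-second pause is left alone. A real pass would also leave a little room on each side of the cut for the crossfade.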

02

Prompts for podcast editors

  • Build a rough cut from the transcript: drop every tangent and keep the answers about the launch.

  • Identify every speaker in this four-guest roundtable and label the timeline markers with their names.

  • Detect silence longer than two seconds and remove it with short crossfades.

  • Auto-sync the two ISO cameras and the clean mix, then flag any audio drift.

  • Normalise the full episode to -23 LUFS and duck the music under the dialogue.

  • Find the three best soundbites under 30 seconds and cut them as vertical clips for Reels.

  • Generate chapter markers with titles at each topic change and export them for YouTube.

  • Find every place a guest says 'um' or 'uh' and remove them automatically.

Hundreds more in the prompt library.
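To make the chapter-marker prompt concrete, here is a small sketch of what a YouTube-style export could look like: one timestamp and title per line, pasted into the video description. The function name and the marker format are assumptions for illustration, not EditAssist's actual export.

```python
def youtube_chapters(markers):
    """Format (start_seconds, title) pairs as description lines.

    Uses m:ss below one hour and h:mm:ss from one hour on.
    """
    lines = []
    for start, title in sorted(markers):
        hours, rest = divmod(int(start), 3600)
        minutes, seconds = divmod(rest, 60)
        if hours:
            stamp = f"{hours}:{minutes:02d}:{seconds:02d}"
        else:
            stamp = f"{minutes}:{seconds:02d}"
        lines.append(f"{stamp} {title}")
    return "\n".join(lines)


chapters = youtube_chapters([(0, "Intro"), (95, "Funding"), (3725, "Wrap-up")])
```

YouTube expects the first chapter to start at 0:00, so the first marker should sit at the head of the episode.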

03

What powers a podcast edit

Speaker diarization & identification

Diarization and voice embeddings separate every speaker, then link them to named profiles. The agent labels who is talking and when, right across the episode.
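Linking a voice to a named profile usually comes down to comparing embeddings. The sketch below assumes each segment and each profile is already a numeric vector and picks the closest profile by cosine similarity. The names, the three-number vectors and the 0.7 cut-off are illustrative assumptions, not values taken from EditAssist.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def label_segment(embedding, profiles, min_similarity=0.7):
    """Return the best-matching profile name, or "Unknown" if nothing is close."""
    best_name, best_score = None, -1.0
    for name, reference in profiles.items():
        score = cosine(embedding, reference)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_similarity else "Unknown"


# Toy profiles; real voice embeddings have hundreds of dimensions.
profiles = {"Host": [1.0, 0.0, 0.0], "Guest": [0.0, 1.0, 0.0]}
who = label_segment([0.9, 0.1, 0.0], profiles)
```

The threshold is what keeps an unfamiliar voice from being forced onto the nearest known name.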

Transcript-driven cutting

Whisper transcribes every word in 99 languages. You edit by meaning rather than waveform: keep the answers about funding, drop the small talk, and the agent rebuilds the timeline.
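Here is a simplified picture of how a timed transcript can turn into a cut list. The agent works from meaning; this sketch swaps in plain keyword matching as a stand-in, and the segment format, function name and one-second merge gap are assumptions for the example.

```python
def keep_ranges(segments, keywords, max_gap_s=1.0):
    """Return merged (start_s, end_s) ranges for segments that mention a keyword.

    segments is a list of (start_s, end_s, text) tuples from a timed transcript.
    Kept segments separated by at most max_gap_s are joined into one range.
    """
    wanted = [k.lower() for k in keywords]
    ranges = []
    for start, end, text in sorted(segments):
        if not any(k in text.lower() for k in wanted):
            continue  # off-topic segment, leave it out of the cut
        if ranges and start - ranges[-1][1] <= max_gap_s:
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], end))
        else:
            ranges.append((start, end))
    return ranges


segments = [
    (0.0, 4.0, "Nice weather today"),
    (4.0, 9.0, "We closed our funding round"),
    (9.5, 15.0, "The funding came from two investors"),
    (30.0, 35.0, "Funding aside, thanks for coming"),
]
cut_list = keep_ranges(segments, ["funding"])
```

The two back-to-back answers merge into one range and the small talk at the top is dropped, which is the shape of edit the timeline is rebuilt from.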

Loudness & auto-duck

Sync sound to picture, normalise to a broadcast or streaming loudness target such as -23 LUFS (the EBU R 128 broadcast level), check the mix against spec, and duck music automatically under the voice.
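The arithmetic behind loudness normalisation is short. Assuming the integrated loudness of the episode has already been measured, the gain is the difference between target and measurement, and a spec check confirms the peak stays under the ceiling after that gain. The function names are hypothetical; the -1 dBTP ceiling is the EBU R 128 maximum true peak.

```python
def gain_to_target(measured_lufs, target_lufs=-23.0):
    """Return (gain_db, linear_factor) that moves measured loudness to the target."""
    gain_db = target_lufs - measured_lufs
    return gain_db, 10 ** (gain_db / 20)


def within_peak_ceiling(true_peak_dbtp, gain_db, ceiling_dbtp=-1.0):
    """Check that the true peak stays at or below the ceiling after the gain."""
    return true_peak_dbtp + gain_db <= ceiling_dbtp


# An episode measured at -19 LUFS needs 4 dB of attenuation to reach -23 LUFS.
gain_db, factor = gain_to_target(-19.0)
ok = within_peak_ceiling(true_peak_dbtp=-0.5, gain_db=gain_db)
```

Turning a quiet mix up works the same way with a positive gain, which is where the peak check starts to matter.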

Multi-cam auto-sync

Two cameras or six, plus a separate clean recorder, all auto-synced from audio and clap points into one multicam sequence, ready to cut.
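Syncing from audio generally means sliding one recording against another until the waveforms line up. The brute-force sketch below shows the idea on a toy clap; it is an assumption about the general technique rather than EditAssist's code, and real tools would use a faster FFT-based correlation on long files.

```python
def best_offset(reference, other, max_lag):
    """Return the lag in samples at which `other` best lines up with `reference`.

    A positive lag means `other` started recording later than `reference`:
    other[i] corresponds to reference[i + lag].
    """
    def score(lag):
        total = 0.0
        for i, sample in enumerate(other):
            j = i + lag
            if 0 <= j < len(reference):
                total += sample * reference[j]
        return total

    return max(range(-max_lag, max_lag + 1), key=score)


# The same clap lands at sample 3 on the recorder and sample 1 on the camera.
recorder = [0.0, 0.0, 0.0, 1.0, 0.5, 0.0, 0.0, 0.0]
camera = [0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
lag = best_offset(recorder, camera, max_lag=4)
```

Here the camera started two samples after the recorder, so its clip is slid two samples later on the timeline. Repeating the measurement near the end of a long take is one way to spot drift.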

04

Your guests stay private, and it's cheap to run

Transcription, diarization, search indexing and clip selection all run on your own machine. Nothing about your recordings is uploaded. The only thing that leaves is the text of your conversation with the agent. EditAssist is free to download, the local models are free, and new accounts start with £15 of credit, with no card required. Cloud models are pay-as-you-go, or £20/mo on Starter and £100/mo on Pro if you want included credit and perks.

05

Getting started

  1. Download EditAssist for macOS or Windows and create an account. You get £15 credit and no card is needed.
  2. Point it at your episode folder, and it ingests, transcribes and diarizes automatically.
  3. Ask for a transcript-driven rough cut, then refine in plain English.
  4. Normalise loudness, lock the cut and pull social clips, all in one session.

More workflows: YouTube creators, documentary editing, social clips. Or see everything EditAssist connects to and browse the prompt library.