Cut by timestamps (with separate clean audio)

Tutorial: you already know your cut — list the timestamp ranges in plain language and the agent stitches, syncs your clean audio, and finishes the edit.


Sometimes you don't need the agent's editorial judgment — you watched the footage, you know exactly which parts stay. Maybe you also recorded audio separately (a lav or USB mic) and want the polished track under the cut instead of the camera's audio. This tutorial gets you from "keep these ranges" to a finished, captioned cut in about 5 minutes: you supply the timestamps, the agent does the stitching, audio sync, and finishing.

What you'll make

  • The main track rebuilt from exactly the ranges you listed, in order
  • Your separately-recorded clean audio cut to the same ranges and kept in sync
  • Camera audio muted
  • Optionally compressed to a hard time budget, then captioned, reframed, and verified

Before you start

  • Import the video and (if you have one) the clean audio file into the Media panel. Name the audio something obvious like clean_audio.wav — the agent auto-detects a separate audio upload, and a clear name removes any guesswork.
  • Have your timestamps ready. They should be source times — positions in the original file, the ones you noted while watching it.

Step 1: One prompt with your ranges

Write the ranges in plain language — no special syntax needed:

Keep 0:00–0:35 and 0:46–0:54, and 4:23 to the end. Use my clean_audio.wav instead of the camera audio. Then captions and vertical.

What the agent actually does

  1. Stitches your ranges. keep_ranges converts your timestamps into a range list (internally a JSON array of {startSeconds, endSeconds} pairs via its rangesJson parameter — "4:23 to the end" becomes an open-ended last range) and rebuilds the main track from exactly those pieces, in order.
  2. Mirrors the cut onto your clean audio. By default (syncAudio: true), the same source-time ranges are stitched onto your separately-uploaded audio, so the clean track lines up with the video cut piece for piece. This works because you recorded both from the same take — the same source times describe the same moments.
  3. Kills the camera audio. mute_video_audio disables the embedded audio on the video clips so only your clean track plays.
  4. Finishes. Captions timed to speech, a vertical reframe if you asked for one, and the standard self-checks (critique_edit, verify, review_edit) before it concludes. See How the agent checks its work.
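
To make step 1 concrete, here is a minimal Python sketch of turning the prompt's timestamps into a range list. Only the {startSeconds, endSeconds} shape comes from this tutorial. The helper names (to_seconds, build_ranges), the 4:58 source length, and the choice to close "to the end" with the source duration are assumptions for illustration, not the agent's actual parser.

```python
import json

def to_seconds(stamp):
    """Convert 'SS', 'M:SS' or 'H:MM:SS' into seconds."""
    seconds = 0.0
    for part in stamp.strip().split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

def build_ranges(pairs, source_duration):
    """pairs: (start, end) strings in source time; end=None means 'to the end'."""
    ranges = []
    for start, end in pairs:
        start_s = to_seconds(start)
        # Assumption: an open-ended range is closed with the source duration.
        end_s = source_duration if end is None else to_seconds(end)
        if not 0 <= start_s < end_s <= source_duration:
            raise ValueError(f"range {start}-{end} is empty or outside the source")
        ranges.append({"startSeconds": start_s, "endSeconds": end_s})
    return ranges

# "Keep 0:00-0:35 and 0:46-0:54, and 4:23 to the end", for an assumed 4:58 source.
ranges = build_ranges(
    [("0:00", "0:35"), ("0:46", "0:54"), ("4:23", None)],
    source_duration=298.0,
)
ranges_json = json.dumps(ranges)
total_kept = sum(r["endSeconds"] - r["startSeconds"] for r in ranges)
```

With that assumed source length, the three ranges add up to 78 seconds of kept material.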

What the timeline looks like after

Three video clips on the main track (your three ranges, back to back, no gaps), matching clean-audio clips on an audio track, captions above. The camera audio still exists inside the video clips — it's muted, not deleted, so you can bring it back.

Step 2: Hit a hard time budget

Your picks add up to 78 seconds but the slot is 60? Don't re-derive the ranges:

Compress this to 60 seconds.

compress_to_duration closes any gaps, then speeds clips up slightly (up to a safe 1.25× by default, below where retiming becomes audible), and trims the tail only if that still isn't enough. Whenever the speed-up alone reaches the target, all of your selected content survives; it just plays tighter.
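
The budget arithmetic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not compress_to_duration's implementation: plan_compression is a hypothetical helper, gaps are taken as already closed, one uniform speed is applied to every clip, and the tail trim is measured in timeline seconds after retiming.

```python
def plan_compression(content_seconds, target_seconds, max_speed=1.25):
    """Return (speed, tail_trim_seconds) needed to fit a hard time budget."""
    if content_seconds <= target_seconds:
        return 1.0, 0.0  # already fits, nothing to do
    needed = content_seconds / target_seconds
    if needed <= max_speed:
        return needed, 0.0  # speed-up alone reaches the target
    # Speed is capped, so whatever still overshoots comes off the tail.
    played = content_seconds / max_speed
    return max_speed, played - target_seconds

speed, tail_trim = plan_compression(78.0, 60.0)
gentle_speed, gentle_trim = plan_compression(70.0, 60.0)
```

For the 78-second example, reaching 60 seconds would need 1.3×, which is over the default cap. At 1.25× the cut plays in 62.4 seconds, so under these assumptions about 2.4 seconds would come off the tail. A 70-second cut fits with speed alone.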

Refinements

Adjust one boundary without redoing the cut:

Extend the second range — end it at 0:57 instead of 0:54, the sentence gets clipped.

Reorder:

Put the 4:23 section first — it's the strongest opener.

Video-only recut (you already fixed the audio and don't want it touched):

Recut the video to these ranges but leave the audio track alone.

That's keep_ranges with syncAudio: false.

Audio recovery. If you trimmed the video manually on the timeline and the clean audio has drifted out of step:

The clean audio is out of sync with my cuts — rebuild it to match the video track.

This is keep_ranges in syncAudioOnly mode: it mirrors the main track's current trims onto the audio without touching the video.
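
The three keep_ranges variants in this tutorial differ only in their arguments. The Python sketch below lays them side by side. rangesJson and syncAudio are named in this tutorial; treating syncAudioOnly as a boolean flag, and the keep_ranges_args helper itself, are assumptions made for illustration rather than the agent's documented call shape.

```python
import json

def keep_ranges_args(ranges=None, sync_audio=True, sync_audio_only=False):
    """Build a hypothetical argument dict for a keep_ranges call."""
    if sync_audio_only:
        # Audio recovery: mirror the main track's current trims onto the
        # audio, so no new ranges are supplied. (Assumed flag shape.)
        return {"syncAudioOnly": True}
    if not ranges:
        raise ValueError("ranges are required unless sync_audio_only is set")
    return {"rangesJson": json.dumps(ranges), "syncAudio": sync_audio}

ranges = [
    {"startSeconds": 0.0, "endSeconds": 35.0},
    {"startSeconds": 46.0, "endSeconds": 54.0},
]

full_cut = keep_ranges_args(ranges)                    # video and clean audio together
video_only = keep_ranges_args(ranges, sync_audio=False)  # leave the audio track alone
audio_recovery = keep_ranges_args(sync_audio_only=True)  # rebuild audio to match video
```

You never write these by hand; the agent picks the mode from your prompt. The sketch only shows which request maps to which setting.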

Make it yours

  • No separate audio? Just list the ranges — "keep 1:10–1:45 and 2:30–2:50" — and skip the audio instructions. The camera audio stays.
  • Timestamps from elsewhere. Ranges from a YouTube chapter list, a producer's notes doc, or a transcript review all work — paste them in whatever format you have and the agent parses them.
  • Delegate the last mile. After the stitch: "Now make it feel produced — captions, a quiet music bed, and a hook from something I actually say."
  • Repeat weekly? If you cut the same show format every week, save the run as a recipe and re-apply it — you'll only need to supply new timestamps.

Troubleshooting

The clean audio doesn't line up with the video. Range-mirroring assumes both files start at the same real-world moment. If you hit record on the audio recorder a few seconds before the camera, there's a constant offset — tell the agent: "the clean audio starts 2.5 seconds before the video — shift it and re-sync." For a visual check, compare the two waveforms on the timeline at a sharp sound like a clap.
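
A constant offset is simple arithmetic: if the audio recorder started first, every moment sits later in the audio file than in the video file by the same amount. A minimal Python sketch, assuming the {startSeconds, endSeconds} range shape described earlier; shift_for_audio is a hypothetical helper, not an agent tool.

```python
def shift_for_audio(ranges, audio_lead_seconds):
    """Map video source-time ranges onto an audio file that started recording
    audio_lead_seconds before the camera (negative if it started after)."""
    shifted = []
    for r in ranges:
        start = r["startSeconds"] + audio_lead_seconds
        end = r["endSeconds"] + audio_lead_seconds
        if start < 0:
            # The audio recorder was not running yet for this part of the video.
            raise ValueError("range starts before the audio recording begins")
        shifted.append({"startSeconds": start, "endSeconds": end})
    return shifted

video_ranges = [
    {"startSeconds": 0.0, "endSeconds": 35.0},
    {"startSeconds": 46.0, "endSeconds": 54.0},
]
audio_ranges = shift_for_audio(video_ranges, audio_lead_seconds=2.5)
```

With a 2.5-second lead, the video's 0:00 to 0:35 corresponds to 0:02.5 to 0:37.5 in the audio file.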

The agent stitched the wrong ranges. Timeline time and source time diverge the moment anything is cut. If footage was already trimmed before you ran keep_ranges, your noted timestamps may point at the wrong material. Undo, place the full original clip fresh, and give the ranges again — or state them relative to what's on the timeline ("keep the first 35 seconds of what's there now").
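
To see how far the two clocks drift apart, this Python sketch lays the tutorial's three ranges back to back and maps a timeline position to its source time. The helpers and field names (timeline_layout, timeline_to_source, timelineStart, sourceStart) are hypothetical, and the 4:58 source length is an assumed value.

```python
def timeline_layout(ranges):
    """Place kept source ranges back to back, with no gaps, starting at 0."""
    layout, cursor = [], 0.0
    for r in ranges:
        length = r["endSeconds"] - r["startSeconds"]
        layout.append({
            "timelineStart": cursor,
            "timelineEnd": cursor + length,
            "sourceStart": r["startSeconds"],
        })
        cursor += length
    return layout

def timeline_to_source(layout, t):
    """Return the source time that plays at timeline time t."""
    for clip in layout:
        if clip["timelineStart"] <= t < clip["timelineEnd"]:
            return clip["sourceStart"] + (t - clip["timelineStart"])
    raise ValueError("timeline time is outside the cut")

layout = timeline_layout([
    {"startSeconds": 0.0, "endSeconds": 35.0},
    {"startSeconds": 46.0, "endSeconds": 54.0},
    {"startSeconds": 263.0, "endSeconds": 298.0},  # 4:23 to an assumed 4:58 end
])
source_at_40 = timeline_to_source(layout, 40.0)
source_at_50 = timeline_to_source(layout, 50.0)
```

In this cut, timeline 0:40 plays source 0:51 and timeline 0:50 plays source 4:30, which is why timestamps noted against the original file stop matching once anything has been trimmed.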

You can still hear the camera audio underneath. If you added video to the timeline after the mute step, the new clips arrived unmuted. Say "mute the embedded audio on all video clips" — and add "including overlays" if you have picture-in-picture video, since overlay clips are skipped by default.

Compression made speech sound rushed. The default 1.25× cap is usually inaudible, but slow, deliberate speakers show it sooner. Ask for a gentler cap ("compress to 60s but don't retime past 1.1×; trim the tail instead") or drop one range and keep everything at natural speed.