Build a UGC ad from product footage

Tutorial: turn raw product clips into a ~35-second direct-response UGC ad with a benefit-led hook and CTA — then save it as a recipe and batch it.


You have a folder of raw product clips — someone talking to camera about the product, a few close-ups, maybe a screen capture. This tutorial turns that into a ~35-second, portrait, direct-response UGC ad with a hook, captions, B-roll, and a call to action, in about 10 minutes. Then you save the whole edit as a recipe and re-run it on the next product's footage in one click.

What you'll make

A ~35-second vertical ad with:

  • A benefit-led hook quoting the speaker's actual words
  • Product-focused vertical framing
  • Product B-roll cut in from your own unused clips
  • Boxed-contrast captions and a clean, product-friendly grade (the ugc-ad style pack)
  • A CTA overlay near the end — "Try it today" by default, or your own text
  • A balanced voice/music mix

Before you start

  • Import everything into the Media panel: the talking-to-camera clip plus every product close-up and detail shot you have. Clips you don't place on the timeline still matter — the agent uses unplaced bin assets as B-roll.
  • Give assets recognizable file names (bottle-closeup.mp4 beats IMG_4021.mp4) — B-roll matching uses asset names and the transcript.

Step 1: One prompt

Make a UGC ad from this footage — about 35 seconds, portrait. Open on the strongest benefit she actually says, use my product close-ups as B-roll, and end with a "Start your free trial" CTA.

If you're happy with the default CTA text ("Try it today"), leave that part out.

What the agent actually does

  1. Storyboard. It states the ad's thesis — the one benefit the ad argues — and posts a storyboard card: hook, proof, payoff, CTA beats with source ranges. See Storyboard.
  2. Applies the ugc-ad creator template. This is the direct-response workflow in one move: portrait canvas, the ugc-ad style pack's clean product-friendly grade, the boxed-contrast caption skin (applied once captions exist), a ~35s target, fast pacing, and a CTA beat that defaults to "Try it today" a few seconds before the end. See Creator templates.
  3. Assembles and frames. It places your clips, flattens stray overlays, and runs smart_reframe_subject with focus: product — product shots get safer, lower framing so the product isn't cropped out of a 9:16 frame.
  4. Cuts in B-roll. insert_broll_from_assets with goal: product matches your unused bin assets against the transcript and overlays them at the moments the product is being described — by default up to 4 placements of ~2.5 seconds each. Your close-ups land when the speaker says the thing they show.
  5. Writes the hook from the transcript. The hook overlay quotes or tightly paraphrases the speaker's best benefit line ("I stopped losing receipts the first week"). The agent is explicitly forbidden from using stock hooks like "You won't believe this". If the footage doesn't contain a hook, it won't invent one.
  6. Captions and sound. Captions timed to speech in the boxed-contrast skin, then auto_sound_design: voice at 0 dB, music bed ducked to -18 dB, 0.25s fades.
  7. Self-checks. It runs critique_edit, then verify, then review_edit (an AI watch-through of the rendered ad) before it's allowed to finish. See How the agent checks its work.
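
If the mix levels in step 6 feel abstract, it can help to read them as plain volume multipliers. The sketch below uses the standard decibel-to-amplitude conversion. The function names are made up for illustration, and the linear fade shape is an assumption, since the tutorial only gives the fade length (0.25s), not its curve.

```python
def db_to_gain(db: float) -> float:
    """Convert a level in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)


def fade_gain(t: float, duration: float, fade: float = 0.25) -> float:
    """Linear fade-in/fade-out envelope at time t (seconds) for a clip of
    the given duration. Assumed shape: the real tool may use another curve."""
    if t < 0 or t > duration:
        return 0.0
    # Ramp up over the first `fade` seconds, down over the last `fade` seconds.
    return min(1.0, t / fade, (duration - t) / fade)


voice_gain = db_to_gain(0.0)    # 1.0: voice plays at full level
music_gain = db_to_gain(-18.0)  # roughly 0.126: music at about an eighth of the voice amplitude
```

In other words, a bed ducked to -18 dB sits at roughly 13% of the voice's amplitude, which is why it reads as background rather than competing with the speaker.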

What the timeline looks like after

Main track: the talking segments, tightly cut. Overlay track: 2–4 short product B-roll clips and the hook/CTA text. Caption track above. Portrait canvas, music bed running the full length.

Step 2: Refine conversationally

Different hook line:

Open on the line where she says "it pays for itself in a week" instead.

Change the CTA:

Change the CTA to "Link in bio — 20% off this week" and give it the last 4 seconds.

More product, less face:

Add one more product B-roll shot around the 20-second mark, over the sentence about the strap.

Pace:

This drags in the middle. Cut it to 30 seconds — keep the hook and the CTA untouched.

Step 3: Save it as a recipe and batch the next product

This is where the workflow pays off for anyone shipping ads weekly. A recipe is the saved sequence of tool calls — with their exact parameters — that produced this edit.

  1. In the agent panel, save the run as a recipe (name it something like ugc-ad-v1).
  2. Open a new project with the next product's footage.
  3. Apply the recipe. The same edit sequence — template, product reframe, B-roll matching, captions, sound design, checks — replays against the new footage.

Recipes are global across projects. Iterate on one ad until the structure converts, then stamp it onto every product in the catalog. Details: Saving and replaying recipes.
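
One way to picture a recipe is as an ordered list of tool calls that gets replayed step by step. The sketch below is only a mental model, not the app's actual file format. The tool names and the focus and goal parameters come from this tutorial; the dict layout, the replay function, and the handler mapping are hypothetical. The template and caption steps are left out because the tutorial doesn't name their tools.

```python
from typing import Any, Callable

# Hypothetical recipe layout: ordered tool calls with their saved parameters.
RECIPE = [
    {"tool": "smart_reframe_subject", "params": {"focus": "product"}},
    {"tool": "insert_broll_from_assets", "params": {"goal": "product"}},
    {"tool": "auto_sound_design", "params": {}},
    {"tool": "critique_edit", "params": {}},
    {"tool": "review_edit", "params": {}},
]

Handler = Callable[[dict[str, Any]], Any]


def replay(recipe: list[dict[str, Any]], handlers: dict[str, Handler]) -> list[str]:
    """Run each saved step in order against a new project's handlers and
    return the names of the tools that ran."""
    log = []
    for step in recipe:
        name = step["tool"]
        if name not in handlers:
            raise KeyError(f"no handler for tool {name!r}")
        # Pass a copy so a handler cannot mutate the saved recipe.
        handlers[name](dict(step["params"]))
        log.append(name)
    return log
```

The point of the model: the parameters are frozen, but the footage is not. Replaying the same steps on a new bin gives the same structure with different clips, which is why consistent asset naming across products matters.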

Make it yours

  • Founder-voice variant. "Use the warm-founder style pack instead" swaps the grade for a warmer, less produced look.
  • No hook overlay. Some ad accounts prefer a cold open: "Skip the hook text — let the first line carry it."
  • Stock B-roll gap-filler. Missing a lifestyle shot? "Pull one stock clip of someone unpacking a delivery box and use it as B-roll at the top." The agent imports stock to the bin and places it — see Stock media.
  • Multiple lengths. After the 35s master: "Give me a 15-second cutdown that keeps only the hook and the CTA."

Troubleshooting

The B-roll landed at the wrong moments. B-roll placement matches asset names against transcript beats, so if your files are named IMG_4021.mp4, the match is guesswork. Rename the assets in the Media panel, or place them explicitly: "Put bottle-closeup at 0:12 and pour-shot at 0:21, 2.5 seconds each."
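
To see why file names matter so much, here is a toy version of name-to-transcript matching. It is a simplified illustration under stated assumptions, not how insert_broll_from_assets is implemented: the word-overlap scoring, the function names, and the transcript format (start time plus text) are all hypothetical. Only the defaults of up to 4 placements of 2.5 seconds come from this tutorial.

```python
import re


def name_tokens(filename: str) -> set[str]:
    """Split a file name into lowercase words, dropping the extension,
    digits, and very short fragments."""
    stem = filename.rsplit(".", 1)[0].lower()
    return {t for t in re.split(r"[^a-z]+", stem) if len(t) > 2}


def match_broll(assets, transcript, max_placements=4, clip_len=2.5):
    """Place each asset over the first transcript line that shares a word
    with its file name. transcript is a list of (start_seconds, text)."""
    placements = []
    used = set()
    for start, text in transcript:
        words = set(re.findall(r"[a-z]+", text.lower()))
        best, best_score = None, 0
        for asset in assets:
            if asset in used:
                continue
            score = len(name_tokens(asset) & words)
            if score > best_score:
                best, best_score = asset, score
        if best is not None:
            placements.append((best, start, start + clip_len))
            used.add(best)
        if len(placements) == max_placements:
            break
    return placements


assets = ["bottle-closeup.mp4", "pour-shot.mp4", "IMG_4021.mp4"]
transcript = [
    (3.0, "This bottle changed my mornings"),
    (12.0, "The pour is so clean"),
    (20.0, "Honestly I use it every day"),
]
placements = match_broll(assets, transcript)
# bottle-closeup lands at 3.0s and pour-shot at 12.0s; IMG_4021 never matches.
```

A name like IMG_4021.mp4 reduces to the single token "img", which no speaker is going to say, so the clip has nothing to anchor to.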

No B-roll appeared at all. insert_broll_from_assets only uses assets already imported to your bin — it doesn't generate or fetch anything on its own. Check that your close-ups are imported and not already placed on the timeline, or ask the agent to pull stock into the bin instead.

The hook feels generic. That usually means the footage never states a concrete benefit, so the agent had little to quote. Tell it exactly which line to use — or re-record a take where the speaker says the benefit in one clean sentence. A quoted hook always beats a written one.

The product gets cropped in vertical. Product reframing is heuristic, not pixel tracking. Say "the bottle is cut off at 0:14 — reframe that clip lower," or select the clip and adjust position manually in the properties panel.
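
For intuition about what a manual reframe changes, the sketch below computes the largest 9:16 crop window that stays inside the source frame while keeping a chosen focus point as close to center as possible. This is generic crop geometry, not the logic behind smart_reframe_subject; the function name and the focus-point inputs are assumptions for illustration.

```python
def portrait_crop(src_w, src_h, focus_x, focus_y, aspect=9 / 16):
    """Return (left, top, width, height) of the largest crop with the given
    width/height aspect, centered on the focus point where the frame allows."""
    # Use the full height unless the source is too narrow for that.
    crop_h = min(src_h, src_w / aspect)
    crop_w = crop_h * aspect
    # Center on the focus point, then clamp so the window stays in frame.
    left = min(max(focus_x - crop_w / 2, 0), src_w - crop_w)
    top = min(max(focus_y - crop_h / 2, 0), src_h - crop_h)
    return left, top, crop_w, crop_h


# 1920x1080 landscape source, product sitting right of center.
window = portrait_crop(1920, 1080, focus_x=1500, focus_y=700)
# window == (1196.25, 0, 607.5, 1080.0)
```

Note that a full-height crop from landscape footage has no vertical room to move, so shifting the framing lower only has an effect once the clip is scaled up or the source is taller than the crop.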
