How do I record a voiceover for a long YouTube video essay?
For a 20-to-60-minute essay, write the full script in chapters, record each chapter as its own VoiceOverAndOver project, drop the merged audio onto the edit timeline with chapter markers, and export captions in the same step.
Write in chapters, not as one wall
Structure the script in chapters of 3 to 8 minutes of finished audio each. Chapters line up with YouTube's chapter markers, and they map cleanly onto separate recording sessions. Each chapter is its own VoiceOverAndOver project, and each paragraph inside that chapter is its own row.
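If you want a rough check on chapter length before you record, word count is a usable proxy. The sketch below is not part of VoiceOverAndOver. It assumes a narration pace of 150 words per minute, which is only a starting guess, so time yourself reading a page and adjust. The function names and the sample chapter titles are made up for illustration.

```python
import re

WORDS_PER_MINUTE = 150           # assumed narration pace; measure your own and adjust
MIN_MINUTES, MAX_MINUTES = 3, 8  # the chapter range suggested above


def estimate_minutes(text: str, wpm: int = WORDS_PER_MINUTE) -> float:
    """Estimate the spoken length of a script chunk from its word count."""
    words = re.findall(r"\S+", text)
    return len(words) / wpm


def check_chapters(chapters: dict[str, str], wpm: int = WORDS_PER_MINUTE):
    """Return (title, estimated minutes, verdict) for each chapter, in order."""
    report = []
    for title, text in chapters.items():
        minutes = estimate_minutes(text, wpm)
        if minutes < MIN_MINUTES:
            verdict = "short"
        elif minutes > MAX_MINUTES:
            verdict = "long"
        else:
            verdict = "ok"
        report.append((title, round(minutes, 1), verdict))
    return report


if __name__ == "__main__":
    # Placeholder text: each "word " stands in for one word of real script.
    script = {
        "Cold open": "word " * 300,
        "The argument": "word " * 900,
        "The long tangent": "word " * 1500,
    }
    for title, minutes, verdict in check_chapters(script):
        print(f"{title}: ~{minutes} min ({verdict})")
```

A chapter flagged "long" is a candidate for splitting in two; one flagged "short" can often be folded into its neighbor.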
Record one chapter per sitting
You will not record a 45-minute essay in one go and have it sound good. Pick one chapter per session. Warm up your voice for two minutes, record the chapter, listen back, re-record the rows that landed wrong. Save the project. Come back tomorrow for the next chapter.
Read ahead, look up
Your eyes should run one paragraph ahead of your voice. That is the trick that keeps your delivery sounding like a person talking, not a person reading. Glance away from the script after a sentence and look at the wall or the mic for a second. The natural breath that comes with looking up reads as "thinking", which is exactly the energy a video essay wants.
Vary the rhythm
Long YouTube essays sound boring when every paragraph is the same length, the same pace, and the same energy. Vary your sentence lengths. Use deliberately short paragraphs as punctuation between long ones. The merge step preserves whatever rhythm you record.
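One way to spot flat passages before you record is to measure how much the sentence lengths in a paragraph vary. The sketch below is an illustration, not a VoiceOverAndOver feature. The sentence splitter is deliberately naive (it will miscount around abbreviations such as "e.g."), and the threshold of 3 words is an arbitrary assumption to tune by ear.

```python
import re
import statistics


def sentence_lengths(paragraph: str) -> list[int]:
    """Word count of each sentence, splitting after ., ! or ? followed by whitespace."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]
    return [len(s.split()) for s in sentences]


def is_monotonous(paragraph: str, min_spread: float = 3.0) -> bool:
    """True when sentence lengths barely vary (population std dev below min_spread)."""
    lengths = sentence_lengths(paragraph)
    if len(lengths) < 2:
        return False  # a one-sentence paragraph is punctuation, not monotony
    return statistics.pstdev(lengths) < min_spread


if __name__ == "__main__":
    flat = "The cat sat on the mat. The dog lay on the rug. The bird sat on the wire."
    varied = ("Stop. Now think about every essay you have abandoned halfway "
              "through because the narrator never changed gear. Why?")
    print(is_monotonous(flat), is_monotonous(varied))
```

Treat a flagged paragraph as a prompt to read it aloud, not as a verdict. Some evenly paced paragraphs are exactly what the chapter needs.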
Markers and captions for the editor
Tick "Premiere/Resolve markers" and "SRT" when you export. Drop both files into your editor with the audio. Every paragraph becomes a labeled marker on the timeline; every cue becomes a synced caption. You will spend the editing pass cutting b-roll to the markers, not hunting for the spot where you say a particular line.
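For reference, an SRT file is plain text: a cue number, a start and end time written as HH:MM:SS,mmm, the caption text, and a blank line between cues. The sketch below builds such a file from a list of rows and their durations. It only illustrates the format and makes no claim about how VoiceOverAndOver produces its export. The `build_srt` helper, its input shape, and the back-to-back timing with no gaps are all assumptions.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(rows: list[tuple[str, float]]) -> str:
    """Build SRT text from (caption text, duration in seconds) pairs.

    Assumes rows play back to back, starting at 0, with no silence between them.
    """
    cues = []
    start = 0.0
    for index, (text, duration) in enumerate(rows, start=1):
        end = start + duration
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
        start = end
    return "\n".join(cues)  # a blank line separates consecutive cues


if __name__ == "__main__":
    print(build_srt([("Hello.", 2.5), ("Second line.", 4.0)]))
```

Opening an exported SRT in a text editor and comparing it against this shape is a quick way to check that the cue times line up with your markers.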
One last polish pass
Before you upload, listen to the full merged audio at 1.25x speed. Any passage that sounds rushed or unclear at 1.25x is a likely spot for viewers to drop off at normal speed. Re-record those paragraphs.