What to Prepare Before Ordering Transcription — A Checklist

March 27, 2026 · 5 min read

Transcription ordered without preparation most often ends in one of three ways: the transcriber comes back with questions that cause delays; the result is technically correct but does not meet requirements; or a sensitive recording ends up somewhere it should not be.

Fifteen minutes of preparation prevents these situations. Below is a checklist covering the four areas that determine the outcome. It applies both to commissioning a human transcriber and to using an automatic tool.

Block 1 — Recording Preparation

Verify Audibility

Before sending, play the recording at three points: beginning, middle, and end. What to look for: is the speaker audible without excessive amplification? Is background noise approximately constant, or are there sudden disruptive sounds? Do two people talk simultaneously for more than brief passages?

If quality is notably low, notify the transcriber in advance. The transcription will proceed, but error rates will be higher and review will take longer.

File Format

The most universal formats are MP3 and WAV; practically all major transcription tools and agencies accept them. Stereo files transcribe the same as mono, but an uncompressed stereo file is about twice the size of its mono version. Most tools process video files (MP4, MOV) directly; check with agencies or freelancers in advance.

An MP3 bitrate above 128 kbps does not noticeably improve transcription accuracy; it only increases file size and upload time.
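The effect of bitrate and channel count on file size can be estimated with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the WAV defaults (44.1 kHz, 16-bit PCM) are assumed typical values, not figures from any particular tool.

```python
def mp3_size_mb(duration_min: float, bitrate_kbps: int) -> float:
    """Approximate size of a constant-bitrate MP3 in megabytes (10^6 bytes)."""
    bits = bitrate_kbps * 1000 * duration_min * 60
    return bits / 8 / 1_000_000


def wav_size_mb(duration_min: float, channels: int = 1,
                sample_rate_hz: int = 44_100, bits_per_sample: int = 16) -> float:
    """Approximate size of uncompressed PCM audio, ignoring the small WAV header."""
    bytes_per_second = sample_rate_hz * channels * bits_per_sample // 8
    return bytes_per_second * duration_min * 60 / 1_000_000


# Compare a 60-minute recording in four variants.
for label, size in [
    ("MP3 128 kbps", mp3_size_mb(60, 128)),
    ("MP3 320 kbps", mp3_size_mb(60, 320)),
    ("WAV mono", wav_size_mb(60, channels=1)),
    ("WAV stereo", wav_size_mb(60, channels=2)),
]:
    print(f"{label}: {size:.1f} MB")
```

Under these assumptions, a 60-minute recording comes to about 58 MB as a 128 kbps MP3 and 144 MB at 320 kbps, while uncompressed WAV is roughly 318 MB in mono and 635 MB in stereo.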

Naming and Organisation

Consistent file naming saves hours of navigating transcripts: 2024-10-15_interview-smith.mp3 is significantly better than IMG_20241015_154233.m4a. Send multiple files as a ZIP or via shared storage — not as dozens of email attachments. If recordings follow a sequence, number them explicitly (01_, 02_, ...).
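A naming convention like this is easy to apply mechanically. The following is a minimal sketch, assuming a date_type-subject pattern with an optional two-digit sequence prefix; the function name and its parameters are hypothetical, not part of any tool.

```python
import re
from datetime import date
from typing import Optional


def transcript_filename(recorded: date, kind: str, subject: str,
                        ext: str, seq: Optional[int] = None) -> str:
    """Build a name like 2024-10-15_interview-smith.mp3, optionally prefixed 01_."""
    def slug(text: str) -> str:
        # Lowercase, then collapse anything that is not an ASCII letter or digit to "-".
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    name = f"{recorded.isoformat()}_{slug(kind)}-{slug(subject)}.{ext.lstrip('.').lower()}"
    # A zero-padded prefix keeps files in order when sorted alphabetically.
    return f"{seq:02d}_{name}" if seq is not None else name


print(transcript_filename(date(2024, 10, 15), "Interview", "Smith", "mp3"))
print(transcript_filename(date(2024, 10, 15), "Interview", "Smith", "mp3", seq=1))
```

ISO dates (year first) sort chronologically in any file browser, which is the main reason to prefer them over day-first formats.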

Block 2 — Specifying Requirements

Transcription Type

This is the most commonly overlooked specification and the one that causes the greatest mismatch between the order and the result:

Verbatim: every syllable exactly as spoken, including filler words, false starts, and incomplete sentences. Used for court records, phonetic research, or language corpora.
Edited: remove filler words and excessive repetitions, adjust punctuation for readability; preserve factual content. Suitable for the vast majority of business and research transcriptions.

Default assumptions differ among transcribers and tools — this choice must be specified.

Speakers

Provide the approximate number of speakers. If names should appear in the transcript, supply a list: Speaker 1 = John Smith, Speaker 2 = Jane Doe. If speakers are not identified, agree on codes (Speaker A, B) or on transcription without attribution.
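If a transcript arrives with generic labels, a supplied name list can be applied afterwards. This is a minimal sketch, assuming each turn starts on its own line in the form "Label: text"; that layout and the function name are assumptions, and real transcripts may be structured differently.

```python
import re


def apply_speaker_names(transcript: str, names: dict[str, str]) -> str:
    """Replace placeholder labels at the start of a line, e.g. 'Speaker 1:'."""
    if not names:
        return transcript
    labels = "|".join(re.escape(label) for label in names)
    # The lookahead (?=:) keeps "Speaker 1" from matching inside "Speaker 10:".
    pattern = re.compile(rf"^({labels})(?=:)", re.MULTILINE)
    return pattern.sub(lambda m: names[m.group(1)], transcript)


names = {"Speaker 1": "John Smith", "Speaker 2": "Jane Doe"}
text = "Speaker 1: Hello.\nSpeaker 2: Hi, thanks for the invitation.\n"
print(apply_speaker_names(text, names))
```

Only labels at the start of a line are replaced, so a label that happens to be mentioned inside someone's sentence is left alone.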

Language and Foreign-Language Passages

Are there foreign-language passages in the recording? Should they be transcribed in the original language, translated, or marked as [in English] without transcription? Are technical terms in English preserved in the original or localised?

Output Format

Need	Format
------	--------
Plain text for further processing	TXT
Formatted document	DOCX (Word)
Subtitles for video	SRT or VTT
Machine processing or database	JSON with timestamps
Spreadsheet overview	CSV

Also specify: are timestamps needed? At what granularity — for each word, each sentence, or each speaker segment? How should paragraphs be divided?
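Segment-level timestamps make it possible to derive one format from another. The sketch below converts timestamped segments into SRT cues. The JSON field names (start, end, speaker, text) are hypothetical; every tool uses its own schema, so treat this as an illustration rather than a converter for a specific product.

```python
import json


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Turn timestamped segments into SRT cues, numbered from 1."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        speaker = seg.get("speaker")
        text = f"{speaker}: {seg['text']}" if speaker else seg["text"]
        time_line = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        cues.append(f"{index}\n{time_line}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(cues)


# Hypothetical segment-level JSON with times in seconds.
raw = """[
  {"start": 0.0, "end": 2.5, "speaker": "Speaker A", "text": "Welcome."},
  {"start": 2.5, "end": 61.04, "text": "Thanks for having me."}
]"""
print(segments_to_srt(json.loads(raw)))
```

The conversion only works in this direction: word-level or segment-level timestamps can be collapsed into subtitles, but a plain TXT transcript cannot be turned into SRT without the timing data, which is why granularity is worth specifying up front.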

Glossary

Supply a list of proper names, institution names, products, and technical terms. A plain text file with a word list is sufficient. The impact is direct: it reduces error rates for less frequent words and eliminates many follow-up questions.

Block 3 — Data Protection

What the Recording Contains

A voice recording of an identifiable person is personal data under GDPR. If the recording also contains health information, trade secrets, or other sensitive content, the transcription method must match that sensitivity.

Cloud Tool — What to Verify

Three questions before sending a sensitive recording to a cloud tool:

Where are the servers? Processing on EU servers keeps the data inside the EU, so no international transfer is involved. US servers require verifying the legal basis for the data transfer.
How long does the tool retain files? Can they be deleted on request?
Are recordings used for model training? For commercial transcription tools the answer is usually no, but verify this in the terms of service rather than assuming.

Choosing a Method Based on Sensitivity

Low sensitivity (public lectures, non-confidential recordings): cloud transcription without special measures
Medium sensitivity (corporate meetings, research interviews): EU servers, verify retention conditions
High sensitivity (healthcare records, legal files, confidential business information): local transcription without sending files, or a human transcriber with an NDA

Block 4 — Reviewing the Delivered Transcript

Formal Compliance with the Order

Before diving into content review, check basic compliance: Does transcript length correspond to recording length? (Rough estimate: 1 minute of audio → approximately 100–150 words.) Is the output format correct? Are timestamps present if they were requested?
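The length check can be automated as a first-pass filter. This is a rough sketch built on the 100–150 words per minute estimate above; the function name and the 25% tolerance are assumptions, and a recording with long pauses or very fast speech can legitimately fall outside the range.

```python
def length_plausible(transcript: str, audio_minutes: float,
                     low_wpm: int = 100, high_wpm: int = 150,
                     tolerance: float = 0.25) -> tuple[bool, float]:
    """Return (is_plausible, words_per_minute) for a delivered transcript."""
    words = len(transcript.split())
    wpm = words / audio_minutes
    # Widen the rough range by the tolerance before flagging anything.
    ok = low_wpm * (1 - tolerance) <= wpm <= high_wpm * (1 + tolerance)
    return ok, wpm


# A 30-minute recording delivered as 3,600 words versus 900 words.
full = " ".join(["word"] * 3600)
short = " ".join(["word"] * 900)
print(length_plausible(full, 30))
print(length_plausible(short, 30))
```

A failed check does not prove the transcript is wrong; it only signals that a passage may be missing or duplicated and is worth checking against the recording.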

Six Targeted Review Checkpoints

Review systematically, focusing on the places where errors are most likely:

Proper names and institution names — spelling, capitalisation
Numbers, dates, and abbreviations — compare with recording
Punctuation in compound sentences — verify commas before subordinate clauses and question marks
Speaker transitions — verify correct attribution of statements
Technical terms from the supplied glossary — targeted text search
Homophonic substitutions in critical passages — play back passages with factual claims
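The targeted text search for glossary terms can be sketched as follows. It assumes the glossary is a plain word list and counts case-insensitive whole-word matches; the function name and the sample data are hypothetical.

```python
import re


def glossary_hits(transcript: str, glossary_terms: list[str]) -> dict[str, int]:
    """Count case-insensitive whole-word occurrences of each glossary term."""
    hits = {}
    for term in glossary_terms:
        # Lookarounds prevent matches inside longer words.
        pattern = re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)", re.IGNORECASE)
        hits[term] = len(pattern.findall(transcript))
    return hits


glossary = ["EUR-Lex", "GDPR", "Jane Doe"]
transcript = "Jane Doe explained how gdpr applies. The GDPR text is on Eurlex."
for term, count in glossary_hits(transcript, glossary).items():
    print(f"{term}: {count}")
```

Terms with zero hits are the ones to check by ear: either the term was never spoken, or it was transcribed in a different spelling, as with "Eurlex" in this sample.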

Feedback for Corrections

Specific feedback speeds up correction and makes it more accurate: "speaker name missing in passage 12:30–14:00" is far better than "the transcript is incomplete." For larger volumes, mark problematic passages in the document and supply them as comments.

Quick Overview — Print Before Ordering

Area	Check
------	-------
Recording Preparation
Audibility verified at three points	☐
Format compatible with recipient	☐
Files named consistently	☐
Long recordings split	☐
Requirements
Transcription type specified (verbatim / edited)	☐
Number and names of speakers provided	☐
Foreign-language passages addressed	☐
Output format and timestamps specified	☐
Terminology glossary provided	☐
Data Protection
Content sensitivity assessed	☐
Server location verified (if relevant)	☐
Consent of recorded persons confirmed	☐
Result Review
Transcript length matches recording	☐
Format is correct	☐
Six review checkpoints completed	☐

Sources:

EUR-Lex: Regulation (EU) 2016/679 (GDPR) — https://eur-lex.europa.eu/eli/reg/2016/679/oj