Captioning Standards

On February 20, 2014, the FCC released a new ruling stipulating that post-production captioning be 99% accurate so that viewers who rely on captions can “have a comparable viewing experience to those who can hear the audio portion of the programming.” To that end, the FCC addresses four components of captioning that lead to that 99% goal: accuracy, synchronicity, completeness, and placement.

To be considered accurate, captions must:

Match the spoken words in the dialogue or song lyrics, in their original language (English or Spanish), to the fullest extent possible.

Contain all words in the order spoken, without paraphrasing or substitutions.

Contain proper spelling, punctuation, capitalization, tense and proper representation of numbers.

Convey the manner and tone of the speaker’s voice.

Include utterances (e.g., “um”) and false starts.

Mirror the dialogue’s intentional use of slang and errors.

Provide any nonverbal information that is not observable, such as the existence of music (even without lyrics), sound effects, and audience reaction.

When there is more than one speaker, indicate who is speaking at any given time.

When a speaker is not on the screen, identify that individual so that viewers can tell who is speaking.

Be legible, with appropriate spacing between words to allow for readability.

To have appropriate synchronicity, captions must:

Coincide with their corresponding spoken words and sounds to the greatest extent possible.

Begin to appear at the time that the corresponding speech or sounds begin and end when the speech or sounds end.

Be displayed on the screen at a speed that can be read by viewers.

Be reformatted if necessary whenever rebroadcast programming is edited.

To be considered complete, captions must:

Run from the beginning to the end of the program, to the fullest extent possible.

For proper placement, captions must:

Not block other important visual content on the screen, including character faces, featured text, graphics, or other information that is essential to understanding a program’s content.

Have a font size appropriate for legibility.

Be adequately positioned so as to not run off the edge of the video screen.

In addition, creators of video content suggest some additional guidelines* to make captions more accessible:

Font. Sans-serif fonts are easier to read than serif fonts. Some good options are:

  • Arial
  • Calibri
  • Helvetica
  • Tahoma
  • Verdana

Color. White text with a black outline can be seen against any background color.

Characters Per Line. Each caption frame should hold one to three lines of text at a time, and each line should not exceed 32 characters.


  • The minimum duration of a caption or subtitle frame is 1 second.
  • Each caption frame should be replaced by another caption frame, unless there is a long period of silence. So if someone stops speaking and 15 seconds of silence follow, you don’t want the caption frame to hang on the screen during the silence. It is unnecessary and makes it seem as though the speaker was talking longer than he or she was.
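
The line-length and duration guidelines above can be checked automatically. The following is a minimal Python sketch, not a definitive tool: the names CaptionFrame, split_into_frames, and check_frame are hypothetical and introduced only for illustration, and the limits (32 characters per line, one to three lines per frame, 1 second minimum duration) are taken directly from the guidelines above. It does not check for captions that hang during silence, because the guidelines do not define how long a "long period of silence" is.

```python
import textwrap
from dataclasses import dataclass
from typing import List

# Limits taken from the guidelines above.
MAX_CHARS_PER_LINE = 32
MAX_LINES_PER_FRAME = 3
MIN_DURATION_SECONDS = 1.0


@dataclass
class CaptionFrame:
    """Hypothetical container for one caption frame (times in seconds)."""
    start: float
    end: float
    lines: List[str]


def split_into_frames(text: str) -> List[List[str]]:
    """Wrap text to 32-character lines and group them three lines per frame."""
    lines = textwrap.wrap(text, width=MAX_CHARS_PER_LINE)
    return [
        lines[i:i + MAX_LINES_PER_FRAME]
        for i in range(0, len(lines), MAX_LINES_PER_FRAME)
    ]


def check_frame(frame: CaptionFrame) -> List[str]:
    """Return a list of guideline problems found in one frame (empty if none)."""
    problems = []
    if not 1 <= len(frame.lines) <= MAX_LINES_PER_FRAME:
        problems.append(f"frame has {len(frame.lines)} lines; expected 1 to 3")
    for line in frame.lines:
        if len(line) > MAX_CHARS_PER_LINE:
            problems.append(f"line exceeds 32 characters: {line!r}")
    if frame.end - frame.start < MIN_DURATION_SECONDS:
        problems.append("frame is displayed for less than 1 second")
    return problems


if __name__ == "__main__":
    sample = ("So if someone stops speaking and 15 seconds of silence follow, "
              "you don't want the caption frame to hang on the screen.")
    for frame_lines in split_into_frames(sample):
        print(frame_lines)
    print(check_frame(CaptionFrame(start=0.0, end=0.5, lines=["Too short."])))
```

Wrapping on word boundaries keeps each line readable. How the resulting frames are timed against the audio is left to the captioning tool.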

More information on Captioning

What is the Best Color for Text Captions in Photos and Videos.

 The Essential Higher Ed Closed Captioning Guide.

Information on Web Accessibility

Guidelines and other standards for web accessibility from W3C:

Ten Tips for Creating Accessible Course Content from 3MediaPlay:

Creating Accessible Videos from the University of Washington:

Accessibility and Usability on the Web from MIT:

National Technical Institute for the Deaf
52 Lomb Memorial Drive
Rochester, NY 14623

This material is based on work supported by the National Science Foundation under Award Numbers 1104229, 1501756, and 1902474.