1. Purpose and Scope
These guidelines establish Prime Video’s quality standards and best practices for subtitling across all content types and supported languages. They provide partners, including subtitling vendors, translators, and quality control specialists, with a unified framework for delivering high-quality subtitles that preserve creative intent while meeting the expectations of global audiences.
Prime Video also maintains comprehensive language-specific Timed Text Localization and Technical Style Guides for over 30 priority territories, with detailed formatting conventions, capitalization rules, punctuation specifications, and technical requirements tailored to individual languages. These documents can be downloaded from the Languages, localization, and genres section of Slate. The present Subtitling Guidelines distill the universal principles embedded across all language-specific style guides and elaborate on the foundational tenets that underpin subtitling quality at Prime Video. Together, these resources establish a conceptual framework that helps localization teams understand both what Prime Video requires and why these requirements matter for creating subtitled content that transcends language barriers and delivers immersive viewing experiences to global audiences.
As a general rule, subtitling teams should consult Prime Video’s Timed Text Localization and Technical Style Guides for detailed instructions on the language they’re localizing into, while integrating into their work the broader principles outlined in these Subtitling Guidelines.
2. Core Quality Principles
Over the past decades, subtitling has matured from a purely technical practice into a discipline supported by rigorous academic research. Scholars such as Díaz Cintas and Remael (2007), Chaume (2004), Titford (1982), Georgakopoulou (2009), and Pedersen (2011) have developed comprehensive frameworks for understanding quality in subtitling, while broadcast and streaming industries have established professional standards that reflect both technological capabilities and audience expectations. Prime Video draws on this body of work to organize subtitling quality through two interconnected dimensions:
Translation and Adaptation focuses on creating subtitles that convey meaning accurately and authentically while respecting the constraints inherent to the medium. This includes maintaining conciseness and fidelity, ensuring readability through reduction and simplification, achieving consistency in terminology, avoiding overly literal translations, and adapting cultural references appropriately. Subtitlers must balance respect for the source material with creating natural dialogue in the target language that audiences can process within the constraints of on-screen text.
Technical Requirements establishes standards for subtitle presentation that allow subtitled content to integrate seamlessly into the viewing experience. This includes synchronization and timing specifications, reading speed parameters, subtitle duration, line treatment and positioning guidelines, and line break conventions. These technical elements ensure that subtitles are readable, accessible, and unobtrusive.
These two dimensions work together to ensure subtitles serve their fundamental purpose: enabling viewers to fully engage with content originally produced in another language.
3. Translation and Adaptation
Subtitling presents a unique translation challenge: transforming spoken dialogue into written text that viewers can read, process, and comprehend within strict spatial and temporal constraints. Unlike other translation forms, subtitlers work within a medium where screen space is limited, reading time is fixed by the pace of audiovisual content, and viewers must simultaneously process visual information, audio elements, and written text. This environment demands translation strategies that prioritize conciseness and readability while preserving semantic accuracy, narrative coherence, and emotional impact.
As such, a critical requirement is that subtitles must always be created and quality-checked while watching the video. This ensures that timing, context, and visual information can be properly assessed, along with other performance elements that inform translation decisions.
3.1 Conciseness and Fidelity
The subtitling environment imposes unique constraints on translation work. As noted by Titford, both spatial and temporal limitations fundamentally shape every linguistic decision subtitlers make. Unlike other forms of translation, subtitlers must balance the need for semantic accuracy against the physical reality of screen space and the temporal flow of audiovisual content. This constraint is not a limitation to overcome but rather a defining characteristic of the medium that requires strategic approaches to text reduction and adaptation.
The subtitler’s main challenge lies in determining what information is essential to preserve and what can be condensed or omitted without compromising plot comprehension or emotional impact. This requires continuous judgment about relevance and narrative value: understanding which elements drive the story forward and which serve supplementary purposes.
To accomplish this goal, subtitlers rely heavily on reduction strategies. Georgakopoulou’s research identifies two main types: partial reduction or condensation, which relies on more concise rendering of the original; and total reduction or deletion, achieved through omission of part of the source message. Both approaches serve the same purpose: ensuring that viewers can comfortably read and process subtitle content within the time available while retaining all information necessary to understand the plot.
For instance, spoken dialogue frequently employs words like “you know,” “well,” “okay,” and “like,” along with the habit of beginning sentences with “and”. When these words function as fillers rather than meaningful content, they should be omitted from subtitles. Similarly, hesitation markers like “um,” “uh,” “er,” and “ah” should be avoided in translation whenever possible, as they add reading burden without contributing semantic value. These omissions are a practical application of the conciseness principle: removing text that does not serve narrative understanding.
3.2 Consistency and Coherence
Consistency serves both comprehension and coherence. Subtitlers must maintain consistent treatment of terminology, character names, technical vocabulary, and recurring phrases throughout the content to help viewers build understanding cumulatively, rather than forcing them to reconcile different translations of the same concept.
Subtitles should also be structured to be “semantically and syntactically self-contained”, as noted by Díaz Cintas and Remael. This means that each subtitle unit should make sense on its own, with words intimately connected by logic, semantics, or grammar grouped together whenever possible. This ensures viewers can process each subtitle as a coherent unit during its brief on-screen appearance, instead of struggling to understand fragmented information across multiple subtitle events.
3.3 Cultural Adaptation
Scholars such as Pedersen have identified “extralinguistic cultural references” as particularly challenging areas for translation. These lexical items, which reference people, gastronomy, customs, places, and organizations, are deeply embedded in the source culture and may be completely unfamiliar to target viewers. Subtitlers must determine whether to preserve, adapt, or replace them based on their narrative function and the likelihood that they will be recognized and understood.
Most of these cultural references, including slogans, puns, popular quotes, and idiomatic expressions, require transcreation to ensure similar emotional responses are evoked in the target audience. Rather than translating literally, subtitlers should adapt these elements based on their function and intended impact in the source language to achieve the appropriate resonance in the target culture.
This creative freedom, however, comes with important legal boundaries. Subtitlers must never copy official translations: all direct quotes from poems, books, and other published works must be translated or created from scratch, except for citations from the Bible or other works not covered by copyright. This ensures that Prime Video’s subtitle content respects intellectual property rights while maintaining translation quality.
3.4 Readability and Natural Expression
In fictional content such as films and TV shows, dialogue attempts to sound natural by mimicking everyday conversation, but it has been carefully crafted by script writers: a phenomenon that Chaume calls “prefabricated orality”. Subtitlers must transform this scripted speech into written text that reads naturally and processes quickly.
When readability requires it, subtitlers should condense, substitute, or paraphrase long and complex phrases with shorter and simpler alternatives. However, simplification must never come at the cost of semantic completeness. All necessary meaning elements must be preserved, along with the original tone and register of the source dialogue.
As previously mentioned, literal word-for-word translation often fails to convey intended meaning, cultural nuances, or idiomatic expressions accurately. Subtitlers should use established or standard target-language expressions, particularly for idioms and metaphors, colloquialisms and slang, and proverbs and sayings. For puns and wordplay, an equivalent effect should be recreated in the target language rather than explaining the source-language joke.
3.5 Tone and Register
The tone of voice in subtitles must reflect the intent and performance of the source version. Subtitlers should carefully analyze the speaker’s delivery, context, and relationship dynamics to accurately convey the intended emotional register and social positioning in the target language.
This includes preserving formality levels and social register, i.e. the distinctions between formal, informal, and intimate speech patterns that signal power dynamics, social hierarchies, or interpersonal relationships. It also requires maintaining emotional intensity and affect, ensuring that the degree of emotion relayed through word choice, syntax, and rhetorical devices renders anger, joy, sarcasm, or tenderness with equivalent impact. Additionally, subtitlers must capture character-specific speech patterns and idiolects, including individual linguistic quirks, verbal habits, catchphrases, or distinctive vocabulary that define a character’s voice and contribute to characterization.
When dialects are used in the original, subtitlers should maintain them in translation when possible, finding equivalent regional or social dialects in the target language that convey similar cultural and social information. The goal is not phonetic transcription but rather the preservation of the sociolinguistic function the dialect serves in the narrative, whether signaling geographic origin, social class, education level, or group membership. Where direct dialect equivalents do not exist or would create confusion, subtitlers may employ other linguistic strategies, such as lexical choices, syntax variations, or register shifts, to approximate the social and cultural markers conveyed by the original dialect.
4. Technical Requirements
Technical requirements establish the standards for presentation that allow subtitled content to integrate seamlessly into the viewing experience. These specifications encompass synchronization and timing, reading speed parameters, duration, line treatment, and positioning conventions.
4.1 Synchronization
Subtitle timing directly affects how viewers perceive quality. Díaz Cintas and Remael argue that synchronization is possibly the main factor that influences viewers’ appreciation of subtitled content. When subtitles appear precisely as characters begin speaking and disappear when they finish, viewers can easily identify who is saying what. On the other hand, when subtitles appear too early or too late, or remain on screen long after the dialogue ends, the viewing experience can be heavily disrupted. Subtitles should be timed to the audio within 3 frames, to mirror the rhythm of the content and the delivery of the speakers. This synchronization process, known as spotting, cueing, timing or originating, may be carried out by translators or by experts familiar with timing software, techniques and specifications. The spotting must remain mindful of pauses, interruptions, and other prosodic features that characterize the original speech.
For dialogue that crosses shot changes, specific frame-accurate timing conventions apply. If dialogue starts within 3 frames of a shot change, the in-time should be adjusted to the shot change. If dialogue ends within 3 frames of a shot change, the out-time should be pulled back to 2 frames before the shot change. If there is one subtitle before and one subtitle after the shot change, the first subtitle should end 2 frames before the shot change, and the second subtitle should start on the shot change.
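The shot-change conventions above can be sketched as a small function. This is only an illustration, not Prime Video tooling: the function name, the use of integer frame numbers, and the reading of "within 3 frames" as a distance on either side of the cut are all assumptions.

```python
def snap_to_shot_changes(in_frame, out_frame, shot_changes, window=3, out_offset=2):
    """Adjust one subtitle's in/out frames to nearby shot changes.

    Assumptions (not stated in the guidelines): times are integer frame
    numbers, and "within 3 frames" means at most `window` frames before
    or after a cut.
    """
    new_in, new_out = in_frame, out_frame
    for cut in shot_changes:
        # Dialogue starting near a cut: start the subtitle on the cut itself.
        if abs(in_frame - cut) <= window:
            new_in = cut
        # Dialogue ending near a cut: end the subtitle 2 frames before the cut.
        if abs(out_frame - cut) <= window:
            new_out = cut - out_offset
    return new_in, new_out
```

With cuts at frames 100 and 152, a subtitle timed from frame 98 to frame 150 would start on the first cut (frame 100) and keep its out-time of 150, which already sits 2 frames before the second cut.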
Timing conventions may vary across languages. In Japanese, for instance, when dialogue ends within 3 frames of a shot change and there is no subtitle immediately after, the out-time should be pulled back to the shot change itself rather than 2 frames before. As always, subtitlers should consult Prime Video’s timed text style guides to ensure adherence to language-specific timing requirements.
4.2 Reading Speed
The second key constraint that impacts the quantity of text that may be included in a subtitle event is the assumed reading speed of the audience. Establishing appropriate reading speeds presents challenges because audiences vary widely in reading ability, and comprehension depends on vocabulary complexity, syntax, and on-screen action.
Since no single reading speed suits all viewers, the industry has developed multiple approaches. Broadcast television traditionally relies on what Díaz Cintas and Remael describe as the “six-second rule”: two full lines of approximately 35 characters each (70 characters total) should be readable within six seconds. This approach assumes that two frames of audiovisual content (at 24 frames per second) allow for one character of subtitle space, yielding a reading speed of 12 characters per second (cps) or approximately 130 words per minute (wpm).
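The arithmetic behind the six-second rule can be checked directly. The snippet below is only a worked calculation; it takes 24 fps as the frame rate, which is what two frames per character at 12 cps implies.

```python
def chars_per_second(characters, seconds):
    """Reading speed in characters per second (cps)."""
    return characters / seconds

FPS = 24  # frame rate implied by "two frames per character" at 12 cps

# Two full lines of about 35 characters each, shown for six seconds.
six_second_cps = chars_per_second(2 * 35, 6)

# One character for every two frames of content.
two_frames_per_char_cps = FPS / 2

print(round(six_second_cps, 2), two_frames_per_char_cps)
```

The first figure comes out slightly under 12 cps because 70 characters is itself an approximation of two full lines.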
While this formula remains common in broadcast, many streaming platforms have adopted faster reading speeds, arguing that contemporary viewers are more used to reading on-screen text than previous generations and can therefore process subtitles more quickly. Consequently, reading speeds of 15 cps (160 wpm) have become fairly standard, with some platforms allowing 17 cps (180 wpm) or even higher reading speeds for certain content types.
Prime Video establishes different standard reading speeds based on content type: adult programs should aim for a maximum of 17 characters per second, while children’s programs should aim for a maximum of 13 characters per second. When the maximum reading speed of 17 characters per second cannot be met, certain adjustments may be made. If the text cannot be condensed and additional time is needed due to reading speed requirements, the out-time can be extended by up to half a second (12 frames at 24 fps) past the end of the audio, as long as it does not cause the subtitle event to cross a shot change. If extending the out-time is not possible, the subtitle should be condensed without altering or losing the intended meaning of the source. Alternatively, subtitle events can be merged or split to help with reading speed. As a last resort, the reading speed may be increased up to 22 characters per second.
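As a rough sketch of the first adjustment described above, the function below computes how far an out-time would need to move to meet the reading speed limit, and reports when the half-second extension is not enough. The function names, the character-counting rule, and the treatment of the shot change as the latest permissible out-frame are assumptions made for illustration.

```python
import math

def reading_speed_cps(text, in_frame, out_frame, fps=24):
    """Characters per second for one subtitle event."""
    return len(text) / ((out_frame - in_frame) / fps)

def extended_out_frame(text, in_frame, out_frame, next_shot_change=None,
                       max_cps=17, fps=24, max_extension_s=0.5):
    """Return an out-frame that meets max_cps, or None if extending is not enough.

    Assumptions: every character (spaces included) counts towards reading
    speed, `text` is passed without line-break characters, and the event
    may end on, but not after, the next shot change.
    """
    frames_needed = math.ceil(len(text) * fps / max_cps)
    required_out = in_frame + frames_needed
    if required_out <= out_frame:
        return out_frame  # already within the limit
    latest_out = out_frame + int(max_extension_s * fps)
    if next_shot_change is not None:
        latest_out = min(latest_out, next_shot_change)
    return required_out if required_out <= latest_out else None
```

A return value of `None` signals that the remaining options apply: condensing the text, merging or splitting events, or, as a last resort, raising the limit to 22 cps.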
Reading speed standards vary considerably across languages, reflecting differences in script density, character complexity, and audience reading habits. For example, Japanese uses 4 cps for both adult and children’s programs, whereas Indian languages use 22 cps for adults and 18 cps for children. Once again, Prime Video’s language-specific style guides should always be referenced for the reading speed requirements applicable to each language.
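One way to keep such per-language limits out of the checking logic is a simple lookup table. The sketch below contains only the figures quoted in this section; the keys are hypothetical labels, and the fallback to the general standard is a convenience rather than a documented rule, since actual values should come from the language-specific style guides.

```python
# Maximum reading speeds in cps, keyed by (language group, audience).
# Only the figures quoted in this section are included; the keys are
# hypothetical labels chosen for this example.
MAX_CPS = {
    ("default", "adult"): 17,
    ("default", "children"): 13,
    ("japanese", "adult"): 4,
    ("japanese", "children"): 4,
    ("indian", "adult"): 22,
    ("indian", "children"): 18,
}

def max_cps(language_group, audience):
    """Look up the limit, falling back to the general standard."""
    return MAX_CPS.get((language_group, audience), MAX_CPS[("default", audience)])
```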
4.3 Duration
Subtitle duration directly affects readability and viewer comfort. Although the time a subtitle remains on screen depends ultimately on the speed at which dialogue is delivered, establishing minimum and maximum duration limits ensures consistent viewing quality.
To avoid flashing subtitles on screen and ensure viewers have sufficient time to read content, the minimum duration should be approximately five-sixths of a second per subtitle event. This translates to 20 frames for 24 fps content, 21 frames for 25 fps, and 25 frames for 30 fps. On the other hand, subtitles should not remain on screen longer than needed, as otherwise there is a risk that viewers will start re-reading the text. To avoid this, the maximum duration should be 7 seconds per subtitle event. Therefore, when spotting content, periods longer than seven seconds should be split into smaller units.
Between continuous subtitle events, a minimum gap of 2 frames should be maintained. This brief interval ensures that viewers can distinguish between separate subtitle units and prevents the perception of flashing text.
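The duration and gap limits above lend themselves to an automated check. The following is a minimal sketch, assuming events are given as (in-frame, out-frame) pairs with integer frame rates, and that five-sixths of a second is rounded up to a whole frame, which reproduces the 20, 21, and 25 frame figures quoted above. The function names and issue labels are hypothetical.

```python
def min_duration_frames(fps):
    """Five-sixths of a second, rounded up to a whole frame (integer fps assumed)."""
    return -(-5 * fps // 6)  # integer ceiling of 5 * fps / 6

def check_timing(events, fps=24, max_seconds=7, min_gap=2):
    """Return (event index, issue) pairs for a list of (in_frame, out_frame) events."""
    issues = []
    for i, (start, end) in enumerate(events):
        length = end - start
        if length < min_duration_frames(fps):
            issues.append((i, "too short"))
        if length > max_seconds * fps:
            issues.append((i, "too long"))
        # Gap is measured from the previous event's out-frame.
        if i > 0 and start - events[i - 1][1] < min_gap:
            issues.append((i, "gap too small"))
    return issues
```

Language-specific exceptions, such as the shorter minimum allowed for Japanese in fast-paced dialogue, would need different parameters.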
In fast-paced dialogue where multiple speakers interrupt each other without breaks, Japanese subtitles allow duration to be less than the standard minimum. However, it should never be less than 10 frames to avoid flashing. When this constraint cannot be met, linguists must prioritize which dialogue to subtitle, as Japanese subtitles do not support dual speakers.
4.4 Line Treatment and Positioning
The principle of semantic and syntactic self-containment discussed earlier translates into specific technical requirements for subtitle presentation. Both the division of dialogue across multiple subtitle events and the line breaks within individual subtitles should ideally match a logical and grammatical break in the dialogue. This means that words that are closely connected by logic, semantics, or grammar should be clustered together, avoiding splits that separate articles from nouns, adjectives from the words they modify, or verbs from their subjects. This helps viewers process subtitles coherently and prevents information from being fragmented across multiple subtitle events.
Subtitles should also be limited to a maximum of 2 lines per event, with no more than 42 characters per line, and they should always be kept to a single line unless a break is needed for clarification or style.
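The line-break and line-length rules above can be approximated with a simple heuristic. The sketch below keeps text on one line when it fits within 42 characters; otherwise it tries every word boundary, discards breaks that leave a line over the limit, and penalizes breaks that strand an article or short preposition at the end of the first line. The word list, the penalty value, and the preference for balanced lines are assumptions for English, not rules from the style guides, and real line breaking still requires linguistic judgment.

```python
# Hypothetical, English-only list of words that should stay with what follows.
NO_BREAK_AFTER = {"a", "an", "the", "of", "to", "at", "in", "on"}

def break_subtitle(text, max_chars=42):
    """Split text into at most two lines of up to max_chars characters."""
    if len(text) <= max_chars:
        return [text]  # keep to a single line whenever it fits
    words = text.split()
    best = None
    for i in range(1, len(words)):
        top = " ".join(words[:i])
        bottom = " ".join(words[i:])
        if len(top) > max_chars or len(bottom) > max_chars:
            continue
        # Prefer lines of similar length (an assumption, not a stated rule).
        penalty = abs(len(top) - len(bottom))
        # Strongly discourage separating an article or preposition from its noun.
        if words[i - 1].lower() in NO_BREAK_AFTER:
            penalty += 100
        if best is None or penalty < best[0]:
            best = (penalty, [top, bottom])
    if best is None:
        raise ValueError("text does not fit on two lines; condense or split the event")
    return best[1]
```

For the 59-character sentence "I told you to meet me at the station before the train left.", this heuristic breaks after "station" rather than after "the", keeping the noun phrase together.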
In terms of positioning, subtitles should be center-justified and placed at the bottom of the screen. If there is on-screen text (such as credits or forced narratives) in the lower third of the screen, subtitles should be moved to the top. In cases where overlapping the on-screen text is impossible to avoid, the option that causes the least disruption to the viewer should be chosen. When both on-screen text and the speaker’s face cannot be avoided, the preference is to avoid covering the on-screen text.
Line treatment conventions vary by language, including specific rules for line division, compound nouns, articles, adjectives, and preferred line length ratios. For instance, Japanese uses 13 characters per line for horizontal subtitles and 11 characters per line for vertical subtitles, whereas Thai uses 37 characters per line. Subtitlers should refer to Prime Video’s language-specific style guides for additional details.
5. Specialized Content
While the Translation and Adaptation and Technical Requirements sections establish the foundational principles that apply to all subtitle work, certain specialized content presents unique challenges that require nuanced treatment drawing on both dimensions. This section provides guidance for handling these specific elements to ensure quality standards are maintained consistently across all subtitle work.
5.1 Titles
When the main title of a TV series or movie appears on screen, subtitling requirements depend on whether the content is a new release or a catalog title. For new releases, the main title should not be subtitled unless otherwise instructed.
For catalog or library titles, subtitling requirements vary by language. When subtitling is required, the main title should be omitted if the source version fully matches the approved translation provided by Prime Video for the target language. When translating the main title from scratch, any localized versions of existing Intellectual Property (IP) should be retained. Additional direction on the localization treatment is determined by campaign strategy. Main title translation approaches vary considerably across languages, so Prime Video’s timed text style guides should always be consulted for language-specific instructions.
When season titles appear on screen, they should only be translated and subtitled if they include numbering (e.g., “Season 3”) or differ from the series main title. Episode titles should always be translated and subtitled when they are featured on-screen, and can be localized directly without restriction, but consistency should always be maintained between the metadata and subtitle assets.
5.2 Forced Narrative and On-Screen Text
Subtitles should be provided for all plot-pertinent on-screen text, including narrative text (which is part of principal photography) and burn-in text (which has been added in post-production). If the camera focuses on a set element with intent, this is an indication that it is pertinent to the plot, and a narrative subtitle should be provided. On the other hand, if the forced narrative is identical to the on-screen text, covered in dialogue, or featured repeatedly throughout the content, it should be omitted to avoid redundancy. If the on-screen text and its translation differ only in accent marks (e.g., “Berlin” vs. “Berlín”), the linguist may decide whether to include the subtitle or treat it as redundant based on whether the accent difference is significant enough to warrant display.
As for positioning, forced narratives should be placed so that they do not cover the source-language on-screen text or the speaker’s face. If both are unavoidable, the preference is to avoid covering the on-screen text.
If forced narratives for on-screen text interrupt dialogue, specific formatting conventions apply to maintain continuity and readability. The standard treatment is to use an ellipsis at the end of the preceding subtitle event and at the beginning of the following subtitle event to indicate the interruption. This helps viewers understand that the dialogue continues after the forced narrative appears. Additionally, forced narratives should never be combined with subtitled dialogue in the same event: they must always appear separately to maintain clarity and readability.
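The standard ellipsis treatment can be illustrated with a short sketch. It assumes events are (kind, text) pairs, that three periods are used as the ellipsis with no added space, and that a sentence counts as interrupted when the preceding dialogue does not end in terminal punctuation. All of these are simplifications chosen for the example, and the names are hypothetical.

```python
TERMINAL = (".", "!", "?")

def mark_fn_interruptions(events, ellipsis="..."):
    """Add ellipses around a forced narrative that interrupts a sentence.

    `events` is a list of (kind, text) tuples, where kind is "dialogue" or "fn".
    Forced narratives always stay in their own event.
    """
    result = list(events)
    for i in range(1, len(result) - 1):
        prev_kind, prev_text = result[i - 1]
        kind, _ = result[i]
        next_kind, next_text = result[i + 1]
        interrupted = (
            kind == "fn"
            and prev_kind == "dialogue"
            and next_kind == "dialogue"
            and not prev_text.endswith(TERMINAL)  # sentence is still open
        )
        if interrupted:
            result[i - 1] = (prev_kind, prev_text + ellipsis)
            result[i + 1] = (next_kind, ellipsis + next_text)
    return result
```

For example, "I was going to tell you" followed by a forced narrative and then "that I'm leaving." would become "I was going to tell you..." and "...that I'm leaving.", with the forced narrative left untouched between them.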
Treatment of ellipsis when forced narrative interrupts dialogue varies by language. For example, Czech does not use ellipsis in this context, and Japanese uses vertical positioning with asynchronous timing instead of ellipsis to avoid disrupting the dialogue subtitle. Subtitlers should refer to Prime Video’s language-specific style guides for more details.
5.3 Names
The treatment of proper names varies significantly by language. Some languages transliterate proper names into their native script, while others preserve the original spelling, including diacritical marks and accents.
When proper names need to be localized for creative reasons, translators should ensure all subtitle assets are consistent with the dubbed audio of their corresponding languages, if applicable. Nicknames should only be translated if they convey a specific meaning or if they are well-known and have recognized equivalents in the target language. For historical or mythical character names, subtitlers should always resort to established or well-known translations.
For brand names, the official localized version should be used. If unavailable, the brand name should be left in English or transliterated in applicable languages. However, if a brand is unfamiliar in the target territory, the product should be described using a generic term instead of the brand name to ensure viewers understand the meaning. For fictional brands, localization should be achieved by translating or transliterating the content, depending on creative intent.
5.4 Foreign Dialogue
For any dialogue that is spoken in a language different from the original audio, translation should be provided if the content is meant to be understood by the audience. In these cases, the foreign dialogue will typically be subtitled in the original version as well.
Foreign words, expressions, or phrases should be translated, or transliterated if there is no established translation. Spelling and grammar should always be verified, and foreign words should be italicized unless they have become part of normal usage in the target language.
5.5 Profanity
Profanity and taboo language require careful handling in subtitles. Díaz Cintas and Remael note that emotionally charged language is particularly sensitive when migrating from oral to written form, as its impact is believed to be stronger when written than verbalized. Despite this perception, subtitlers must communicate the equivalent intensity appropriate to the target culture and relay the essence of the source content without censoring or toning down, unless the audio itself is muted or bleeped. Tolerance levels for profanity vary significantly across cultures, so subtitlers should be mindful of the culture they are localizing into and provide a viewing experience that is comparable to the original. Depending on the culture, it may be appropriate to adjust profanity or terminology that may be inflammatory in a particular region.
When the audio is censored or bleeped, subtitle treatment varies by language, with methods including representing censored expletives with asterisks, ellipses, or special characters. For example, Finnish and Spanish use the first letter followed by asterisks, Thai replaces every letter with asterisks, Italian uses first and last letters with asterisks in between, and Japanese uses special characters. Prime Video’s language-specific style guides should be checked for additional details.
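The asterisk-based patterns mentioned above can be sketched as follows. The style names are hypothetical labels, and the assumption that one asterisk replaces each hidden character is not confirmed by the text. Counting characters with `len` is also a simplification for scripts such as Thai, where combining marks are separate code points. The Japanese treatment is omitted because the special characters are not specified here.

```python
def mask_expletive(word, style):
    """Mask a bleeped word using one of the patterns described above.

    Assumption: one asterisk replaces each hidden character; the style
    names are hypothetical labels, not terms from the style guides.
    """
    if style == "first_letter":      # e.g. Finnish, Spanish
        return word[:1] + "*" * (len(word) - 1)
    if style == "all":               # e.g. Thai
        return "*" * len(word)
    if style == "first_and_last":    # e.g. Italian
        if len(word) < 3:
            return word  # nothing to hide between first and last letter
        return word[0] + "*" * (len(word) - 2) + word[-1]
    raise ValueError(f"unknown style: {style}")
```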
5.6 Songs and Music
All songs and music instances require rights clearance before translation. Content providers or licensors are responsible for securing this clearance prior to localization and sharing it with the relevant stakeholders. For subtitling, songs should only be included if they are deemed plot pertinent. However, this does not merely mean thematically relevant: the lyrics must convey information that is absolutely necessary for viewers’ understanding of the plot.
In the case of opening and closing theme songs, these should only be subtitled when clearly plot-pertinent, such as in children’s content where the lyrics tell a story. Generally, opening and closing theme songs should not be subtitled in content targeted at adult audiences, except for Subtitles for the Deaf and Hard of Hearing (SDH). If a plot-pertinent song contains lyrics that have been altered or parodied for comedic effect, they should be localized to preserve this effect in the target language.
When subtitling song lyrics, specific formatting conventions apply to ensure consistency and readability. Lyrics should be italicized to distinguish them from regular dialogue. Song titles should be enclosed with quotation marks, while album titles should be italicized. Capitalization and punctuation for lyrics should follow regular rules, with each lyric line starting with an uppercase letter. Only question marks, exclamation marks, or ellipses should be used at the end of lyric lines, though commas may be used within lines when necessary.
Formatting requirements for song lyrics vary significantly by language. For example, many Romance languages enclose lyrics with music note symbols at the beginning and end of each subtitle, separated from the text by a space. Japanese uses no punctuation for songs and encloses song titles with double-byte curly quotation marks. Chinese languages do not italicize lyrics and enclose both song titles and album titles with guillemets. Arabic encloses song lyrics, song titles, and album titles all with double straight quotation marks. Swedish italicizes both song lyrics and song titles. Each language has specific conventions for quotation mark styles, capitalization rules, and punctuation treatment. As always, subtitlers should consult Prime Video’s language-specific style guides to ensure compliance with relevant requirements.
5.7 Translator Credit
The translator credit should be included when available as the last event of the subtitle asset, with a duration of approximately 2-3 seconds and using the language-specific format for this type of credit. If more than one translator has worked on the same asset, all of them can be credited. Company and creative supervisor credits may also be included if applicable, but company credits should never replace translator credits. For SDH, credits should not be included for tasks that only involve the transcription of original or dubbed audio.
6. Conclusion
These guidelines represent Prime Video’s commitment to subtitling excellence across all dimensions of the localization workflow. By establishing a unified framework grounded in both academic research and industry best practices, they provide partners with the conceptual foundation necessary to deliver subtitles that honor creative intent while serving diverse global audiences.
The integration of universal principles with language-specific technical specifications also reflects the dual nature of subtitling work: a discipline that demands both creative judgment and technical precision. As subtitlers navigate the constraints inherent to the medium (balancing reduction with semantic completeness, adapting cultural references while preserving narrative coherence, and respecting technical requirements while ensuring readability), these guidelines should help them create a viewing experience that feels natural and unobtrusive.
As the localization landscape continues to evolve, these principles will guide our ongoing commitment to quality and innovation, ensuring that every subtitled production honors the creative vision of the original while making content accessible across linguistic and cultural barriers.
7. References
Chaume, F. (2004). Cine y traducción. Madrid: Cátedra.
Díaz Cintas, J. and Remael, A. (2007). Audiovisual Translation: Subtitling. Manchester: St Jerome.
Georgakopoulou, P. (2009). “Subtitling for the DVD industry”, in Jorge Díaz Cintas and Gunilla Anderman (eds) Audiovisual Translation: Language Transfer on Screen. Basingstoke: Palgrave Macmillan, 21-35.
Pedersen, J. (2011). Subtitling Norms for Television: An Exploration Focussing on Extralinguistic Cultural References. Amsterdam and Philadelphia: John Benjamins.
Titford, C. (1982). “Subtitling-Constrained Translation”. Lebende Sprachen 27 (3): 113-116.