Synopsis

  1. Use Unicode encoding whenever possible.
  2. Use the lang attribute to mark words or passages of text in another language. Screen readers support this for major languages only.
  3. Consider supplementing language changes with a textual indication (visible or hidden) that a foreign-language word or passage is coming.

About Language Tagging

The LANG tag (i.e. the lang="" attribute) is designed to signal screen readers' pronunciation engines to switch to another language. For this reason and others, tagging Web text as being in a particular language is required in WCAG 2.0.

WCAG 2.0 Success Criterion 3.1.1: "The default human language of each Web page can be programmatically determined."

It is even more important to use language tagging to signal a switch in languages within a page.

WCAG 2.0 Success Criterion 3.1.2: "The human language of each passage or phrase in the content can be programmatically determined except for proper names, technical terms, words of indeterminate language, and words or phrases that have become part of the vernacular of the immediately surrounding text."
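
As a rough illustration of how the exceptions play out, the sketch below tags a Spanish phrase but leaves a proper name and a borrowed word untagged. The sentence is invented for this example, and the page default is assumed to be English.

```html
<p>
  <!-- Foreign phrase: tagged so a screen reader can switch pronunciation -->
  Our guide greeted us with <span lang="es">buenos d&iacute;as</span>
  <!-- "rendezvous" (part of the English vernacular) and "Teatro Col&oacute;n"
       (a proper name) fall under the exceptions, so no tag is required -->
  before our rendezvous outside the Teatro Col&oacute;n.
</p>
```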

Declaring Page Language

The official W3C recommendation is to declare the primary language of each Web page with a lang="" attribute in the <html> tag. The codes are ISO 639 language codes, some of which are listed further down on this page.

NOTE: You must declare the character encoding in addition to the language. The language and its script are independent.
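
A minimal sketch of a page that declares both, assuming an HTML5 doctype and UTF-8 as the encoding (in line with the Unicode advice in the Synopsis). The title and paragraph text are placeholders.

```html
<!DOCTYPE html>
<!-- lang declares the human language of the page -->
<html lang="en-US">
  <head>
    <!-- charset declares how the characters are encoded, independent of language -->
    <meta charset="utf-8">
    <title>Sample Page</title>
  </head>
  <body>
    <p>This page is declared as U.S. English and encoded as UTF-8.</p>
  </body>
</html>
```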

Declaring a U.S. English Page (Penn State)

<html lang="en-US"> ... </html>

 

Declaring a British English Page

<html lang="en-GB"> ... </html>

Screen readers that support this code may switch to a British accent.

 

Declaring a French Page

<html lang="fr"> ... </html>

Screen readers that support this code may switch to French pronunciation.

Switching Languages

If you switch languages within a page, you can add the lang attribute to other tags such as P, TD, SPAN and DIV. For example:

Test Text with Lang Tags

This sentence is in default American English.

This sentence will be read with a British accent.

Esta frase es en español. (Spanish)

Cette phrase est en français. (French)

Mae’r frawddeg hon yng Nghymraeg. (Welsh – Not Supported)

View the Code

<p> This sentence is in default American English. </p>
<p lang="en-GB"> This sentence will be read with a British accent. </p>
<p lang="es"> Esta frase es en espa&ntilde;ol. </p> <!-- Spanish -->
<p lang="fr"> Cette phrase est en fran&ccedil;ais. </p> <!-- French -->
<p lang="cy"> Mae’r frawddeg hon yng Nghymraeg. </p> <!-- Welsh, not always supported -->

Common Language Codes

Two Letter vs. Three Letter

The first set of language codes (ISO-639) consisted of two-letter codes, but it did not cover every language. As a result, sets of three-letter codes (ISO-639-2/ISO-639-3) were created to cover more languages.

For any language, the standards call for the two-letter code if one exists. Use a three-letter code only if no two-letter code is available. See the ISO 639 Code Tables for a complete list of language codes, including the original ISO-639 codes and later variants.
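
A short sketch of that rule, using codes listed on this page. The sample passages are only illustrative.

```html
<!-- French has a two-letter code, so use "fr" rather than the three-letter "fra" -->
<p lang="fr">Cette phrase est en fran&ccedil;ais.</p>

<!-- Old English has no two-letter code, so the three-letter "ang" is used -->
<p lang="ang">Hw&aelig;t! We Gardena in geardagum</p>
```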

Western European Languages

These codes are supported in many screen readers, including JAWS.

Language | Code | Variants
English | en | American English: en-US; British English: en-GB
Spanish | es |
French | fr |
Italian | it | No major variants. See Italian Page for dialect codes.
Portuguese | pt | Brazilian Portuguese: pt-BR; European Portuguese: pt-PT. See Portuguese page.

Non-Western European Languages

Language | Code | Variants
Arabic | ar | See Arabic information
Chinese | zh | Simplified Chinese: zh-CN; Traditional Chinese: zh-TW; Hong Kong: zh-HK; other Chinese variants
Hebrew | he | No major variants
Hindi | hi | No major variants
Japanese | ja | No major variants
Korean | ko | No major variants
Swahili | sw | No major variants
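
As a small sketch of how a variant code is applied, the region is appended to the base language code with a hyphen. The sample strings below simply read "Simplified Chinese" and "Traditional Chinese" and are chosen only for illustration.

```html
<!-- Base code "zh" plus region "CN" for Simplified Chinese -->
<p lang="zh-CN">简体中文</p>

<!-- Base code "zh" plus region "TW" for Traditional Chinese -->
<p lang="zh-TW">繁體中文</p>
```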

Ancient Languages

Language | Code | Variants
Ancient Greek | grc | Modern Greek: el
Latin | la | No major variants
Old English | ang | No major variants
Middle English | enm | No major variants

Supplemental Signals of Non-English Content

In addition to using the lang attribute, you can also include an indication in the text so that users of older screen readers can manually switch languages. This can be done by spelling out the beginning/end of a passage in the text (preferably in an H1 or H2 tag, or as part of a set of links) or in the alt attribute of an invisible graphic.

 

Spelling Out Language Name in Text

Translations of U.N. Universal Declaration of Human Rights

Spanish | French … (The menu provides a quick list of where the non-English passages are.) It is still recommended that the lang attribute also be used.

Spanish Article 1 (Spelling it Out)

Artículo 1
Todos los seres humanos nacen libres e iguales en dignidad y derechos y, dotados como están de razón y conciencia, deben comportarse fraternalmente los unos con los otros.

French Article 1

Article premier
Tous les êtres humains naissent libres et égaux en dignité et en droits. Ils sont doués de raison et de conscience et doivent agir les uns envers les autres dans un esprit de fraternité.
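
One way the spelled-out example above might be marked up is sketched here. The id values and the choice of H2 headings are assumptions made for illustration, and only the Spanish passage is shown.

```html
<!-- Menu: link text names each language in English; the id targets are hypothetical -->
<p><a href="#spanish">Spanish</a> | <a href="#french">French</a></p>

<!-- The heading spells out the language for readers whose software ignores lang -->
<h2 id="spanish">Spanish Article 1</h2>

<!-- The lang attribute is still used so that supporting screen readers switch automatically -->
<div lang="es">
  <p>Art&iacute;culo 1</p>
  <p>Todos los seres humanos nacen libres e iguales en dignidad y derechos ...</p>
</div>
```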

 

With Invisible Graphics

An older technique is to add an invisible graphic and use its ALT text to signal the switch to another language. It is still recommended that the lang attribute also be used.

View the Code

<img src="transpixel.gif" alt="Begin Spanish">
<div lang="es"> ... </div>
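
Since the text above suggests marking both the beginning and the end of a passage, a fuller sketch might pair two graphics around it. The closing "End Spanish" graphic is an assumed counterpart to the opening one, not something shown in the original example.

```html
<!-- Opening marker: the alt text announces the switch to Spanish -->
<img src="transpixel.gif" alt="Begin Spanish">
<div lang="es">
  <p>Art&iacute;culo 1</p>
</div>
<!-- Closing marker (assumed): announces the return to the page's default language -->
<img src="transpixel.gif" alt="End Spanish">
```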
