language
language may refer to human (or natural) languages or computer (often programming) languages.
If you are looking for wiki pages in other languages, see:
- other-languages and the category Pages by language
Why mark up
Before you mark up your page with the appropriate language tags, consider why you are marking it up. Don't mark up just because you can.
Filtering
When you mark up the h-entry of a post with a lang attribute, you enable users of a reader to filter out languages they don't speak. This makes it possible to follow someone only in the languages you understand.
Pelle Wessman on chat: "on Twitter I often don't follow people that tweet too much in a language I don't understand and I hold back on tweeting in swedish because I know it might likewise annoy others"
Twitter does filter on language in Search, but not on the timeline.
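As a rough illustration of the filtering use case, here is a minimal Python sketch of how a reader might keep only posts in languages its user understands. The entry dictionaries, the `filter_by_language` name and the `keep_untagged` option are assumptions made up for this example; they are not taken from any existing reader.

```python
def primary_subtag(lang):
    """Return the primary language subtag, e.g. 'en' for 'en-GB'."""
    return lang.strip().lower().split("-")[0]


def filter_by_language(entries, wanted, keep_untagged=True):
    """Keep entries whose lang matches one of the wanted languages.

    `entries` is a list of dicts with an optional "lang" key (a hypothetical
    shape for parsed h-entries). Entries without a language are kept by
    default, since nothing is known about them.
    """
    wanted_primary = {primary_subtag(w) for w in wanted}
    kept = []
    for entry in entries:
        lang = entry.get("lang")
        if not lang:
            if keep_untagged:
                kept.append(entry)
            continue
        if primary_subtag(lang) in wanted_primary:
            kept.append(entry)
    return kept


# Hypothetical timeline, as a reader might have it after parsing a feed.
timeline = [
    {"name": "Hello world", "lang": "en-GB"},
    {"name": "Hallo wereld", "lang": "nl"},
    {"name": "Hej världen", "lang": "sv"},
    {"name": "Untitled note", "lang": None},
]

readable = filter_by_language(timeline, ["en", "nl"])
print([entry["name"] for entry in readable])
```

Matching on the primary subtag only means that a user who asks for en also sees en-GB and en-US posts. A stricter reader could compare full tags instead.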
Screen readers / text to speech
When someone uses a screen reader, the marked up language can be used to select the right pronunciation rules.
- This post by Sebastiaan Andeweg is a Dutch transcription of English and would thus be best marked up as 'nl', to guide screen readers toward the right pronunciation.
Martijn van der Ven used to mark up his name with lang="nl" to guide screen readers toward the right pronunciation of his name.
Translations
Translation software can translate certain posts or texts if it knows the language.
- Most translation software can probably detect the language too?
How to mark up
You can specify the language of an HTML document, or a part of it, by using the lang="??" attribute, where ?? is the language code for your language. For English, this is en, en-GB or en-US.
HTML also allows you to mark the language of the target of a hyperlink using the hreflang attribute.
HTML 5 has also introduced a translate attribute that lets you specify that a piece of text should not be translated automatically.
- There are thoughts on how to parse lang in Microformats.
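To make the three attributes concrete, the following Python sketch embeds a small sample page and reads the attributes back with the standard library's html.parser. The sample markup and the `LangCollector` class are made up for illustration. The way lang is inherited from the nearest ancestor reflects how HTML defines an element's language, but this is not a Microformats parsing rule.

```python
from html.parser import HTMLParser

# Hypothetical sample page; the content is invented for this example.
SAMPLE = """
<html lang="en">
  <body>
    <article class="h-entry">
      <p class="p-name">Good morning</p>
      <p lang="nl">Goedemorgen</p>
      <a href="/nl/" hreflang="nl">Dutch version</a>
      <code translate="no">lang</code>
    </article>
  </body>
</html>
"""

# Elements without a closing tag must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class LangCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self._stack = []          # effective language of each open element
        self.languages = []       # (tag, effective language) per start tag
        self.link_languages = []  # (href, hreflang) per marked-up link
        self.untranslated = []    # tags carrying translate="no"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        inherited = self._stack[-1] if self._stack else None
        # An element's own lang wins; otherwise it inherits from its parent.
        lang = attrs["lang"] if "lang" in attrs else inherited
        self.languages.append((tag, lang))
        if tag == "a" and "hreflang" in attrs:
            self.link_languages.append((attrs.get("href"), attrs["hreflang"]))
        if attrs.get("translate") == "no":
            self.untranslated.append(tag)
        if tag not in VOID_TAGS:
            self._stack.append(lang)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()


collector = LangCollector()
collector.feed(SAMPLE)
print(collector.languages)
print(collector.link_languages)
print(collector.untranslated)
```

The sketch assumes well-formed markup with matching closing tags. A real consumer would more likely rely on a full HTML parser than on this simple stack.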
Language detection
- PHP: http://pear.php.net/package/Text_LanguageDetect/
- Python: https://pypi.python.org/pypi/langdetect/
- JS: https://github.com/wooorm/franc
Christian Weiske uses language detection on the post's title to automatically set the lang attribute in <html lang="??"> for blog posts.
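A detect-at-publish step along these lines could look roughly like the Python sketch below. It is not Christian Weiske's implementation: the `html_open_tag` function, its `fallback` parameter and the stand-in detector are all hypothetical. The detector is passed in as a plain callable, so any of the libraries listed above could be wrapped to fit.

```python
import html


def html_open_tag(title, detect, fallback="en"):
    """Build the opening <html> tag with a lang attribute guessed from a title.

    `detect` is any callable that takes text and returns a language code.
    If it fails or returns nothing, the (assumed) fallback language is used.
    """
    try:
        lang = detect(title)
    except Exception:
        lang = fallback
    if not lang:
        lang = fallback
    return '<html lang="{}">'.format(html.escape(lang, quote=True))


def toy_detect(text):
    """Stand-in detector for illustration only; not a real detection method."""
    words = set(text.lower().split())
    if words & {"de", "het", "een"}:
        return "nl"
    if words & {"the", "a", "of"}:
        return "en"
    return None


print(html_open_tag("Het weer van vandaag", toy_detect))
print(html_open_tag("The state of the web", toy_detect))
print(html_open_tag("???", toy_detect))
```

Running the detection once at publish time and storing the result in the markup means consumers can read the lang attribute instead of each running their own detection.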
FAQ
Q: Why detect instead of adding manually?
- Less tedious, less prone to errors
Q: Why detect yourself, if others can detect too?
- Because sometimes they don't detect it themselves, but still act on the lang attribute.
- Detect once while publishing vs. detect again and again and again and again
See Also
- programming language
- This Wiki in other languages
- https://www.w3.org/TR/html401/struct/dirlang.html
- i18n
- multilingual blogging
- https://codepen.io/tigt/post/notes-on-lang - Lots of information about the lang attribute and its values (language codes).