Language Support in Listen

For all content sources other than Twitter, Listen uses the Compact Language Detector (CLD) for language detection. CLD powers the language detection feature in Google Chrome and Google Translate, and has become an industry standard with support for 83 languages and improved accuracy. For Tweets, Listen uses Twitter's own language identification.
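
The routing rule described above can be pictured with a minimal sketch. This is an illustration only, not Listen's actual implementation: the function name `detection_method` and the returned labels are hypothetical, chosen just to show that Tweets take one path and every other source takes the CLD path.

```python
def detection_method(source: str) -> str:
    """Return which language identifier applies to a content source.

    Hypothetical helper: the labels below are illustrative names,
    not identifiers used by Listen itself.
    """
    # Tweets rely on Twitter's own language identification.
    if source.strip().lower() == "twitter":
        return "twitter_language_id"
    # Every other content source goes through CLD.
    return "cld"


if __name__ == "__main__":
    for src in ("Twitter", "news", "blogs"):
        print(src, "->", detection_method(src))
```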

The level of support a language receives ranges from basic support to full support, which includes sentiment analysis and topic extraction. The following languages currently receive full support:

Dutch Italian Spanish Chinese Arabic
English Polish Turkish Hindi Farsi/Persian
French Portuguese Danish Indonesian Hebrew
German Romanian Norwegian Japanese Thai
Greek Russian Swedish Korean Tagalog

List of Supported Languages

The following table lists all supported languages, the code for each language, and the level of support each language currently receives.

Continent Language Code Basic Support Sentiment Analysis Topic Extraction
Europe Albanian sq X X  
  Armenian hy X    
  Belarusian be X    
  Bosnian bs X X  
  Bulgarian bg X X  
  Catalan, Valencian ca X    
  Cherokee chr X    
  Croatian hr X X  
  Czech cs X X  
  Dutch nl X X X
  English en X X X
  Estonian et X    
  French fr X X X
  Georgian ka X    
  German de X X X
  Greek el X X X
  Hawaiian haw X    
  Hungarian hu X X  
  Irish ga X    
  Italian it X X X
  Latvian lv X    
  Lithuanian lt X    
  Luxembourgish, Letzeburgesch lb X    
  Macedonian mk X    
  Maltese mt X    
  Montenegrin sr-ME X    
  Polish pl X X X
  Portuguese pt X X X
  Romanian ro X X X
  Russian ru X X X
  Scots sco X    
  Serbian sr X X  
  Slovak sk X X  
  Slovenian sl X X  
  Spanish es X X X
  Turkish tr X X X
  Ukrainian uk X    
  Welsh cy X    
           
Scandinavian Danish da X X X
  Faroese fo X    
  Finnish fi X    
  Icelandic is X    
  Norwegian no X X X
  Norwegian Nynorsk nn X    
  Swedish sv X X X
           
African Afrikaans af X    
  Akan ak X    
  Ga gaa X    
  Ganda lg X    
  Igbo ig X    
  Krio kri X    
  Lozi loz X    
  Luba-Kasai lua X    
  Luo luo X    
  Mauritian mfe X    
  Northern Sotho nso X    
  Seychellois crs X    
  Somali so X    
  Sundanese su    
  Swahili sw    
  Tumbuka tum X    
  Zulu zu X    
           
Asian Azerbaijani az X    
  Bengali bn X  
  Bihari languages bh X    
  Burmese my X    
  Cebuano ceb X    
  Chinese zh X X X
  Chinese Traditional zh-Hant X X  
  Gujarati gu X    
  Hindi hi X X X
  Hmong hmn X    
  Indonesian id X X X
  Japanese ja X X X
  Kapampangan pam X    
  Khasi kha X    
  Korean ko X X X
  Kurdish ku X    
  Limbu lif X    
  Malay ms X X  
  Malayalam ml X    
  Mixed Hindi English   X X  
  Mongolian mn X    
  Nepali ne X    
  Newar new X    
  Rajasthani raj X    
  Sanskrit sa X    
  Sinhala si X    
  Tamil ta X    
  Thai th X X X
  Urdu ur X    
  Vietnamese vi X X  
  Waray war X    
  Zhuang za X    
           
Australasian Fijian fj X    
  Javanese jv X    
  Maori mi X    
  Samoan sm X    
  Tagalog tl X X X
  Tonga to X    
           
Middle Eastern Arabic ar X X X
  Farsi/Persian fa X X X
  Hebrew he X X X
  Syriac syr X    
  Yiddish yi X    
           
Other Abkhazian ab X    
  Afar aa X    
  Amharic am X    
  Assamese as X    
  Aymara ay X    
  Bashkir ba X    
  Basque eu X    
  Bislama bi X    
  Breton br    
  Central Khmer km X    
  Chewa ny X    
  Corsican co X    
  Divehi dv X    
  Dzongkha dz X    
  Esperanto eo X    
  Ewe ee X    
  Gaelic gd X    
  Galician gl X    
  Guarani gn X    
  Haitian ht X    
  Hausa ha X    
  Interlingua ia X    
  Interlingue, Occidental ie X    
  Inuktitut iu X    
  Inupiaq ik X    
  Kalaallisut kl X    
  Kannada kn X    
  Kashmiri ks X    
  Kazakh kk X    
  Kinyarwanda rw X    
  Kyrgyz ky X    
  Lao lo X    
  Latin la X    
  Lingala ln X    
  Malagasy mg X    
  Manx gv X    
  Marathi mr X    
  Nauru na X    
  Occitan oc X    
  Oriya or X    
  Oromo om X    
  Ossetic os X    
  Panjabi pa X    
  Pashto ps X    
  Quechua qu X    
  Romansh rm X    
  Rundi rn X    
  Sango sg X    
  Shona sn X    
  Sindhi sd X    
  South Ndebele nr X    
  Southern Sotho st X    
  Swati ss X    
  Tajik tg X    
  Tatar tt X    
  Telugu te X  
  Tibetan bo X    
  Tigrinya ti X    
  Tsonga ts X    
  Tswana tn X    
  Turkmen tk X    
  Twi tw X    
  Uighur ug X    
  Uzbek uz X    
  Venda ve X    
  Volapük vo X    
  Western Frisian fy X    
  Wolof wo X    
  Xhosa xh X    
  Yoruba yo X    
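
If you work with the language codes programmatically, the table above maps naturally onto a lookup keyed by code. The following is a minimal sketch, assuming you have loaded the table yourself: the `Support` class, the `SUPPORT` dictionary, and the `support_level` function are hypothetical names, and only three rows copied from the table are included as sample data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Support:
    basic: bool
    sentiment: bool
    topics: bool


# Sample rows copied from the table above (English, Hungarian, Welsh).
# A real lookup would contain every row of the table.
SUPPORT = {
    "en": Support(basic=True, sentiment=True, topics=True),
    "hu": Support(basic=True, sentiment=True, topics=False),
    "cy": Support(basic=True, sentiment=False, topics=False),
}


def support_level(code: str) -> str:
    """Summarize the support level for a language code (hypothetical helper)."""
    entry = SUPPORT.get(code.strip().lower())
    if entry is None:
        # The code is missing from this sample; it may still be in the full table.
        return "not in sample"
    if entry.sentiment and entry.topics:
        return "full"
    if entry.sentiment:
        return "basic + sentiment"
    return "basic"


if __name__ == "__main__":
    for code in ("en", "hu", "cy", "xx"):
        print(code, "->", support_level(code))
```

Keying the lookup by code rather than by language name avoids ambiguity for entries such as "Catalan, Valencian" that list more than one name.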

Check out the link below to download a copy of the language codes.
