OCR – Supported File Types

This article lists the languages supported by the OCR feature.

Reveal Processing supports more than 120 languages during the Optical Character Recognition (OCR) process. The process handles multiple languages within the same document in a single pass. For example, if a scanned document contains Chinese, Japanese, and Korean text, Reveal OCR extracts all three languages in one OCR pass. OCR text is extracted to UTF-8 Unicode automatically during processing within the Reveal platform.
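Because the extracted text is UTF-8, a single text file can hold several scripts at once. The snippet below is a minimal sketch, not part of the Reveal platform: it assumes you have OCR text available as a string (the sample text and the scripts_in_text helper are hypothetical) and uses Python's standard unicodedata module to confirm that the text decodes as UTF-8 and to give a rough count of the scripts it contains.

```python
import unicodedata
from collections import Counter


def scripts_in_text(text: str) -> Counter:
    """Count letters by a rough script label taken from Unicode character names."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip spaces, digits, punctuation
        name = unicodedata.name(ch, "")
        if name.startswith("CJK UNIFIED"):
            counts["Han"] += 1
        elif name.startswith(("HIRAGANA", "KATAKANA")):
            counts["Kana"] += 1
        elif name.startswith("HANGUL"):
            counts["Hangul"] += 1
        elif name.startswith("LATIN"):
            counts["Latin"] += 1
        else:
            counts["Other"] += 1
    return counts


# Hypothetical sample standing in for OCR text extracted from one document.
sample = "契約書 こんにちは 계약서 Contract"

raw = sample.encode("utf-8")    # bytes as they would be stored on disk
decoded = raw.decode("utf-8")   # raises UnicodeDecodeError if not valid UTF-8

print(scripts_in_text(decoded))
```

A check like this is only a sanity test on exported text. It labels scripts, not languages, so it cannot tell Chinese from Japanese kanji, for example.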

Review OCR supports English only. To OCR documents in other languages, use Processing. This limitation does not affect the translation of native files or native file extractions.

The following languages are supported by Reveal Processing OCR.

Supported Languages
Afrikaans	Dutch	Ido	Malinke	Romany	Tinpo
Albanian	Esperanto	Indonesian	Maltese	Ruanda	Tongan
Arabic	Estonian	Interlingua	Maori	Rundi	Tun
Aymara	English	Italian	Mayan	Russian	Turkish
Basque	Eskimo	Japanese	Miao	Sami	Ukrainian
Bemba	Faroese	Kabardian	Minankabaw	Slovenian	Visayan
Blackfoot	Fijian	Kashubian	Mohawk	Somali	Welsh
Breton	Finnish	Kawa	Moldavian	Sotho	Wend
Bugotu	French	Kikuyu	Nahuatl	Spanish	Wolof
Bulgarian	Frisian	Kongo	Norwegian	Sundanese	Xhosa
Byelorussian	Friulian	Korean	Nyanja	Swahili	Zapotec
Catalan	Gaelic (Irish)	Kpelle	Occidental	Swazi	Zulu
Chamorro	Gaelic (Scottish)	Kurdish	Ojibway	Swedish
Chechen	Galician	Latin	Papiamento	Samoan
Chinese (Traditional)	Ganda	Latvian	Pidgin English	Sardinian
Chinese (Simplified)	German	Lithuanian	Polish	Serbian
Chuana	Greek	Luba	Portuguese	Shona
Corsican	Guarani	Lule	Portuguese (Brazilian)	Sioux
Croatian	Hani	Luxembourgian	Provençal	Slovak
Crow	Hawaiian	Macedonian	Quechua	Tagalog
Czech	Hungarian	Malagasy	Rhaetic	Tahitian
Danish	Icelandic	Malay	Romanian	Thai

Last Updated 8/18/2022