In multilingual, multi-script countries, news and advertisement videos broadcast on television channels commonly use two or more scripts to convey information. The text present in these videos plays an important role in automatic video indexing and retrieval. Hence, OCR of multilingual video text is crucial.
The main objective of the competition is to identify the script of words extracted from videos. Different combinations of ten Indian scripts will be considered for the competition. The competition provides a platform for researchers around the globe to address this problem.
The competition aims to find generic algorithms or systems that identify video scripts irrespective of which scripts are being considered. Its general objective is to evaluate recently proposed methods for script identification. The following scripts will be considered for the competition:
- English (Eng),
- Hindi (Hin),
- Bengali (Ben),
- Oriya (Ori),
- Gujarati (Guj),
- Punjabi (Pun),
- Kannada (Kan),
- Tamil (Tam),
- Telugu (Tel), and
- Arabic (Arb).
The competition will be organized into four different tasks:
Task 1: Identifying scripts within eight different script triplets (combinations of three scripts, with English and Hindi present in every combination), chosen based on their use in the Indian subcontinent.
Task 2: Identifying the combination of scripts used in north India. This task involves identification of seven scripts, namely English, Hindi, Bengali, Oriya, Gujarati, Punjabi and Arabic.
Task 3: Identifying the combination of scripts used in south India. This task involves identification of five scripts, namely English, Hindi, Kannada, Tamil and Telugu.
Task 4: Identifying the combination of all ten scripts.
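As an illustration only, the label sets implied by the four tasks can be written down as a small Python sketch. The names `SCRIPTS`, `FIXED`, `TASK1_TRIPLETS` and `TASK_LABELS` are hypothetical and not part of any official competition toolkit. The sketch also assumes that each Task 1 triplet is English and Hindi plus exactly one of the remaining eight scripts, which is consistent with the count of eight triplets but is not spelled out above.

```python
# Script abbreviations as listed in the competition description.
SCRIPTS = ["Eng", "Hin", "Ben", "Ori", "Guj", "Pun", "Kan", "Tam", "Tel", "Arb"]

# Task 1 (assumption): English and Hindi appear in every triplet,
# and the third script is one of the remaining eight.
FIXED = ("Eng", "Hin")
TASK1_TRIPLETS = [FIXED + (s,) for s in SCRIPTS if s not in FIXED]

# Tasks 2-4: one fixed label set per task, taken from the task descriptions.
TASK_LABELS = {
    "task2": ["Eng", "Hin", "Ben", "Ori", "Guj", "Pun", "Arb"],  # north India
    "task3": ["Eng", "Hin", "Kan", "Tam", "Tel"],                # south India
    "task4": list(SCRIPTS),                                      # all ten scripts
}

if __name__ == "__main__":
    for triplet in TASK1_TRIPLETS:
        print("task1:", ", ".join(triplet))
    for task, labels in TASK_LABELS.items():
        print(f"{task}: {len(labels)} scripts")
```

Under this reading, a Task 1 system is a three-class classifier evaluated separately on each triplet, while Tasks 2 to 4 are single classifiers over seven, five and ten classes respectively.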