Research and Implementation of Tibetan Digital Library
Keywords: Unicode; OCR; Dublin Core; XML; digital library; Tibetan encoding; full-text search
With the rapid development of computer, communication, and network technology, the construction of the international information highway and the use of large-scale information systems have created the environment and conditions for library systems to develop. Network information management, digital processing technology, and the construction of digital information resources have become a focus of international competition, and countries are investing considerable effort in their research and development. The digital library has emerged as a new concept and a new model, and it is regarded as one of the main directions of development for the information industry in the twenty-first century. Since the early 1990s, many digital library technologies, such as OCR and text retrieval, have continued to mature, and the international coded character set standard Unicode and the Dublin Core metadata specification provide good support for the rapid development of digital libraries. At present, Tibetan digitization in China is mostly built on the code page model, in which the Tibetan character codes overlap the code regions of other characters. This makes it difficult for Tibetan text to coexist with other scripts and to be retrieved, and there are few Tibetan retrieval systems for the Windows platform. The demonstration Tibetan digital library system was developed to solve these problems. This paper discusses in detail the key technologies involved, including the digitization of physical resources, metadata indexing, and full-text search.
Working within the ISO/IEC 10646/Unicode international coding architecture, and on the basis of the characteristics of Tibetan script and its writing conventions, this paper follows the digital library development process to discuss the architecture of a Tibetan digital library. It focuses on the encoding principles and technical approaches of the two international standard Tibetan encoding schemes, the large character set (vertically precomposed) and the small character set (dynamic combination), and explores synonym-based associative retrieval as a way to solve the problem of retrieving Tibetan text across the two encoding modes. The large character set is proposed as the preferred encoding mode, and a brief application example is given. The paper discusses the Unicode, Dublin Core, and XML standards in some detail, and the demonstration Tibetan digital library system was developed in strict accordance with them. The system is intended to provide a demonstration and conceptual platform for Tibetan digital libraries, and we hope it can serve as a reference for building digital libraries for other minority languages.
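The idea of retrieving across the two encoding modes can be illustrated with a minimal Python sketch. It treats a precomposed character and its dynamic-combination sequence as synonyms by rewriting both the query and the documents into one form before matching. This is only one simple way to realize the equivalence and is not necessarily the method used in the system described here. The mapping table, the Private Use Area code points in it, and the function names `to_dynamic` and `search` are all hypothetical and chosen for illustration; a real system would load the mapping defined by the precomposed character set standard it uses.

```python
# Hypothetical mapping from precomposed ("large character set") code points
# to the equivalent dynamic-combination sequences in the Unicode Tibetan block.
# The Private Use Area code points are placeholders, not real assignments.
PRECOMPOSED_TO_SEQUENCE = {
    "\uF300": "\u0F66\u0F92",  # SA + SUBJOINED GA
    "\uF301": "\u0F62\u0F90",  # RA + SUBJOINED KA
}


def to_dynamic(text: str) -> str:
    """Rewrite each precomposed character as its dynamic-combination sequence."""
    return "".join(PRECOMPOSED_TO_SEQUENCE.get(ch, ch) for ch in text)


def search(query: str, documents: dict[str, str]) -> list[str]:
    """Return the ids of documents that contain the query in either encoding mode."""
    key = to_dynamic(query)
    return sorted(
        doc_id for doc_id, text in documents.items() if key in to_dynamic(text)
    )


# The same stack stored in each mode, plus an unrelated syllable (KA + tsheg).
documents = {
    "precomposed": "\uF300\u0F0B",
    "dynamic": "\u0F66\u0F92\u0F0B",
    "other": "\u0F40\u0F0B",
}

hits_from_precomposed_query = search("\uF300", documents)
hits_from_dynamic_query = search("\u0F66\u0F92", documents)
```

With this normalization, a query typed in either mode finds the documents stored in both modes, while the unrelated document is not matched.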