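Wang Yujiao, Geng Si, Li Ning. Digitalization and data management of Naxi Dongba manuscripts[J]. Journal of Electronic Measurement and Instrumentation, 2017, 31(4): 636-643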
Digitalization and data management of Naxi Dongba manuscripts
  
DOI:10.13382/j.jemi.2017.04.021
Keywords: digitalization of ancient books; information extraction; XML data management
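Funding: Supported by the Major Program of the National Social Science Fund of China (12&ZD234), Construction and Design of Basic Digital Archives of Dongba Classic Ancient Books (KF20161123206), and Video Design for the International Sharing Platform of the Dongba Classics Inheritance System (KF20161123207)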
Author  Institution
Wang Yujiao  School of Computer Science, Beijing Information Science & Technology University, Beijing 100101, China
Geng Si  School of Computer Science, Beijing Information Science & Technology University, Beijing 100101, China
Li Ning  School of Computer Science, Beijing Information Science & Technology University, Beijing 100101, China
Abstract:
      At present, most original Dongba manuscripts are held by well-known institutions in more than ten countries, so academic research is scattered and communication among researchers is inconvenient. Building a sharing platform for Dongba ancient books would help rescue and pass on this classical culture. To address the digitalization and data storage of Dongba ancient book resources, this paper analyzes existing information extraction methods and data storage modes and proposes a digitalization method for the printed book Annotated General Catalog of Ancient Books of Ethnic Minorities in China (Naxi Volume). Metadata is used to represent the Dongba bibliographic entries extracted from the printed book, and an XML database is used to manage the digitized content. Experimental results show that the proposed information extraction method correctly extracts content from the special structure of Dongba bibliographic entries and provides a means of structured retrieval, which verifies the feasibility and correctness of the method. This research offers a useful reference for the digitalization of ancient books of ethnic minorities and for semi-structured data management.