Improved structural repairing algorithm based on regular expression
Keywords: data cleaning; structural repairing; regular expression; edit distance
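Funding: Supported by the Natural Science Foundation of Liaoning Province (2015020098) and the Doctoral Start-up Fund of Liaoning Technical University (2015 1147)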
Chen Wanzhi (陈万志), School of Electronics and Information Engineering, Liaoning Technical University, Huludao 125105, China
Song Jian (宋剑), School of Electronics and Information Engineering, Liaoning Technical University, Huludao 125105, China
Wang Dejian (王德建), China Petroleum Liaohe Equipment Company, Panjin 124010, China
Wang Xing (王星), School of Electronics and Information Engineering, Liaoning Technical University, Huludao 125105, China
      For structural data cleaning, an improved structural repairing algorithm based on regular expressions is proposed, which works by computing the edit distance between strings. First, the edges that violate the partial order are extracted from the edge set of the nondeterministic finite automaton (NFA), and only the edit distances associated with these edges are revised using a priority queue. The remaining edges, which satisfy the partial order relation, are computed with a recursive formula instead of the more expensive priority queue. Experimental results show that the improved algorithm has a clear advantage in time complexity, and that its improvement rate over the original algorithm is significant and stable.
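
The abstract does not give the recurrence or the exact definition of the partial order, so the following Python sketch is only an illustration of the general idea under stated assumptions, not the authors' algorithm. It assumes an NFA without epsilon transitions, unit costs for insertion, deletion and substitution, and uses the numeric order of the states as a stand-in partial order: an edge (u, v) with u < v is treated as order-respecting, any other edge as order-violating. The names `insertion_closure` and `regex_edit_distance` are hypothetical.

```python
import heapq
from math import inf


def insertion_closure(d, n, edges):
    """Propagate unit-cost insertions along NFA edges within one DP column.

    Assumption: state numbers stand in for the partial order.
    """
    forward = sorted((u, v) for u, _, v in edges if u < v)
    backward = [(u, v) for u, _, v in edges if u >= v]
    succ = [[] for _ in range(n)]
    for u, _, v in edges:
        succ[u].append(v)

    # Order-respecting edges: a single sweep in increasing source order
    # applies the recurrence d[v] = min(d[v], d[u] + 1).
    for u, v in forward:
        d[v] = min(d[v], d[u] + 1)

    # Order-violating edges: only these seed the priority queue.
    heap = []
    for u, v in backward:
        if d[u] + 1 < d[v]:
            d[v] = d[u] + 1
            heapq.heappush(heap, (d[v], v))
    # Any improvement found this way is propagated until no edge can be relaxed.
    while heap:
        dist, u = heapq.heappop(heap)
        if dist > d[u]:
            continue  # stale queue entry
        for v in succ[u]:
            if dist + 1 < d[v]:
                d[v] = dist + 1
                heapq.heappush(heap, (d[v], v))
    return d


def regex_edit_distance(s, n, start, accepting, edges):
    """Minimum number of edits turning s into a string accepted by the NFA.

    edges is a list of (source, label, target) triples over states 0..n-1.
    """
    col = [inf] * n
    col[start] = 0
    col = insertion_closure(col, n, edges)
    for ch in s:
        nxt = [c + 1 for c in col]  # delete ch from the input
        for u, label, v in edges:
            cost = col[u] + (0 if label == ch else 1)  # match or substitute
            if cost < nxt[v]:
                nxt[v] = cost
        col = insertion_closure(nxt, n, edges)
    return min(col[q] for q in accepting)


# Hand-built NFA for the regular expression (abc)*d; the edge 2 -> 0 violates
# the assumed order and is the only one that can seed the priority queue.
EDGES = [(0, "a", 1), (1, "b", 2), (2, "c", 0), (0, "d", 3)]

if __name__ == "__main__":
    for text in ["abcd", "abd", "ab", ""]:
        print(repr(text), regex_edit_distance(text, 4, 0, {3}, EDGES))
```

In this sketch the priority queue is touched only when an order-violating edge actually lowers a value, which mirrors the division of work described in the abstract. The per-call rebuilding of the edge lists is kept for brevity and would be hoisted out in a real implementation.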