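Jia Linfeng, Wu Liming, Wen Tengteng, Liao Yutao, Gao Zihao. Speech separation in the time and frequency domains based on multi-scale convolution [J]. Journal of Electronic Measurement and Instrumentation, 2022, 36(11): 134-140 |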
Speech separation in the time and frequency domains based on multi-scale convolution |
|
DOI: |
Keywords: speech separation; feature fusion; multi-scale convolution; time-frequency domain features |
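Funding: Supported by the National Natural Science Foundation of China (61705045) and the Innovation and Entrepreneurship Talent Team Program of the Foshan Research Institute of Guangdong University of Technology (20191108) |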
|
|
Abstract: |
For separating mixed speech, deep learning methods that operate on time-domain signal features outperform those that use frequency-domain features. However, current time-domain methods have poor robustness in real noise environments, and relying on a time-domain feature alone limits the performance of the separation model. This paper therefore proposes a multi-feature speech separation method based on the Conv-TasNet network, which fuses frequency-domain and time-domain features to enrich the multidimensional information in the data. To further improve the separation network, a multi-scale convolution block is introduced to strengthen its feature extraction ability. In an experimental environment containing real noise, the proposed method improves performance by 0.91 dB over the Conv-TasNet model and by 0.52 dB over the latest time-frequency fusion speech separation baseline, effectively improving both separation performance and robustness. |
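The two ideas named in the abstract, fusing time-domain with frequency-domain features and extracting features at several convolution scales, can be illustrated with a minimal pure-Python sketch. This is not the authors' network: the function names, the frame length and hop, the kernel sizes (3, 5, 7), and the fixed averaging kernels are all assumptions made for illustration. A real model such as Conv-TasNet would learn its encoder and convolution weights.

```python
import cmath
import math


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (no padding at the end)."""
    return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]


def dft_magnitude(frame):
    """One-sided DFT magnitude of a frame, used here as a stand-in frequency-domain feature."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def fuse_features(x, frame_len=8, hop=4):
    """Concatenate each frame's raw samples (time domain) with its DFT magnitudes (frequency domain)."""
    return [list(f) + dft_magnitude(f) for f in frame_signal(x, frame_len, hop)]


def conv1d_same(seq, kernel):
    """1-D cross-correlation with zero padding; output length equals input length for odd kernels."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(len(seq))
    ]


def multi_scale_block(seq, kernel_sizes=(3, 5, 7)):
    """Run parallel convolutions of several widths and return one output channel per scale.

    Averaging kernels are placeholders (assumption); a trained network would learn these weights.
    """
    return [conv1d_same(seq, [1.0 / k] * k) for k in kernel_sizes]


# Example: a short sinusoid as a hypothetical input mixture.
signal = [math.sin(2 * math.pi * 0.1 * n) for n in range(32)]
fused = fuse_features(signal)                # one fused feature vector per frame
trajectory = [frame[0] for frame in fused]   # one feature dimension followed across frames
channels = multi_scale_block(trajectory)     # the same trajectory seen at three scales
```

Each fused vector here holds the 8 raw samples of a frame followed by its 5 DFT magnitudes, and the multi-scale block returns three channels that view the same sequence through different receptive fields.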