Embeddings for DNN speaker adaptive training
In this work, we investigate the use of embeddings for speaker-adaptive training of DNNs (DNN-SAT), focusing on a small amount of adaptation data per speaker. DNN-SAT can be viewed as learning a mapping from each embedding to transformation parameters that are applied to the shared parameters of the DNN. We investigate different approaches to applying these transformations and find that, with a good training strategy, a multi-layer adaptation network applied to all hidden layers is no more effective than a single linear layer acting on the embeddings to transform the input features. In the second part of our work, we evaluate different embeddings (i-vectors, x-vectors and deep CNN embeddings) on an additional speaker recognition task in order to gain insight into what should characterize an embedding for DNN-SAT. We find that the speaker recognition performance of a given representation is not correlated with its ASR performance; in fact, the ability to capture more speech attributes than speaker identity alone was the most important characteristic of the embeddings for effective DNN-SAT ASR. Our best models achieved relative WER gains of 4% and 9% over DNN baselines using speaker-level cepstral mean normalisation (CMN) and a fully speaker-independent model, respectively.
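To make the simpler of the two adaptation schemes concrete, the sketch below maps a fixed speaker embedding through a single linear layer to an additive shift of the input acoustic features before they enter the shared DNN. This is a minimal sketch under stated assumptions, not the paper's implementation: the PyTorch module, class name, and dimensions (a 100-dimensional i-vector-like embedding, 40-dimensional features) are illustrative, and a real DNN-SAT setup would train the control layer jointly with the shared acoustic model on the full training set.

```python
# Minimal sketch (assumed PyTorch code, not from the paper): a single linear
# layer maps a speaker embedding to an additive shift of the input acoustic
# features, which are then passed to the shared (speaker-independent) DNN.
import torch
import torch.nn as nn


class LinearInputAdaptation(nn.Module):
    def __init__(self, embed_dim=100, feat_dim=40, hidden_dim=512, num_pdfs=2000):
        super().__init__()
        # Control layer: embedding -> feature-space shift (the "single linear layer").
        self.control = nn.Linear(embed_dim, feat_dim)
        # Shared DNN parameters, common to all speakers.
        self.shared = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_pdfs),
        )

    def forward(self, feats, embedding):
        # feats: (batch, time, feat_dim); embedding: (batch, embed_dim)
        shift = self.control(embedding).unsqueeze(1)  # broadcast over time
        return self.shared(feats + shift)             # per-frame senone logits


# Toy usage: 40-dim filterbank-like features, 100-dim embedding per utterance.
model = LinearInputAdaptation()
feats = torch.randn(8, 200, 40)     # batch of 8 utterances, 200 frames each
embedding = torch.randn(8, 100)     # one embedding per speaker/utterance
logits = model(feats, embedding)    # shape: (8, 200, 2000)
```

In the alternative scheme compared in the paper, a multi-layer adaptation network conditioned on the embedding would instead modulate every hidden layer of the shared DNN; the reported finding is that, with a good training strategy, this brings no gain over a single input-feature transform of this kind.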