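This post first runs a complete synthesis pipeline exactly as described in the official tutorial, then attempts synthesis for Chinese.
The official repository provides a one-step installer, ./scripts/setup_tools.sh $HTK_USERNAME $HTK_PASSWORD, but it did not succeed in our attempts.
The debugging process follows.
Running the script directly produced this error: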
| #include <cassert> | |
| #include <cstddef> | |
| #include <cstdint> | |
| #include <iomanip> | |
| #include <iostream> | |
| #include <random> | |
| #include <stdexcept> | |
| #include <vector> | |
| #define BLOCK_DIM 32 |
| """ | |
| author: Timothy C. Arlen | |
| date: 28 Feb 2018 | |
| Calculate Mean Average Precision (mAP) for a set of bounding boxes corresponding to specific | |
| image Ids. Usage: | |
| > python calculate_mean_ap.py | |
| Will display a plot of precision vs recall curves at 10 distinct IoU thresholds as well as output |
| """ | |
| Author: Awni Hannun | |
| This is an example CTC decoder written in Python. The code is | |
| intended to be a simple example and is not designed to be | |
| especially efficient. | |
| The algorithm is a prefix beam search for a model trained | |
| with the CTC loss function. |
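This post explains how to extract acoustic features for Merlin training. In speech synthesis, this is the job of the vocoder.
Merlin can use either of two vocoders, STRAIGHT or WORLD. WORLD extracts 60-dim MGC, variable-dim BAP (1 dim at 16 kHz, 5 dims at 48 kHz), and 1-dim LF0; STRAIGHT extracts 60-dim MGC, 25-dim BAP, and 1-dim LF0.
The newer WORLD_v2 is still under development; its target is 60-dim MGC, 5-dim BAP, and 1-dim LF0 (the MGC and BAP dimensions can be adjusted).
Because use of STRAIGHT is subject to strict licensing restrictions, this post mainly covers WORLD.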
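The feature dimensions listed above can be kept in a small lookup table to sanity-check a configuration. The following is a minimal sketch, not part of Merlin: the `FEATURE_DIMS` table and the `static_dim` helper are hypothetical names for illustration, and the numbers are only the per-frame static dimensions quoted in the text.

```python
# Per-frame static feature dimensions as quoted in the text.
# FEATURE_DIMS and static_dim are hypothetical helpers, not Merlin APIs.
FEATURE_DIMS = {
    ("WORLD", 16000): {"mgc": 60, "bap": 1, "lf0": 1},
    ("WORLD", 48000): {"mgc": 60, "bap": 5, "lf0": 1},
    ("WORLD_v2", None): {"mgc": 60, "bap": 5, "lf0": 1},   # stated target; adjustable
    ("STRAIGHT", None): {"mgc": 60, "bap": 25, "lf0": 1},
}


def static_dim(vocoder, sample_rate=None):
    """Return the total static dimension per frame for a vocoder setup."""
    # Only WORLD's BAP dimension is described as depending on the sample rate.
    key = (vocoder, sample_rate if vocoder == "WORLD" else None)
    if key not in FEATURE_DIMS:
        raise KeyError(f"no dimensions listed for {vocoder} at {sample_rate} Hz")
    return sum(FEATURE_DIMS[key].values())


if __name__ == "__main__":
    for vocoder, rate in FEATURE_DIMS:
        print(vocoder, rate, static_dim(vocoder, rate))
```

A lookup like this makes it easy to spot a mismatch between the sample rate of the recordings and the BAP dimension expected downstream.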
| # based on https://github.com/google/seq2seq/blob/master/bin/tools/generate_beam_viz.py | |
| # extracts probabilities and sequences from .npz file generated during beam search. | |
| # and pickles a list of the length n_samples that has beam_width most probable tuples | |
| # (path, logprob, prob) | |
| # where probs are scaled to 1. | |
| import numpy as np | |
| import networkx as nx | |
| import pickle |
| development: | |
| adapter: mysql2 | |
| encoding: utf8 | |
| database: my_database | |
| username: root | |
| password: | |
| apt: | |
| - somepackage | |
| - anotherpackage |
| // A simple quickref for Eigen. Add anything that's missing. | |
| // Main author: Keir Mierle | |
| #include <Eigen/Dense> | |
| Matrix<double, 3, 3> A; // Fixed rows and cols. Same as Matrix3d. | |
| Matrix<double, 3, Dynamic> B; // Fixed rows, dynamic cols. | |
| Matrix<double, Dynamic, Dynamic> C; // Full dynamic. Same as MatrixXd. | |
| Matrix<double, 3, 3, RowMajor> E; // Row major; default is column-major. | |
| Matrix3f P, Q, R; // 3x3 float matrix. |