https://github.com/DamRsn/NeuralNote
https://github.com/BShakhovsky/PolyphonicPianoTranscription
| dataset | meta data | contents | with audio |
|---|---|---|---|
| 200DrumMachines | | 7371 one-shots | yes |
| AAM | onsets, pitches, instruments, melody instrument, keys, chords, tempo, beats | 3000 (artificial) tracks | yes |
| ACM_MIRUM | tempo | 1410 excerpts (60s) | yes |
| ACPAS | aligned audio and scores | 2189 performances of 497 scores | downloadable |
| AcousticBrainz-Genre | 15-31 genres with 265-745 subgenres | audio features for ove |
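This article describes how to extract acoustic features for Merlin training. In speech synthesis, this is the job of the vocoder.
Merlin can use either of two vocoders, STRAIGHT or WORLD. WORLD extracts 60-dim MGC, variable-dim BAP (BAP dim: 1 for 16 kHz, 5 for 48 kHz), and 1-dim LF0; STRAIGHT extracts 60-dim MGC, 25-dim BAP, and 1-dim LF0.
The newer WORLD_v2 is still under development; it aims to extract 60-dim MGC, 5-dim BAP, and 1-dim LF0 (the MGC and BAP dimensions can be adjusted).
Because STRAIGHT is subject to strict licensing restrictions, this article focuses on WORLD.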
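As a quick sanity check on the numbers above, the following minimal sketch adds up the per-frame static feature dimension for each vocoder configuration. The dictionary keys and the `static_dim` helper are hypothetical names chosen for this illustration; they are not part of Merlin. Only the static streams listed in the text (MGC, BAP, LF0) are counted, and any extra streams that a training recipe may append are ignored.

```python
# Per-frame static feature dimensions as quoted in the text above.
# The keys are hypothetical labels for this sketch, not Merlin identifiers.
VOCODER_DIMS = {
    "WORLD_16k": {"mgc": 60, "bap": 1, "lf0": 1},
    "WORLD_48k": {"mgc": 60, "bap": 5, "lf0": 1},
    "WORLD_v2": {"mgc": 60, "bap": 5, "lf0": 1},
    "STRAIGHT": {"mgc": 60, "bap": 25, "lf0": 1},
}


def static_dim(vocoder: str) -> int:
    """Return the total static feature dimension per frame for a vocoder."""
    return sum(VOCODER_DIMS[vocoder].values())


# Print one line per configuration: label followed by total dimension.
for name in VOCODER_DIMS:
    print(name, static_dim(name))
```

A lookup table like this makes it easy to confirm that extracted feature files have the frame width expected for the chosen vocoder and sampling rate.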
"""Simple example on how to log scalars and images to tensorboard without tensor ops.
License: BSD License 2.0
"""
__author__ = "Michael Gygli"
import tensorflow as tf
from StringIO import StringIO
import matplotlib.pyplot as plt
import numpy as np