Decoding-graph creation recipe (training time)

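Our graph creation recipe at training time is simpler than the one used at test time, mainly because we do not need disambiguation symbols.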

We assume that you have already read the test-time version of this recipe.

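At training time we use the same HCLG formalism as at test time, except that G is just a linear acceptor corresponding to the training transcript (this setup is, of course, easy to extend to cases where there is uncertainty in the transcripts).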

Command-line programs involved in decoding-graph creation

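Here we explain what happens at the script level in a typical training script; further below we look at what happens inside the programs. Suppose we have already built a tree and a model. The following command creates an archive (c.f. The Table concept) containing the graph HCLG for each of the training transcripts.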

compile-train-graphs $dir/tree $dir/1.mdl data/L.fst ark:data/train.tra \
  ark:$dir/graphs.fsts

The input file train.tra contains the training transcripts in integer form; a typical line might look like this:

011c0201 110906 96419 79214 110906 52026 55810 82385 79214 51250 106907 111943 99519 79220
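As a small illustration of how such a line is laid out (an utterance id followed by integer word ids), here is a minimal Python sketch. The function name `parse_transcript_line` is hypothetical and is not part of Kaldi; it only assumes whitespace-separated text-form transcripts like the line above.

```python
def parse_transcript_line(line):
    """Split one text-form transcript line into (utterance_id, word_ids).

    Assumes the layout shown above: the first whitespace-separated token
    is the utterance id, and every remaining token is an integer word id.
    """
    fields = line.split()
    if not fields:
        raise ValueError("empty transcript line")
    utt_id = fields[0]
    word_ids = [int(tok) for tok in fields[1:]]
    return utt_id, word_ids


if __name__ == "__main__":
    example = ("011c0201 110906 96419 79214 110906 52026 55810 82385 "
               "79214 51250 106907 111943 99519 79220")
    utt_id, word_ids = parse_transcript_line(example)
    print(utt_id, len(word_ids), word_ids[:3])
```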

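The first token is the utterance id. The output of the program is the archive graphs.fsts, which contains one FST (in binary form) for each utterance in train.tra. Each FST corresponds to HCLG, except that it has no transition probabilities (by default, compile-train-graphs uses --self-loop-scale=0 and --transition-scale=0). This is because the graphs are used over multiple stages of training, during which the transition probabilities change, so we add them in later. The FSTs do, however, carry the probabilities that arise from the silence probabilities (these are encoded in L.fst), and if we were using pronunciation probabilities, those would be present too. A command that reads this archive and decodes the training data, producing state-level alignments (c.f. Alignments in Kaldi), is shown next; we review it only briefly, since the focus of this page is graph creation itself.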

gmm-align-compiled \
 --transition-scale=1.0 --acoustic-scale=0.1 --self-loop-scale=0.1 \
  --beam=8 --retry-beam=40 \
   $dir/$x.mdl ark:$dir/graphs.fsts \
  "ark:add-deltas --print-args=false scp:data/train.scp ark:- |" \
   ark:$dir/cur.ali

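The first three options are probability scales (c.f. Scaling of transition and acoustic probabilities). The transition-scale and self-loop-scale options are given here because this program adds the transition probabilities to the graph before decoding (c.f. Adding transition probabilities to FSTs). The next options are the beams: we use an initial beam, and if alignment fails to reach a final state with it, we retry with the other, wider beam. Because we apply an acoustic scale of 0.1, these beams would have to be multiplied by 10 to obtain figures comparable to acoustic likelihoods. The program requires the model; $x.mdl expands to 1.mdl, 2.mdl and so on, depending on the iteration. It reads the graphs from the archive (graphs.fsts). The argument in quotes is evaluated as a pipe (minus the "ark:" and "|") and is interpreted as an archive, indexed by utterance id, that contains the features. The output goes to cur.ali; written in text form it would look like the .tra file described above, although the integers now correspond to transition-ids rather than word-ids (c.f. Integer identifiers used by TransitionModel).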
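The beam arithmetic mentioned above can be checked with a few lines of Python. This is only a sketch of the unit conversion; `beam_in_unscaled_units` is a hypothetical helper, not a Kaldi function, and it assumes the beam is expressed in costs that already include the acoustic scale.

```python
def beam_in_unscaled_units(beam, acoustic_scale):
    """Express a beam given in scaled units in unscaled acoustic units.

    Dividing by the acoustic scale (0.1 in the command above) is the same
    as multiplying by 10, which is the conversion described in the text.
    """
    if acoustic_scale <= 0:
        raise ValueError("acoustic_scale must be positive")
    return beam / acoustic_scale


if __name__ == "__main__":
    # Values taken from the gmm-align-compiled command line above.
    for name, beam in (("beam", 8.0), ("retry-beam", 40.0)):
        print(name, beam_in_unscaled_units(beam, 0.1))
```

With the values from the command above, the two beams come out at roughly 80 and 400 in unscaled units.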

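Note that these two stages (graph creation and decoding) can also be done by a single command-line program, gmm-align, which compiles the graphs as they are needed. Because graph creation takes a fairly significant amount of time, we write the graphs to disk whenever they are needed more than once.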

Internals of graph creation

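The graph creation done at training time, whether from the program compile-train-graphs or from gmm-align, is carried out by the same code: the class TrainingGraphCompiler. When compiling graphs one by one, it behaves as follows.

In initialization:

Add the subsequential loops (c.f. Making the context transducer) to the lexicon L.fst, make sure it is sorted on its output label, and store it.

When compiling a transcript, it does the following:

Make a linear acceptor corresponding to the word sequence (G).
Compose it (using TableCompose) with the lexicon to get an FST (LG) with phones on the input and words on the output; this FST contains the pronunciation alternatives and optional silence.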
Create a new context FST "C" that will be expanded on demand.
Compose C with LG to get CLG; the composition uses a special Matcher that expands C on demand.
Call the function GetHTransducer to get the transducer H (this covers just the context-dependent phones that were seen).
Compose H with CLG to get HCLG (this will have transition-ids on the input and words on the output, but no self-loops)
Determinize HCLG with the function DeterminizeStarInLog; this casts to the log semiring (to preserve stochasticity) before determinizing with epsilon removal.
Minimize HCLG with MinimizeEncoded (this does transducer minimization without weight-pushing, to preserve stochasticity).
Add self-loops.
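The first step in the list above, building a linear acceptor for the word sequence, can be sketched in a few lines of Python. This is a toy representation for illustration only: the dictionary layout and the name `make_linear_acceptor` are assumptions made here, not Kaldi's or OpenFst's data structures.

```python
def make_linear_acceptor(word_ids):
    """Build a toy linear acceptor for a sequence of integer word ids.

    States are numbered 0..len(word_ids). Each arc is a tuple
    (source_state, input_label, output_label, destination_state); in an
    acceptor the input and output labels are identical. State 0 is the
    start state and the last state is the only final state.
    """
    arcs = [(i, w, w, i + 1) for i, w in enumerate(word_ids)]
    return {"start": 0, "final": len(word_ids), "arcs": arcs}


if __name__ == "__main__":
    acceptor = make_linear_acceptor([110906, 96419, 79214])
    for arc in acceptor["arcs"]:
        print(arc)
    print("final state:", acceptor["final"])
```

Such a graph has exactly one path, which is why G contributes no ambiguity of its own at training time.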

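The TrainingGraphCompiler class has a function CompileGraphs() that compiles a number of graphs together in a batch. The tool compile-train-graphs uses it to speed up graph compilation. The main reason batching helps is that, when creating the H transducer, we only have to process each context window that was seen once, even if it is used in many of the instances of CLG.

The reason we can determinize without first adding disambiguation symbols is that in this case HCLG is functional (because any accepted input-label sequence transduces to the same string) and acyclic; any acyclic FST has the twins property, and any functional FST with the twins property is determinizable.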
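To make the batching argument concrete, the following toy Python sketch counts how many context windows occur in a batch of phone sequences versus how many distinct ones there are. It is not Kaldi's implementation: the function names, the window width of 3, and the use of 0 as a boundary padding symbol are all assumptions chosen for illustration.

```python
def context_windows(phones, width=3, pad=0):
    """Return one context window (a tuple of `width` phones) per phone.

    Assumes an odd `width`, with each window centred on its phone; the
    sequence boundaries are padded with the placeholder symbol `pad`.
    """
    half = width // 2
    padded = [pad] * half + list(phones) + [pad] * half
    return [tuple(padded[i:i + width]) for i in range(len(phones))]


def count_windows(utterances, width=3):
    """Return (total windows, distinct windows) over a batch of utterances."""
    total = 0
    unique = set()
    for phones in utterances:
        windows = context_windows(phones, width)
        total += len(windows)
        unique.update(windows)
    return total, len(unique)


if __name__ == "__main__":
    # Three made-up phone sequences sharing most of their contexts.
    batch = [[1, 2, 3], [1, 2, 3], [1, 2, 4]]
    total, distinct = count_windows(batch)
    print("windows in batch:", total, "distinct windows:", distinct)
```

The gap between the two counts is the work that a per-batch H transducer avoids repeating.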