After running the example scripts (see Kaldi tutorial), you may want to set up Kaldi to run with your own data. This section explains how to prepare the data.

This page will assume that you are using the latest version of the example scripts (typically named "s5" in the example directories, e.g. egs/rm/s5/). In addition to this page, you can refer to the data preparation scripts in those directories.

The top-level scripts (e.g. egs/rm/s5/run.sh) have a few commands at the top of them that relate to various phases of data preparation. The parts in the sub-directory named local/ are always specific to the database. For example, in the Resource Management (RM) setup it is local/rm_data_prep.sh. In the case of RM these commands are:

    local/rm_data_prep.sh /export/corpora5/LDC/LDC93S3A/rm_comp || exit 1
    utils/prepare_lang.sh data/local/dict '!SIL' data/local/lang data/lang || exit 1

In the WSJ case the commands are:

    wsj0=/export/corpora5/LDC/LDC93S6B
    local/wsj_data_prep.sh $wsj0/?-.? || exit 1
    utils/prepare_lang.sh data/local/dict "" data/local/lang_tmp data/lang || exit 1

There are more commands after these in the WSJ script that relate to training language models locally (rather than using the ones supplied by LDC), but the ones above are the most important ones.

The output of the data preparation stage consists of two sets of things. One relates to "the data" (directories like data/train/) and one relates to "the language" (directories like data/lang/).
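Since the data preparation stage is expected to leave behind both a "data" part and a "language" part, a quick check after running it can catch a failed step early. The following is a minimal sketch, not part of Kaldi: the function name check_prep_output is hypothetical, and it assumes the two outputs live in data/train and data/lang under the recipe directory, which may differ in your setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Kaldi): confirm that data preparation
# produced both kinds of output directory.
# Assumption: the outputs are named data/train and data/lang.
check_prep_output() {
  local root="${1:-.}"   # directory that contains data/ (defaults to the current one)
  local missing=0
  local dir
  for dir in data/train data/lang; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing directory: $root/$dir" >&2
      missing=1
    fi
  done
  # Returns 0 when both directories exist, 1 otherwise.
  return "$missing"
}
```

It could be called right after the preparation commands, in the same style as the scripts above: check_prep_output . || exit 1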