- Chinese Journal of Phonetics (中國語音學報), Vol. 11
- Sponsored by the Institute of Linguistics, Chinese Academy of Social Sciences
- 2021-01-06 19:11:09
xSegmenter: A Tool for Automatic Segmentation and Annotation of Speech Segments
XIONG Ziyu (熊子瑜)
Abstract xSegmenter, a tool for the automatic segmentation and annotation of speech segments, is written in the Perl scripting language. It is intended mainly for building relatively large speech corpora that come with text scripts, and aims to address the efficiency and consistency problems of segment annotation. The program calls the HTK toolkit to train HMM acoustic models from the speech data and related resources supplied by the user, then performs forced alignment of the segments, and finally generates a speech annotation file (*.TextGrid) for each sound file, containing word, syllable, and phoneme tiers. The tool itself ships with no acoustic models or dictionaries. Instead, it trains the acoustic models automatically from user-supplied sound files, pronunciation text files with word segmentation information, and pronunciation dictionary files, and then uses those models to carry out automatic segmentation and annotation. As a result, it can be applied to building speech corpora for any language or dialect.
Key words: xSegmenter; speech segmentation; speech corpus
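Manual segmentation and annotation of speech segments is time-consuming and laborious, yet careful and systematic segmentation and annotation is indispensable for building and using speech corpora. This paper describes how to use xSegmenter, a tool for automatic segment segmentation and annotation developed by the author, and the points to note when using it.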
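The last step of the pipeline summarized in the abstract, turning forced-alignment output into a TextGrid tier, can be sketched as follows. This is a minimal illustration in Python and not xSegmenter's own Perl code. It assumes the alignment is available as HTK-style label lines (`start end label`, with times in 100 ns units) and writes Praat's long TextGrid text format. The function names `parse_htk_labels` and `to_textgrid` and the tier name `phoneme` are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float, str]  # (start_sec, end_sec, label)

HTK_UNITS_PER_SECOND = 10_000_000  # HTK label times are in 100 ns units


def parse_htk_labels(text: str) -> List[Interval]:
    """Parse 'start end label' lines into intervals measured in seconds."""
    intervals = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        start, end, label = int(parts[0]), int(parts[1]), parts[2]
        intervals.append(
            (start / HTK_UNITS_PER_SECOND, end / HTK_UNITS_PER_SECOND, label)
        )
    return intervals


def to_textgrid(tiers: Dict[str, List[Interval]], xmax: float) -> str:
    """Serialize {tier_name: intervals} as a long-format Praat TextGrid."""
    out = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        '',
        'xmin = 0',
        f'xmax = {xmax}',
        'tiers? <exists>',
        f'size = {len(tiers)}',
        'item []:',
    ]
    for i, (name, intervals) in enumerate(tiers.items(), start=1):
        out += [
            f'    item [{i}]:',
            '        class = "IntervalTier"',
            f'        name = "{name}"',
            '        xmin = 0',
            f'        xmax = {xmax}',
            f'        intervals: size = {len(intervals)}',
        ]
        for j, (start, end, label) in enumerate(intervals, start=1):
            text = label.replace('"', '""')  # Praat escapes quotes by doubling
            out += [
                f'        intervals [{j}]:',
                f'            xmin = {start}',
                f'            xmax = {end}',
                f'            text = "{text}"',
            ]
    return '\n'.join(out) + '\n'


if __name__ == "__main__":
    # Toy alignment for illustration only; not real data.
    phones = parse_htk_labels("0 2500000 n\n2500000 6000000 i\n")
    print(to_textgrid({"phoneme": phones}, xmax=0.6))
```

Under the same assumptions, word and syllable tiers would be built the same way by passing additional entries in the `tiers` dictionary. Note that this sketch does not fill unlabeled gaps with empty intervals.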