
Classifying common audio

In the previous sections, we covered how to model structured datasets and unstructured text data.

In this section, we will learn how to perform classification when the input is raw audio.

Our strategy is to extract features from the input audio so that each audio signal is represented as a vector with a fixed number of features.

There are multiple ways to extract features from audio; for this exercise, we will extract the Mel Frequency Cepstral Coefficients (MFCC) of each audio file.
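As a rough illustration of how an audio clip of any length can become a fixed-size vector, the sketch below averages an MFCC matrix over its time axis. It assumes the librosa library for loading audio and computing MFCCs; the function names `pool_mfcc` and `extract_features`, the choice of 40 coefficients, and the use of mean pooling are illustrative assumptions, not details given in this section.

```python
import numpy as np


def pool_mfcc(mfcc):
    """Collapse an (n_mfcc, n_frames) MFCC matrix into a length-n_mfcc vector."""
    mfcc = np.asarray(mfcc, dtype=float)
    # Averaging over the time axis removes the dependence on clip length.
    return mfcc.mean(axis=1)


def extract_features(path, n_mfcc=40):
    """Load one audio file and return a fixed-length MFCC feature vector.

    n_mfcc=40 is an assumed value; any fixed number of coefficients works.
    """
    import librosa  # imported lazily so pool_mfcc can be used without librosa

    signal, sample_rate = librosa.load(path, sr=None)  # keep the native sample rate
    mfcc = librosa.feature.mfcc(y=signal, sr=sample_rate, n_mfcc=n_mfcc)
    return pool_mfcc(mfcc)


# Stand-in MFCC matrices for two clips of different lengths (random data,
# used only to show that the pooled vectors have the same size).
short_clip = np.random.default_rng(0).normal(size=(40, 50))
long_clip = np.random.default_rng(1).normal(size=(40, 173))
short_vec = pool_mfcc(short_clip)
long_vec = pool_mfcc(long_clip)
```

Both `short_vec` and `long_vec` have 40 entries even though the clips differ in frame count, which is what lets every audio file feed the same input layer. Mean pooling discards temporal ordering; it is one simple option among several.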

Once the features are extracted, we will perform classification in much the same way as we built the model for MNIST classification, where hidden layers connected the input and output layers.

In the following section, we will classify an audio dataset that has ten possible output classes.
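To make the shape of such a classifier concrete, here is a minimal NumPy sketch of the forward pass through a network with one hidden layer and a ten-way softmax output. The 40 input features, the 64 hidden units, the ReLU activation, and all function names are assumptions chosen for illustration; the weights are random and untrained, and in practice a deep learning framework would handle both the layers and the training.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def softmax(logits):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def init_params(n_features=40, n_hidden=64, n_classes=10, seed=0):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(scale=0.1, size=(n_features, n_hidden)),
        "b1": np.zeros(n_hidden),
        "w2": rng.normal(scale=0.1, size=(n_hidden, n_classes)),
        "b2": np.zeros(n_classes),
    }


def forward(params, features):
    """Map a (batch, n_features) array to (batch, n_classes) probabilities."""
    hidden = relu(features @ params["w1"] + params["b1"])
    return softmax(hidden @ params["w2"] + params["b2"])


params = init_params()
# Stand-in feature vectors for five audio clips (random data).
batch = np.random.default_rng(1).normal(size=(5, 40))
probs = forward(params, batch)
predicted = probs.argmax(axis=1)  # index of the most probable class per clip
```

Each row of `probs` sums to one, and `predicted` holds a class index between 0 and 9 for each clip. Because the weights are untrained, the predictions are meaningless here; the sketch only shows how a fixed-length feature vector flows through hidden and output layers.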
