
How to do it...

The strategy that we defined previously is coded as follows (refer to the Audio classification.ipynb file on GitHub while implementing the code):

  1. Import the dataset:
import pandas as pd
data = pd.read_csv('/content/train.csv')
  2. Extract features for each audio input:
import librosa
import numpy as np

ids = data['ID'].values

def extract_feature(file_name):
    X, sample_rate = librosa.load(file_name)
    mfccs = np.mean(librosa.feature.mfcc(y=X, sr=sample_rate, n_mfcc=40).T, axis=0)
    return mfccs

In the preceding code, we defined a function that takes file_name as input, extracts the 40 MFCCs (Mel-frequency cepstral coefficients) corresponding to the audio file, averages them over time, and returns the resulting feature vector.
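The averaging step can be illustrated without an audio file. A minimal sketch, using a random matrix as a stand-in for the output of librosa.feature.mfcc (which returns an (n_mfcc, n_frames) matrix):

```python
import numpy as np

# Synthetic stand-in for librosa.feature.mfcc output:
# 40 coefficients x 173 time frames (frame count is arbitrary).
np.random.seed(0)
mfcc_matrix = np.random.randn(40, 173)

# Transposing to (n_frames, n_mfcc) and averaging over axis 0 collapses
# the time dimension, leaving a single 40-dimensional vector per clip,
# regardless of the clip's duration.
feature_vector = np.mean(mfcc_matrix.T, axis=0)
print(feature_vector.shape)  # (40,)
```

This is what makes clips of different lengths comparable: every file is reduced to the same fixed-length 40-dimensional input.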

  3. Create the input and the output datasets:
x = []
y = []
for i in range(len(ids)):
    try:
        filename = '/content/Train/' + str(ids[i]) + '.wav'
        x.append(extract_feature(filename))
        y.append(data[data['ID']==ids[i]]['Class'].values)
    except Exception:
        continue
x = np.array(x)

In the preceding code, we loop through the audio files one at a time, extracting the features of each file and storing them in the input list; similarly, we store the output class in the output list. Note that the features are appended before the class label, so that a file that fails to load does not leave the two lists out of sync. Additionally, we convert the output list into categorical values that are one-hot-encoded:

y2 = []
for i in range(len(y)):
    y2.append(y[i][0])
y3 = np.array(pd.get_dummies(y2))

The pd.get_dummies method works much like the to_categorical method we used earlier; however, to_categorical does not work on text classes (it works on numeric values only, which it converts to one-hot-encoded values), whereas pd.get_dummies handles string labels directly.
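A minimal sketch of this difference, using hypothetical class names (the actual class names come from the train.csv file):

```python
import numpy as np
import pandas as pd

# pd.get_dummies one-hot-encodes string labels directly; the columns are
# ordered alphabetically (dog_bark, drilling, siren in this example).
labels = ['siren', 'dog_bark', 'siren', 'drilling']  # hypothetical classes
one_hot = np.array(pd.get_dummies(labels))
print(one_hot.shape)  # (4, 3): one row per sample, one column per class
```

Passing the same string labels to keras.utils.to_categorical would raise an error, which is why get_dummies is the convenient choice here.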

  4. Build the model and compile it:
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

model = Sequential()
model.add(Dense(1000, input_shape=(40,), activation='relu'))
model.add(Dense(10, activation='softmax'))
adam = Adam(lr=0.0001)
model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['acc'])

The summary of the preceding model (printed by calling model.summary()) is as follows:
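The parameter counts that model.summary() reports can be verified by hand, since a Dense layer has (inputs × units + units) trainable parameters (weights plus biases):

```python
# First Dense layer: 40 inputs feeding 1,000 units.
hidden_params = 40 * 1000 + 1000    # 41,000 parameters
# Output Dense layer: 1,000 inputs feeding 10 units.
output_params = 1000 * 10 + 10      # 10,010 parameters
total_params = hidden_params + output_params
print(total_params)  # 51010
```

This back-of-the-envelope check is a quick way to confirm that the input shape and layer sizes are what you intended.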

  5. Create the train and test datasets, and then fit the model:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(x, y3, test_size=0.30, random_state=10)
model.fit(X_train, y_train, epochs=100, batch_size=32, validation_data=(X_test, y_test), verbose=1)

Once the model is fitted, you will notice that it classifies about 91% of the audio clips into the right class.
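The 'acc' metric for one-hot targets compares the argmax of the predicted probabilities against the argmax of the true labels. A small sketch with hypothetical predictions for three clips over three classes:

```python
import numpy as np

# Hypothetical softmax outputs for three test clips (rows sum to 1).
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.4, 0.3]])
# Corresponding one-hot ground-truth labels.
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0]])

# Accuracy = fraction of clips whose highest-probability class matches
# the true class: here the third clip is misclassified, so 2 of 3.
accuracy = np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_true, axis=1))
print(accuracy)
```

This is the same computation Keras performs per batch when you pass metrics=['acc'] with a categorical crossentropy loss.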
