  • Machine Learning With Go
  • Daniel Whitenack
  • 2021-07-08 10:37:26

Handling unexpected fields

The preceding methods work well with clean CSV data, but in practice we rarely encounter clean data. We have to parse messy data. For example, a CSV record might contain an unexpected field or the wrong number of fields. This is why reader.FieldsPerRecord exists. This field of the reader value lets us validate the number of fields in each record as it is read. Consider the following data:

4.3,3.0,1.1,0.1,Iris-setosa
5.8,4.0,1.2,0.2,Iris-setosa
5.7,4.4,1.5,0.4,Iris-setosa
5.4,3.9,1.3,0.4,blah,Iris-setosa
5.1,3.5,1.4,0.3,Iris-setosa
5.7,3.8,1.7,0.3,Iris-setosa
5.1,3.8,1.5,0.3,Iris-setosa

This version of the iris.csv file has an extra field (blah) in the fourth row. We know that each record should have five fields, so let's set our reader.FieldsPerRecord value to 5:

// We should have 5 fields per line. By setting
// FieldsPerRecord to 5, we can validate that each of the
// rows in our CSV has the correct number of fields.
reader.FieldsPerRecord = 5

Then, as we read records from the CSV file, we can check for unexpected fields and maintain the integrity of our data:

// rawCSVData will hold our successfully parsed rows.
var rawCSVData [][]string

// Read in the records looking for unexpected numbers of fields.
for {

	// Read in a row. Check if we are at the end of the file.
	record, err := reader.Read()
	if err == io.EOF {
		break
	}

	// If we had a parsing error, log the error and move on.
	if err != nil {
		log.Println(err)
		continue
	}

	// Append the record to our dataset, if it has the expected
	// number of fields.
	rawCSVData = append(rawCSVData, record)
}

Here, we have chosen to handle the error by logging it, and we collect only the successfully parsed records into rawCSVData. This error could be handled in many other ways. The important thing is that we are forcing ourselves to check an expected property of the data, which increases the integrity of our application.
