官术网_书友最值得收藏!

Handling unexpected types

We just saw that CSV data is read into Go as [][]string. However, Go is statically typed, which allows us to enforce strict checks for each of the CSV fields. We can do this as we parse each field for further processing. Consider some messy data that has random fields that don't match the type of the other values in a column:

4.6,3.1,1.5,0.2,Iris-setosa
5.0,string,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
5.3,3.7,1.5,0.2,Iris-setosa
5.0,3.3,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,
6.9,3.1,4.9,1.5,Iris-versicolor
5.5,2.3,4.0,1.3,Iris-versicolor
4.9,3.1,1.5,0.1,Iris-setosa
5.0,3.2,1.2,string,Iris-setosa
5.5,3.5,1.3,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
4.4,3.0,1.3,0.2,Iris-setosa

To check the types of the fields in our CSV records, let's create a struct variable to hold successfully parsed values:

// CSVRecord contains a successfully parsed row of the CSV file.
type CSVRecord struct {
SepalLength float64
SepalWidth float64
PetalLength float64
PetalWidth float64
Species string
ParseError error
}

Then, before we loop over the records, let's initialize a slice of these values:

// Create a slice value that will hold all of the successfully parsed
// records from the CSV.
var csvData []CSVRecord

Now as we loop over the records, we can parse into the relevant type for that record, catch any errors, and log as needed:


// Read in the records looking for unexpected types.
for {

// Read in a row. Check if we are at the end of the file.
record, err := reader.Read()
if err == io.EOF {
break
}

// Create a CSVRecord value for the row.
var csvRecord CSVRecord

// Parse each of the values in the record based on an expected type.
for idx, value := range record {

// Parse the value in the record as a string for the string column.
if idx == 4 {

// Validate that the value is not an empty string. If the
// value is an empty string break the parsing loop.
if value == "" {
log.Printf("Unexpected type in column %d\n", idx)
csvRecord.ParseError = fmt.Errorf("Empty string value")
break
}

// Add the string value to the CSVRecord.
csvRecord.Species = value
continue
}

// Otherwise, parse the value in the record as a float64.
var floatValue float64

// If the value can not be parsed as a float, log and break the
// parsing loop.
if floatValue, err = strconv.ParseFloat(value, 64); err != nil {
log.Printf("Unexpected type in column %d\n", idx)
csvRecord.ParseError = fmt.Errorf("Could not parse float")
break
}

// Add the float value to the respective field in the CSVRecord.
switch idx {
case 0:
csvRecord.SepalLength = floatValue
case 1:
csvRecord.SepalWidth = floatValue
case 2:
csvRecord.PetalLength = floatValue
case 3:
csvRecord.PetalWidth = floatValue
}
}

// Append successfully parsed records to the slice defined above.
if csvRecord.ParseError == nil {
csvData = append(csvData, csvRecord)
}
}
主站蜘蛛池模板: 界首市| 嘉鱼县| 密山市| 焦作市| 余干县| 庆安县| 大洼县| 黄龙县| 文登市| 澄城县| 仁寿县| 井冈山市| 宽城| 修武县| 罗城| 尉氏县| 榆林市| 昆山市| 隆昌县| 呼伦贝尔市| 枣强县| 长垣县| 舟曲县| 分宜县| 日照市| 遂溪县| 江北区| 西乡县| 紫云| 绍兴市| 东宁县| 云和县| 海城市| 金寨县| 邯郸县| 利津县| 成武县| 青河县| 白银市| 兰考县| 长岭县|