
Skews

Now let's look at how the data for the house prices are distributed:

// plotHist draws a 10-bin histogram of a, normalized so the bar areas sum to 1.
func plotHist(a []float64) (*plot.Plot, error) {
	h, err := plotter.NewHist(plotter.Values(a), 10)
	if err != nil {
		return nil, err
	}
	p, err := plot.New()
	if err != nil {
		return nil, err
	}

	h.Normalize(1)
	p.Add(h)
	return p, nil
}

We then add the following to the main function:

hist, err := plotHist(YsBack)
mHandleErr(err)
hist.Title.Text = "Histogram of House Prices"
mHandleErr(hist.Save(25*vg.Centimeter, 25*vg.Centimeter, "hist.png"))

This produces the following diagram:

Histogram of House prices

As can be seen, the histogram of the prices is somewhat skewed. Fortunately, we can fix that by applying a function that adds 1 to the value and then takes its natural logarithm, that is, log(1 + x). The standard library provides a function for this: math.Log1p. So, we add the following to our main function:

for i := range YsBack {
	YsBack[i] = math.Log1p(YsBack[i])
}
hist2, err := plotHist(YsBack)
mHandleErr(err)
hist2.Title.Text = "Histogram of House Prices (Processed)"
mHandleErr(hist2.Save(25*vg.Centimeter, 25*vg.Centimeter, "hist2.png"))

This produces the following diagram:

Histogram of House Prices (Processed)

Ahh! This looks better. We did this for all the Ys. What about any of the Xs? To do that, we will have to iterate through each column of Xs, find out if they are skewed, and if they are, we need to apply the transformation function.

This is what we add to the main function:

it, err := native.MatrixF64(Xs)
mHandleErr(err)
for i, isCat := range datahints {
	if isCat {
		continue
	}
	skewness := skew(it, i)
	if skewness > 0.75 {
		log1pCol(it, i)
	}
}

native.MatrixF64 takes a *tensor.Dense and converts it into a native Go slice of slices ([][]float64). The underlying backing data is not copied; therefore, if one were to write it[0][0] = 1000, the actual matrix itself would change too. This allows us to perform transformations without additional allocations. For this project, that may not matter much; however, for larger projects, it comes in very handy.
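The aliasing behaviour described above can be illustrated with plain Go slices. This is only a sketch of the idea, written under the assumption that the rows returned by native.MatrixF64 are views into the tensor's flat backing array; rowsOf is a hypothetical helper and not part of any library.

```go
package main

import "fmt"

// rowsOf builds a [][]float64 whose rows are sub-slices of backing.
// No element is copied: every row shares memory with backing.
// (Hypothetical helper, assumed to resemble what native.MatrixF64 does.)
func rowsOf(backing []float64, rows, cols int) [][]float64 {
	out := make([][]float64, rows)
	for r := range out {
		out[r] = backing[r*cols : (r+1)*cols]
	}
	return out
}

func main() {
	backing := []float64{1, 2, 3, 4, 5, 6} // a 2x3 matrix in row-major order
	it := rowsOf(backing, 2, 3)

	it[0][0] = 1000         // write through the row view
	fmt.Println(backing[0]) // the flat slice sees the same write
}
```

The only allocation is the outer slice of row headers; the matrix elements themselves are never duplicated.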

This also allows us to write the functions to check and mutate the matrix:

// skew returns the skewness of a column/variable
func skew(it [][]float64, col int) float64 {
	// collect one value per row, taken from the requested column only
	a := make([]float64, 0, len(it))
	for _, row := range it {
		a = append(a, row[col])
	}
	return stat.Skew(a, nil)
}

// log1pCol applies the log1p transformation on a column
func log1pCol(it [][]float64, col int) {
	for i := range it {
		it[i][col] = math.Log1p(it[i][col])
	}
}