The conditional expectation functions

Instead, let's do what we originally set out to do: explore the CEFs of the variables. Fortunately, we already have the necessary data structures (in other words, the index), so writing the function to find the CEF is relatively easy.

The following is the code block:

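// CEF computes the conditional expectation of Ys for each distinct value
// of the column col. index[col] maps each value to the rows holding it.
func CEF(Ys []float64, col int, index []map[string][]int) map[string]float64 {
	retVal := make(map[string]float64)
	for k, v := range index[col] {
		// average Ys over the rows in which the column takes the value k
		var mean float64
		for _, i := range v {
			mean += Ys[i]
		}
		mean /= float64(len(v))
		retVal[k] = mean
	}
	return retVal
}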

This function finds the conditionally expected house price when a variable is held fixed: for each distinct value of the chosen column, it averages the house prices of the rows that take that value. We could explore all the variables this way, but for the purposes of this chapter, I shall share the exploration of only one, the YearBuilt variable, as an example.
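To make the averaging concrete, here is a minimal sketch that runs CEF on toy data. The five prices and the single-column index are hypothetical values made up for illustration; only the CEF function itself comes from the text.

```go
package main

import "fmt"

// CEF is reproduced from the text: for each distinct value of column col,
// it averages Ys over the rows listed in the index for that value.
func CEF(Ys []float64, col int, index []map[string][]int) map[string]float64 {
	retVal := make(map[string]float64)
	for k, v := range index[col] {
		var mean float64
		for _, i := range v {
			mean += Ys[i]
		}
		mean /= float64(len(v))
		retVal[k] = mean
	}
	return retVal
}

func main() {
	// Hypothetical toy data: five house prices and one indexed column.
	// Rows 0 and 1 hold the value "1950"; rows 2, 3 and 4 hold "2000".
	Ys := []float64{100, 200, 300, 400, 500}
	index := []map[string][]int{
		{"1950": {0, 1}, "2000": {2, 3, 4}},
	}

	cef := CEF(Ys, 0, index)
	fmt.Println(cef["1950"], cef["2000"]) // prints "150 400"
}
```

The first group averages 100 and 200, and the second averages 300, 400 and 500, which is all the CEF amounts to here: one mean per distinct value of the column.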

Now, YearBuilt is an interesting variable to dive deep into. It's a categorical variable (1950.5 makes no sense), but it's also fully orderable (1945 comes before 1950). And there are many values of YearBuilt. So, instead of printing the CEF out, we shall plot it with the following function:

// plotCEF plots the CEF. This is a simple plot with only the CEF.
// More advanced plots can also be drawn to expose more nuance in the data.
func plotCEF(m map[string]float64) (*plot.Plot, error) {
	// collect the keys and sort them so that the points are plotted in order
	ordered := make([]string, 0, len(m))
	for k := range m {
		ordered = append(ordered, k)
	}
	sort.Strings(ordered)

	p, err := plot.New()
	if err != nil {
		return nil, err
	}

	points := make(plotter.XYs, len(ordered))
	for i, val := range ordered {
		// if val can be converted into a float, we'll use it
		// otherwise, we'll stick with using the index
		points[i].X = float64(i)
		if x, err := strconv.ParseFloat(val, 64); err == nil {
			points[i].X = x
		}

		points[i].Y = m[val]
	}
	if err := plotutil.AddLinePoints(p, "CEF", points); err != nil {
		return nil, err
	}
	return p, nil
}
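One detail worth noting about the key ordering: sort.Strings sorts lexically, which matches numeric order for four-digit years such as YearBuilt, but not for numeric keys of differing lengths ("10" sorts before "9"). The following is a sketch of one possible alternative, not part of the original program; orderedKeys is a hypothetical helper that sorts numerically when every key parses as a number and falls back to lexical order otherwise.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// orderedKeys is a hypothetical helper (not from the original program).
// It returns the keys of m sorted numerically if every key parses as a
// float, and sorted lexically otherwise.
func orderedKeys(m map[string]float64) []string {
	keys := make([]string, 0, len(m))
	nums := make(map[string]float64, len(m))
	numeric := true
	for k := range m {
		keys = append(keys, k)
		x, err := strconv.ParseFloat(k, 64)
		if err != nil {
			numeric = false
			continue
		}
		nums[k] = x
	}
	if numeric {
		// compare by parsed value rather than by string
		sort.Slice(keys, func(i, j int) bool {
			return nums[keys[i]] < nums[keys[j]]
		})
	} else {
		sort.Strings(keys)
	}
	return keys
}

func main() {
	m := map[string]float64{"9": 1, "10": 2, "100": 3}
	fmt.Println(orderedKeys(m)) // prints "[9 10 100]"
}
```

Since the line plot connects points in slice order, numeric ordering would keep the line from zigzagging for such columns. For YearBuilt the two orderings agree, so the original function plots it correctly as written.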

Our ever-growing main function now has this appended to it:

ofInterest := 19 // variable of interest is in column 19
cef := CEF(YsBack, ofInterest, indices)
plt, err := plotCEF(cef)
mHandleErr(err)
plt.Title.Text = fmt.Sprintf("CEF for %v", hdr[ofInterest])
plt.X.Label.Text = hdr[ofInterest]
plt.Y.Label.Text = "Conditionally Expected House Price"
mHandleErr(plt.Save(25*vg.Centimeter, 25*vg.Centimeter, "CEF.png"))

Running the program yields the following chart:

Figure: Conditional expectation function for YearBuilt

Upon inspecting the chart, I must confess that I was a little surprised. I'm not particularly familiar with real estate, but my initial instincts were that older houses would cost more—houses, in my mind, age like fine wine; the older the house, the more expensive it would be. Clearly this is not the case. Oh well, live and learn.

The CEF exploration should be done for as many variables as possible. I am merely eliding the rest for the sake of brevity in this book.
