
TensorForest Estimator

TensorForest is a highly scalable implementation of random forests, built by combining a variety of online Hoeffding tree algorithms with the extremely randomized trees approach.

Google published the details of the TensorForest implementation in the following paper: TensorForest: Scalable Random Forests on TensorFlow by Thomas Colthurst, D. Sculley, Gilbert Hendry, and Zack Nado, presented at the Machine Learning Systems Workshop at the Conference on Neural Information Processing Systems (NIPS) 2016. The paper is available at the following link: https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxtbHN5c25pcHMyMDE2fGd4OjFlNTRiOWU2OGM2YzA4MjE.

TensorForest estimators implement the following algorithm:

Initialize the variables and sets:
    Tree = [root]
    Fertile = {root}
    Stats(root) = 0
    Splits[root] = []

Repeat:
    Divide training data into batches.
    For each batch of training data:
        Compute the leaf assignment for each feature vector
        Update the leaf stats in Stats
        For each leaf in the Fertile set:
            if |Splits[leaf]| < max_splits
                then add a split on a randomly selected feature to Splits[leaf]
            else if the leaf is fertile and |Splits[leaf]| = max_splits
                then update the split stats for the leaf
        Calculate the fertile leaves that are finished.
        For every non-stale finished leaf:
            turn the leaf into an internal node with its best scoring split
            remove the leaf from Fertile
            add the leaf's two children to Tree as leaves
        If |Fertile| < max_fertile
            then add the max_fertile - |Fertile| leaves with
            the highest weighted leaf scores to Fertile and
            initialize their Splits and split statistics.
Until |Tree| = max_nodes or |Tree| stays the same for max_batches_to_grow batches
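
The steps above can be sketched for a single tree in plain Python. This is only an illustrative sketch, not the TensorFlow implementation. The names Node, find_leaf, split_cost, train_tree, predict, and the finish_after parameter are hypothetical. It also makes several simplifying assumptions: splits are scored with Gini impurity, a leaf counts as finished once its candidate splits have seen finish_after examples, the weighted leaf score is taken to be the example count times the Gini impurity, and staleness is not modeled.

```python
import random
from collections import Counter

def gini(counts):
    """Gini impurity of a class-count mapping (0.0 when empty or pure)."""
    n = sum(counts.values())
    return 0.0 if n == 0 else 1.0 - sum((c / n) ** 2 for c in counts.values())

class Node:
    def __init__(self, stats=None):
        self.stats = Counter(stats or {})     # Stats: class counts seen at this node
        self.splits = []                      # Splits: (feature, threshold, left, right)
        self.feature = self.threshold = None  # set once the leaf becomes internal
        self.left = self.right = None

def find_leaf(root, x):
    """Route feature vector x from the root down to a leaf."""
    node = root
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node

def split_cost(split):
    """Size-weighted Gini impurity of the two sides (lower is better)."""
    _, _, left, right = split
    return sum(left.values()) * gini(left) + sum(right.values()) * gini(right)

def train_tree(batches, max_nodes=15, max_splits=5, max_fertile=4,
               finish_after=20, max_batches_to_grow=10, seed=0):
    """Grow one tree from (xs, ys) batches; assumes max_splits >= 1."""
    rng = random.Random(seed)
    batches = list(batches)
    root = Node()
    tree, fertile, idle = [root], [root], 0
    while batches:
        for xs, ys in batches:
            size_before = len(tree)
            for x, y in zip(xs, ys):
                leaf = find_leaf(root, x)
                leaf.stats[y] += 1
                if leaf not in fertile:
                    continue
                if len(leaf.splits) < max_splits:
                    # New candidate: random feature, threshold from this example.
                    f = rng.randrange(len(x))
                    leaf.splits.append((f, x[f], Counter(), Counter()))
                else:
                    for f, t, left, right in leaf.splits:
                        (left if x[f] <= t else right)[y] += 1
            for leaf in list(fertile):
                if len(tree) + 2 > max_nodes:
                    break
                if len(leaf.splits) < max_splits:
                    continue
                first = leaf.splits[0]
                if sum(first[2].values()) + sum(first[3].values()) < finish_after:
                    continue                  # not finished yet (assumed criterion)
                usable = [s for s in leaf.splits if s[2] and s[3]]
                leaf.splits = []
                if not usable:
                    continue                  # draw fresh candidates next batch
                f, t, left, right = min(usable, key=split_cost)
                leaf.feature, leaf.threshold = f, t
                leaf.left, leaf.right = Node(left), Node(right)
                fertile.remove(leaf)
                tree += [leaf.left, leaf.right]
            # Top up Fertile with the highest-scoring non-fertile leaves.
            spare = [n for n in tree if n.feature is None and n not in fertile]
            spare.sort(key=lambda n: sum(n.stats.values()) * gini(n.stats), reverse=True)
            fertile += spare[:max(0, max_fertile - len(fertile))]
            idle = 0 if len(tree) > size_before else idle + 1
            if len(tree) + 2 > max_nodes or idle >= max_batches_to_grow:
                return root, tree
    return root, tree

def predict(root, x):
    """Majority class stored at the leaf that x falls into."""
    return find_leaf(root, x).stats.most_common(1)[0][0]
```

Here batches is expected to be a list of (feature_vectors, labels) pairs, and train_tree returns the root together with the flat list of nodes. A forest in this style would train several such trees with different seeds and combine their predictions.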

More details on the implementation of this algorithm can be found in the TensorForest paper.
