The Trainer class

In the Trainer class, we build a new pipeline to train our model. The FeaturizeText transform builds n-grams from the string data we previously extracted from the files. N-grams are a popular way to turn a string into a numeric vector that can, in turn, be fed to the model. You can think of n-grams as breaking a longer string into short runs of characters whose length is set by the n-gram parameter. Character bi-grams, for instance, slide a two-character window over the sentence ML.NET is great, producing ML, L., .N, NE, ET, and so on. Lastly, we build the SdcaLogisticRegression trainer object:

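// Copy the label into "Label", featurize the extracted strings into "NGrams",
// and expose that column as "Features" for the trainer.
var dataProcessPipeline = MlContext.Transforms.CopyColumns("Label", nameof(FileInput.Label))
    .Append(MlContext.Transforms.Text.FeaturizeText("NGrams", nameof(FileInput.Strings)))
    .Append(MlContext.Transforms.Concatenate("Features", "NGrams"));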

var trainer = MlContext.BinaryClassification.Trainers.SdcaLogisticRegression(labelColumnName: "Label", featureColumnName: "Features");
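
To make the n-gram idea concrete, here is a minimal sketch that extracts character n-grams using the usual overlapping-window definition. It is not how FeaturizeText is implemented: the real transform also normalizes and tokenizes the text and outputs numeric vectors rather than strings. The NGramSketch class and CharNGrams method are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of ML.NET.
public static class NGramSketch
{
    // Returns every contiguous run of n characters in text, in order.
    public static List<string> CharNGrams(string text, int n)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n));

        var grams = new List<string>();
        // Slide a window of width n one character at a time.
        for (int i = 0; i + n <= text.Length; i++)
        {
            grams.Add(text.Substring(i, n));
        }
        return grams;
    }

    public static void Main()
    {
        // Bi-grams of the sample sentence, separated by '|'.
        Console.WriteLine(string.Join("|", CharNGrams("ML.NET is great", 2)));
    }
}
```

A 15-character sentence yields 14 bi-grams, starting with ML, L., .N, NE, and ET. A featurizer would then typically count how often each n-gram occurs to build the vector the trainer consumes.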
To dive deeper into the Transforms Catalog API, see Microsoft's documentation: https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.transformscatalog?view=ml-dotnet.