  • Machine Learning With Go
  • Daniel Whitenack
  • 201 words
  • 2021-07-08 10:37:30

Putting data into data repositories

Let's say that we have a simple text file:

$ cat blah.txt 
This is an example file.

If this file is part of the data we are using in our ML workflow, we should version it. To version this file in our repository, myrepo, we just need to commit it to that repository:

$ pachctl put-file myrepo master -c -f blah.txt 

The -c flag specifies that we want Pachyderm to open a new commit, insert the file we are referencing, and close the commit all in one shot. The -f flag specifies that we are providing a file.
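Because this book works in Go, it may help to see the same command driven from a Go program. The following is only a minimal sketch that shells out to pachctl with exactly the arguments shown above; it does not use the Pachyderm Go client. The helper name putFileArgs is hypothetical, and the program skips the commit when pachctl is not on the PATH.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// putFileArgs is a hypothetical helper (not part of Pachyderm). It returns
// the same arguments used in the text:
//   put-file <repo> <branch> -c -f <path>
func putFileArgs(repo, branch, path string) []string {
	return []string{"put-file", repo, branch, "-c", "-f", path}
}

func main() {
	args := putFileArgs("myrepo", "master", "blah.txt")
	fmt.Println("pachctl " + strings.Join(args, " "))

	// Only run the command when pachctl is actually installed.
	bin, err := exec.LookPath("pachctl")
	if err != nil {
		fmt.Println("pachctl not found on PATH; skipping the commit")
		return
	}

	cmd := exec.Command(bin, args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "put-file failed:", err)
		os.Exit(1)
	}
}
```

Keeping the argument list in its own function makes it easy to check the command line without a running Pachyderm cluster.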

Note that we are committing a single file to the master branch of a single repository here. The Pachyderm API is considerably more flexible than this, however. We can add, delete, or otherwise modify many versioned files in a single commit or over multiple commits. Further, the files being versioned could come from a URL, an object store link, a database dump, and so on.

As a sanity check, we can confirm that our file was versioned in the repository:

$ pachctl list-repo
NAME     CREATED          SIZE
myrepo   10 minutes ago   25 B
$ pachctl list-file myrepo master
NAME       TYPE   SIZE
blah.txt   file   25 B
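
The 25 B reported above can be cross-checked against the file's contents. The sketch below assumes that blah.txt holds the single line shown earlier followed by one trailing newline (an assumption; the text does not show how the file was created). The helper name byteSize is hypothetical.

```go
package main

import "fmt"

// byteSize is a hypothetical helper that returns the number of bytes in
// content. In Go, len on a string counts bytes, not characters.
func byteSize(content string) int {
	return len(content)
}

func main() {
	// Assumed contents of blah.txt: the example line plus a trailing newline.
	content := "This is an example file.\n"
	fmt.Printf("%d B\n", byteSize(content)) // prints "25 B"
}
```

The line itself is 24 bytes and the newline adds one more, which matches the size listed for both the file and the repository.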