How it works...

We start by importing the PyGitHub library in step 1 so that we can conveniently call the GitHub APIs, which allow us to search and explore the universe of repositories. We also import the base64 module to decode the base64-encoded files we will be downloading from GitHub. Note that GitHub imposes a rate limit on the number of API calls an ordinary user can make, so if you attempt to download too many files in a short time, your script may not retrieve all of them. In step 2, we supply our credentials to GitHub and specify that we are looking for JavaScript repositories using the query='language:javascript' parameter. We then enumerate the repositories matching this criterion, search each one for files ending in .js, and create local copies of those files (steps 3 to 6). Since these files arrive base64-encoded, we decode them to plaintext in step 7. Finally, we show how to adjust the script to scrape other file types, such as Python and PowerShell (step 8).
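The steps above can be sketched as follows. This is a minimal illustration, assuming PyGitHub is installed and you have a personal access token; the token placeholder, the repository cap, and the helper names are ours, not part of the original recipe:

```python
import base64
import os


def decode_file(b64_text):
    """GitHub's contents API returns file bodies base64-encoded; decode to raw bytes (step 7)."""
    return base64.b64decode(b64_text)


def download_files(token, extension=".js", query="language:javascript", max_repos=2):
    """Search for matching repositories and save matching files locally (steps 2 to 6)."""
    from github import Github  # PyGitHub; imported here so decode_file stays usable without it

    g = Github(token)
    # Cap the number of repositories visited to stay within the API rate limit.
    for repo in g.search_repositories(query=query)[:max_repos]:
        contents = repo.get_contents("")  # listing of the repository root
        while contents:
            item = contents.pop(0)
            if item.type == "dir":
                contents.extend(repo.get_contents(item.path))  # recurse into subdirectories
            elif item.path.endswith(extension):
                with open(os.path.basename(item.path), "wb") as f:
                    f.write(decode_file(item.content))


if __name__ == "__main__":
    # Replace with your own personal access token before running.
    download_files("<your GitHub token>")
```

To scrape another file type as in step 8, change both the query and the extension, for example `download_files(token, extension=".py", query="language:python")`.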
