
  • Bash Cookbook
  • Ron Brash, Ganesh Naik
  • 2021-07-23 19:17:39

Calculating statistics and reducing duplicates based on file contents

At first glance, calculating statistics based on the contents of a file might not seem like one of the more interesting tasks you can accomplish with Bash scripting. However, it is useful in several circumstances. Imagine that our program takes user input from several commands. We could calculate the length of that input to determine whether it is too short or too long. Alternatively, we could determine the size of a string in order to size buffers for a program written in another programming language (such as C/C++):

$ wc -c <<< "1234567890"
11 # Note: 10 characters plus the trailing newline (\n) added by the here-string
$ echo -n "1234567890" | wc -c
10
We can use commands such as wc to count characters, words, and lines, and use those counts in conjunction with the functionality provided by your script.
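As a minimal sketch of the idea, the snippet below counts the words and characters in a string held in a variable (the variable name and sample text are illustrative, not from the recipe):

```shell
#!/bin/bash
# Sample input; in a real script this might come from read or "$1".
input="the quick brown fox jumps over the lazy dog"

# wc -w counts whitespace-separated words; wc -c counts bytes.
# echo -n suppresses the trailing newline so it is not counted.
word_count=$(echo -n "$input" | wc -w)
char_count=$(echo -n "$input" | wc -c)

echo "words: $word_count"
echo "chars: $char_count"
```

A script could then compare `$char_count` against a maximum and reject input that would overflow a fixed-size buffer on the receiving end.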

Better yet, what if we used a command called strings to extract all of the printable ASCII strings from a file? The strings program will output every occurrence of a string, even if there are duplicates. Using other programs such as sort and uniq (or a combination of the two), we can also sort the contents of a file and reduce duplicates if we want to calculate the number of unique lines within a file:

$ strings /bin/ls > unalteredoutput.txt
$ ls -lah unalteredoutput.txt
-rw-rw-r-- 1 rbrash rbrash 22K Nov 24 11:17 unalteredoutput.txt
$ strings /bin/ls | sort -u > sortedoutput.txt
$ ls -lah sortedoutput.txt
-rw-rw-r-- 1 rbrash rbrash 19K Nov 24 11:17 sortedoutput.txt
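The same sort/uniq combination can count duplicates instead of merely discarding them. The sketch below uses a small throwaway file (the file name and contents are illustrative only); note that uniq only collapses adjacent duplicate lines, which is why the input must be sorted first:

```shell
#!/bin/bash
# Build a small sample file with duplicate lines.
printf 'apple\nbanana\napple\ncherry\nbanana\napple\n' > /tmp/fruit_list.txt

# Total lines versus unique lines:
echo "total:  $(wc -l < /tmp/fruit_list.txt)"
echo "unique: $(sort -u /tmp/fruit_list.txt | wc -l)"

# uniq -c prefixes each line with its occurrence count;
# sort -rn then ranks the lines from most to least frequent.
sort /tmp/fruit_list.txt | uniq -c | sort -rn
```

Running this shows 6 total lines but only 3 unique ones, with apple appearing most often, which is exactly the kind of quick frequency statistic the recipe builds on.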

Now that we have seen a few reasons why we might need to compute some basic statistics, let's carry on with the recipe.
