
Splitting a string on lines, words, or arbitrary tokens

Useful data is often interspersed between delimiters, such as commas or spaces, making string splitting vital for most data analysis tasks.

Getting ready

Create an input.txt file similar to the following one:

$ cat input.txt

first line
second line
words are split by space
comma,separated,values
or any delimiter you want

Install the split package using Cabal as follows:

$ cabal install split

How to do it...

  1. The lines and words functions come with the Prelude, so the only function we need to import is splitOn:
    import Data.List.Split (splitOn)
  2. First we split the string into lines, as shown in the following code snippet:
    main :: IO ()
    main = do
      input <- readFile "input.txt"
      let ls = lines input
      print ls
  3. The lines are printed as a single list (print emits it on one line; it is wrapped here for readability):
    [ "first line","second line"
    , "words are split by space"
    , "comma,separated,values"
    , "or any delimiter you want"]
    
  4. Next, we separate a string on spaces as follows:
      let ws = words $ ls !! 2
      print ws
  5. The words are printed in a list as follows:
    ["words","are","split","by","space"]
    
  6. Next, we show how to split a string on an arbitrary value using the following lines of code:
      let cs = splitOn "," $ ls !! 3
      print cs
  7. The values are split on the commas as follows:
    ["comma","separated","values"]
    
  8. Finally, we split on a delimiter that is more than one character long, as shown in the following code snippet:
      let ds = splitOn "an" $ ls !! 4
      print ds
  9. Both occurrences of "an" (in "any" and "want") are removed, so the output is as follows:
    ["or ","y delimiter you w","t"]
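
The steps above use words, lines, and splitOn side by side, so it may help to see where they differ on edge cases. The following is a minimal sketch, not part of the original recipe; the sample strings and the names collapsed, keptEmpty, viaLines, and viaSplitOn are made up for illustration, and the split package is assumed to be installed as described in Getting ready.

```haskell
import Data.List.Split (splitOn)

-- words collapses runs of whitespace and never yields empty strings.
collapsed :: [String]
collapsed = words "a  b"

-- splitOn keeps the empty field between two adjacent delimiters.
keptEmpty :: [String]
keptEmpty = splitOn " " "a  b"

-- lines ignores a single trailing newline...
viaLines :: [String]
viaLines = lines "x\ny\n"

-- ...whereas splitOn reports the empty string that follows it.
viaSplitOn :: [String]
viaSplitOn = splitOn "\n" "x\ny\n"

main :: IO ()
main = do
  print collapsed   -- ["a","b"]
  print keptEmpty   -- ["a","","b"]
  print viaLines    -- ["x","y"]
  print viaSplitOn  -- ["x","y",""]
```

If the empty fields are not wanted, one option is to drop them afterwards with filter (not . null).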
    