
Lexing and parsing an e-mail address

An elegant way to clean data is to define a lexer that splits a string into tokens. In this recipe, we will parse an e-mail address using the attoparsec library. The parser skips leading whitespace explicitly, and because parseOnly does not require the entire input to be consumed, any trailing whitespace is ignored as well.

Getting ready

Install the attoparsec parser combinator library using Cabal:

$ cabal install attoparsec

How to do it…

Create a new file, which we will call Main.hs, and perform the following steps:

  1. Enable the GHC OverloadedStrings language extension so that string literals can be used directly as Text values throughout the code. Also, import the other relevant libraries:
    {-# LANGUAGE OverloadedStrings #-}
    import Data.Attoparsec.Text
    import Data.Char (isAlphaNum)
  2. Declare a data type for an e-mail address:
    data Email = Email
      { user :: String
      , host :: String
      } deriving Show
  3. Define how to parse an e-mail address. This function can be as simple or as complicated as required:
    email :: Parser Email
    email = do
      skipSpace                                -- ignore leading whitespace
      name <- many' $ satisfy isAlphaNum       -- username
      _ <- char '@'
      hostName <- many' $ satisfy isAlphaNum   -- hostname before the period
      _ <- char '.'
      domain <- many' (satisfy isAlphaNum)     -- top-level domain
      return $ Email name (hostName ++ "." ++ domain)
  4. Parse an e-mail address to test the code:
    main :: IO ()
    main = print $ parseOnly email "nishant@shukla.io"
  5. Run the code to print out the parsed e-mail address:
    $ runhaskell Main.hs
    
    Right (Email {user = "nishant", host = "shukla.io"})
    

How it works…

We build the e-mail parser by sequencing several smaller parsers, each of which must succeed in turn. In this simplified model, an e-mail address consists of an alphanumeric username, followed by the 'at' sign (@), then an alphanumeric hostname, a period, and lastly the top-level domain.
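One detail worth noting is that many' also succeeds on zero matches, so the recipe's parser accepts an input such as "@." and returns empty components. The following is a minimal sketch of a stricter variant, not part of the original recipe: the name strictEmail is hypothetical, many1 requires at least one character per component, and endOfInput rejects anything other than whitespace after the address.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.Text
import Data.Char (isAlphaNum)

data Email = Email
  { user :: String
  , host :: String
  } deriving (Show, Eq)

-- Hypothetical stricter variant (an assumption, not the recipe's parser):
-- each component needs at least one alphanumeric character, and only
-- whitespace may follow the address.
strictEmail :: Parser Email
strictEmail = do
  skipSpace                                  -- leading whitespace
  name     <- many1 (satisfy isAlphaNum)     -- non-empty username
  _        <- char '@'
  hostName <- many1 (satisfy isAlphaNum)     -- non-empty hostname
  _        <- char '.'
  domain   <- many1 (satisfy isAlphaNum)     -- non-empty top-level domain
  skipSpace                                  -- trailing whitespace
  endOfInput                                 -- nothing else may remain
  return $ Email name (hostName ++ "." ++ domain)

main :: IO ()
main = do
  print $ parseOnly strictEmail "  nishant@shukla.io  "    -- succeeds (Right)
  print $ parseOnly strictEmail "@."                       -- fails (Left)
  print $ parseOnly strictEmail "nishant@shukla.io extra"  -- fails (Left)
```

Whether the extra strictness is desirable depends on the data being cleaned; real e-mail addresses also allow characters such as dots and hyphens, which neither version handles.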

The various functions used from the attoparsec library can be found in the Data.Attoparsec.Text documentation, which is available at https://hackage.haskell.org/package/attoparsec/docs/Data-Attoparsec-Text.html.
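As one illustration of another function listed in that documentation, takeWhile1 consumes one or more characters matching a predicate and returns them as Text rather than as a String. The sketch below assumes we are free to change the record fields to Text; the names EmailT, userT, hostT, and emailT are hypothetical and do not appear in the recipe.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.Text
import Data.Char (isAlphaNum)
import Data.Text (Text)
import qualified Data.Text as T

-- Hypothetical Text-based counterpart of the recipe's record.
data EmailT = EmailT
  { userT :: Text
  , hostT :: Text
  } deriving (Show, Eq)

-- takeWhile1 fails on an empty match, so every component must be non-empty.
emailT :: Parser EmailT
emailT = do
  skipSpace
  name     <- takeWhile1 isAlphaNum
  _        <- char '@'
  hostName <- takeWhile1 isAlphaNum
  _        <- char '.'
  domain   <- takeWhile1 isAlphaNum
  return $ EmailT name (T.concat [hostName, ".", domain])

main :: IO ()
main = print $ parseOnly emailT "nishant@shukla.io"
```

Working with Text slices avoids building each component one character at a time, which may matter when cleaning large inputs.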
