Examining, cleaning, and filtering data

After importing the data, the next steps are to examine it and check for missing or erroneous values. We then clean the data and apply filters and selections. Different kinds of datasets call for different approaches to these steps. R has powerful packages for this work, including the following:

  • dplyr: dplyr is a powerful R package that provides methods to make examining, cleaning, and filtering data fast and easy.
  • tidyr: The tidyr package helps to organize messy data for easier data analysis.
  • stringr: The stringr package provides functions for working with string data efficiently.
  • forcats: Factors are widely used while doing data analysis in R. The forcats package makes it easy to work with factors.
  • lubridate: lubridate makes wrangling date-time data quick and easy.
  • hms: hms is a great package for handling datasets that include time-of-day values.
  • blob: Not all data is stored as plain ASCII text; you sometimes have to deal with binary data formats. The blob package makes this easy.
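
As a rough illustration of how a few of these packages might fit together, here is a minimal sketch of the examine, clean, and filter steps. The data frame `raw`, its column names, and the score threshold of 60 are all hypothetical and chosen only for this example; a real dataset would likely need different checks.

```r
library(dplyr)      # mutate(), filter(), select(), glimpse()
library(tidyr)      # drop_na()
library(stringr)    # str_trim(), str_to_title()
library(lubridate)  # ymd()

# Hypothetical toy data with typical problems: stray whitespace,
# inconsistent capitalization, missing values, and an unparseable date.
raw <- data.frame(
  name   = c(" alice ", "Bob", NA, "carol", "dave "),
  joined = c("2020-01-15", "2021-03-02", "2019-07-30", "not a date", "2022-11-05"),
  score  = c(82, NA, 55, 91, 47),
  stringsAsFactors = FALSE
)

# Examine: look at column types and count missing values per column.
glimpse(raw)
colSums(is.na(raw))

# Clean: normalize the strings, parse the dates (unparseable values
# become NA), then drop every row that still has a missing value.
clean <- raw %>%
  mutate(
    name   = str_to_title(str_trim(name)),
    joined = ymd(joined, quiet = TRUE)
  ) %>%
  drop_na()

# Filter and select: keep rows above the (assumed) threshold and
# only the columns of interest.
passed <- clean %>%
  filter(score >= 60) %>%
  select(name, joined)
```

Dropping incomplete rows with drop_na() is only one possible policy; depending on the analysis, it may be more appropriate to impute missing values or to keep them and handle them later.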