Modelling and analysis

This part of the project is arguably the most creative one, since it comprises the many tasks that must be completed to deliver the final product. The list of tasks can be very long and may include the following:

  • Data mining
  • Text analytics
  • Model building
  • Feature engineering and extraction
  • Model testing

Microsoft SQL Server has built-in tools that can serve as a delivery platform for most of these tasks. For data mining, there are several methodologies or frameworks to follow. According to several studies of methodology usage, the Cross Industry Standard Process for Data Mining (CRISP-DM) has so far been the most frequently used one. In 2015, IBM released a new methodology called Analytics Solutions Unified Method for Data Mining/Predictive Analytics (ASUM-DM), which refined and extended CRISP-DM.

CRISP-DM is an open-standard process model that describes common approaches used by data-mining experts, and it remains the most widely used analytics model. It breaks the process of data mining into six major phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The sequence of the phases is not strict; moving back and forth between phases is always required. The arrows in the process diagram indicate the most important and frequent dependencies between phases, and the outer circle in the diagram symbolizes the cyclic nature of data mining itself. A data-mining process continues after a solution has been deployed. The lessons learned during the process can trigger new, often more focused business questions, and subsequent data-mining processes will benefit from the experiences of the previous ones.
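The non-strict ordering of the CRISP-DM phases can be made concrete with a small sketch. The Python snippet below is only an illustration, not part of CRISP-DM or of any SQL Server tooling: the names `Phase`, `TRANSITIONS`, and `is_typical_path` are hypothetical, and the transition table is an assumption based on the arrows commonly drawn in the CRISP-DM process diagram.

```python
from enum import Enum


class Phase(Enum):
    """The six CRISP-DM phases."""
    BUSINESS_UNDERSTANDING = "Business understanding"
    DATA_UNDERSTANDING = "Data understanding"
    DATA_PREPARATION = "Data preparation"
    MODELING = "Modeling"
    EVALUATION = "Evaluation"
    DEPLOYMENT = "Deployment"


# Assumed "most important and frequent" dependencies between phases,
# modelled on the arrows of the usual CRISP-DM diagram.
TRANSITIONS = {
    Phase.BUSINESS_UNDERSTANDING: {Phase.DATA_UNDERSTANDING},
    Phase.DATA_UNDERSTANDING: {Phase.BUSINESS_UNDERSTANDING, Phase.DATA_PREPARATION},
    Phase.DATA_PREPARATION: {Phase.MODELING},
    Phase.MODELING: {Phase.DATA_PREPARATION, Phase.EVALUATION},
    Phase.EVALUATION: {Phase.BUSINESS_UNDERSTANDING, Phase.DEPLOYMENT},
    # Outer circle: a deployed solution triggers new business questions.
    Phase.DEPLOYMENT: {Phase.BUSINESS_UNDERSTANDING},
}


def is_typical_path(path):
    """Return True if every consecutive step follows one of the arrows above."""
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(path, path[1:]))


# A project that returns from modeling to data preparation before evaluating.
project = [
    Phase.BUSINESS_UNDERSTANDING,
    Phase.DATA_UNDERSTANDING,
    Phase.DATA_PREPARATION,
    Phase.MODELING,
    Phase.DATA_PREPARATION,
    Phase.MODELING,
    Phase.EVALUATION,
    Phase.DEPLOYMENT,
]
print(is_typical_path(project))
```

The table encodes only the typical arrows. Since the methodology itself allows movement between any phases when required, a path that fails this check is unusual rather than wrong.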

The purpose of data mining is to relate structured and unstructured data to each other, so that they can be easily combined and offered to business users through a system that is easy to use. The experts in each area of the business will therefore have access to a complex data system that can process information at different levels. This has the advantage of bringing to light relationships among the data and of supporting predictive analysis, assessments for specific business decisions, and much more.

Data mining can be used to solve many business problems and to prepare the data for a more advanced approach, such as machine learning, which can be used for the following:

  • Searching for anomalies
  • Churn analysis
  • Customer segmentation
  • Forecasting
  • Market basket analysis
  • Network intrusion detection
  • Targeted advertisement
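To give a flavor of one of these applications, the following is a minimal Python sketch of market basket analysis. It assumes the simplest possible setting: rules between pairs of items, scored by support (the share of baskets containing both items) and confidence (the share of baskets containing the first item that also contain the second). The function name `pair_rules`, the `min_support` threshold, and the sample baskets are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) tuples for item pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support >= min_support:
            # Each frequent pair yields two directed rules: a -> b and b -> a.
            rules.append((a, b, support, count / item_counts[a]))
            rules.append((b, a, support, count / item_counts[b]))
    return rules


# Made-up transactions, for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
for antecedent, consequent, support, confidence in pair_rules(baskets):
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

Real workloads typically involve larger item sets and dedicated algorithms; this sketch only shows what the two basic measures express.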