- Machine Learning with the Elastic Stack
- Rich Collier, Bahaaldine Azarmi
The plethora of data
IT departments have invested in monitoring tools for decades, and it is not uncommon to have a dozen or more tools actively collecting and archiving data that can be measured in terabytes, or even petabytes, per day. The data can range from rudimentary infrastructure- and network-level data to deep diagnostic data and/or system and application log files. Business-level key performance indicators (KPIs) could also be tracked, sometimes including data about the end user's experience. In terms of sheer depth and breadth, the data available today is, in some ways, the most comprehensive it has ever been.
To detect emerging problems or threats hidden in that data, there have traditionally been several main approaches to distilling the data into informational insights:
- Filter/search: Some tools allow the user to define searches to help trim the data down into a more manageable set. While extremely useful, this capability is most often used in an ad hoc fashion once a problem is already suspected. Even then, success usually hinges on the user's ability to know what they are looking for and on their level of experience—both prior knowledge from living through similar past situations and expertise in the search technology itself.
- Visualizations: Dashboards, charts, and widgets are also extremely useful for understanding what data has been doing and where it is trending. However, visualizations are passive: they must be actively watched for meaningful deviations to be detected. Once the number of metrics being collected and plotted surpasses the number of eyeballs available to watch them (or even the screen real estate to display them), visual-only analysis becomes less and less useful.
- Thresholds/rules: To get around the requirement that data be physically watched in order to be acted on proactively, many tools allow the user to define rules or conditions that trigger on known conditions or known dependencies between items. However, it is unlikely that you can realistically define all appropriate operating ranges or model all of the actual dependencies in today's complex and distributed applications. In addition, the amount and velocity of change in the application or environment can quickly render any static rule set useless. Analysts found themselves chasing down many false positive alerts, setting up a "boy who cried wolf" paradigm that led to resentment of the tools generating the alerts and skepticism about the value that alerting could provide.
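The brittleness of the thresholds/rules approach is easy to see in a minimal sketch. The metric names and limits below are hypothetical examples (not taken from any particular monitoring tool); the point is that the rule set is static while the environment is not:

```python
# Hand-maintained rule set: metric name -> (lower bound, upper bound).
# Every entry here encodes a human guess about "normal" that can go stale.
RULES = {
    "cpu_percent": (0.0, 90.0),
    "response_ms": (0.0, 500.0),
    "error_rate": (0.0, 0.05),
}

def evaluate(sample: dict) -> list:
    """Return an alert string for every metric outside its static range."""
    alerts = []
    for metric, value in sample.items():
        if metric not in RULES:
            continue  # metrics with no rule are silently ignored -- a blind spot
        low, high = RULES[metric]
        if not (low <= value <= high):
            alerts.append(f"{metric}={value} outside [{low}, {high}]")
    return alerts

# One sample evaluated against the fixed thresholds. If the application's
# normal CPU profile shifts to 95%, this rule starts crying wolf until
# someone remembers to retune it.
print(evaluate({"cpu_percent": 97.2, "response_ms": 120.0, "error_rate": 0.01}))
```

Every threshold is a point-in-time judgment; keeping hundreds of them aligned with a constantly changing system is exactly the maintenance burden described above.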
Ultimately, a different approach was needed—not necessarily a complete repudiation of past techniques, but one that could bring automation and empirical augmentation to the evaluation of data in a meaningful way. Let's face it, humans are imperfect: we have hidden biases and limited capacity for remembering information, and we are easily distracted and fatigued. Algorithms, if implemented correctly, can readily make up for these shortcomings.