- Hands-On Unsupervised Learning with Python
- Giuseppe Bonaccorso
- 2021-07-02 12:32:00
Diagnostic analysis
Until now, we have worked with output data, observed after a specific underlying process has generated it. Having described the system, the natural next question concerns the causes. Temperature depends on many meteorological and geographical factors, which can be either easily observable or completely hidden. Seasonality in the time series is clearly influenced by the period of the year, but what about the outliers?
For example, we have discovered a peak in a region identified as winter. How can we justify it? A simplistic approach treats it as a noisy outlier to be filtered out. However, if it has been observed and there's a ground truth behind the measurement (for example, all the parties agree that it's not an error), we should assume the presence of a hidden (or latent) cause.
It may be surprising, but most complex scenarios are characterized by a huge number of latent causes (sometimes called factors) that are too difficult to analyze individually. In general, this is not a bad condition but, as we're going to discuss, it's important to include them in the model so that their influence can be learned from the dataset.
On the other hand, dropping all unknown elements reduces the predictive ability of the model, with a corresponding loss of accuracy. Therefore, the primary goal of diagnostic analysis is not necessarily to find all the causes but to list the observable and measurable elements (known as factors), together with all the potential latent ones (which are generally summarized into a single global element).
To a certain extent, diagnostic analysis resembles a reverse-engineering process: we can easily monitor the effects, but it's more difficult to detect the relationships between potential causes and observable effects. For this reason, such an analysis is often probabilistic and helps estimate the probability that a certain identified cause brings about a specific effect. In this way, it's also easier to exclude non-influencing elements and to determine relationships that were initially excluded. However, this process requires a deeper knowledge of statistical learning methods and it won't be discussed in this book, apart from a few examples, such as a Gaussian mixture.
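As a minimal sketch of this probabilistic reasoning, the following snippet applies Bayes' rule to a two-component Gaussian mixture to estimate how likely each latent cause is, given an observed winter temperature. The cause names, priors, means, and standard deviations are all invented for illustration; in practice, such parameters would be estimated from the dataset rather than fixed by hand.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical latent causes for a winter temperature reading (degrees Celsius).
# All numbers below are illustrative assumptions, not values from the book.
CAUSES = {
    "typical winter weather": {"prior": 0.95, "mean": 2.0, "std": 3.0},
    "warm air mass": {"prior": 0.05, "mean": 14.0, "std": 2.5},
}


def posterior(x, causes=CAUSES):
    """Return P(cause | x) for every cause, computed with Bayes' rule."""
    # Joint probability of each cause and the observation: prior * likelihood.
    joint = {
        name: c["prior"] * gaussian_pdf(x, c["mean"], c["std"])
        for name, c in causes.items()
    }
    # The evidence P(x) normalizes the joint values into posteriors.
    evidence = sum(joint.values())
    return {name: value / evidence for name, value in joint.items()}


peak = posterior(13.0)      # an unusually warm winter reading
ordinary = posterior(1.0)   # an unremarkable winter reading

for name, p in peak.items():
    print(f"P({name} | 13.0) = {p:.3f}")
```

With these assumed parameters, the unusually warm reading is attributed mostly to the rare warm air mass despite its small prior, while the ordinary reading is attributed almost entirely to typical winter weather. This mirrors the idea of keeping the peak and explaining it through a latent cause instead of filtering it out as noise.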