Ontology learning
With the basic concepts of Ontologies covered in this chapter, along with their significance in building intelligent systems, it is clear that a seamlessly connected world requires knowledge assets to be consistently represented as domain Ontologies. However, manually creating domain-specific Ontologies requires significant effort, validation, and approval. Ontology learning attempts to automate the generation of Ontologies by applying algorithmic approaches to natural language text, which is available at internet scale. There are several approaches to Ontology learning, as follows:
- Ontology learning from text: In this approach, textual data is gathered from various sources in an automated manner, and keywords are extracted and classified based on their occurrence, word sequencing, and patterns (see the first sketch after this list).
- Linked data mining: In this approach, links are identified in published RDF graphs in order to derive Ontologies based on implicit reasoning (see the second sketch after this list).
- Concept learning from OWL: In this approach, existing domain-specific Ontologies are leveraged to expand into new domains using an algorithmic approach (see the third sketch after this list).
- Crowdsourcing: This approach combines automated Ontology extraction and discovery based on textual analysis with collaboration with domain experts to define new Ontologies. It works well because it combines the processing power and algorithmic approaches of machines with the domain expertise of people, resulting in improved speed and accuracy.
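
To make the text-based approach concrete, here is a minimal sketch in Python using a single Hearst pattern ("X such as Y and Z") over an invented two-sentence corpus; real systems use many such patterns plus statistical filtering. The pattern, the corpus, and the `extract_is_a` helper are illustrative assumptions, not from the book:

```python
import re
from collections import Counter

# Extract candidate is-a relations with one Hearst pattern:
# "<hypernym> such as <hyponym>, <hyponym> and <hyponym>".
SUCH_AS = re.compile(
    r"(\w+)\s+such\s+as\s+((?:\w+(?:,\s*|\s+and\s+))*\w+)", re.IGNORECASE
)

def extract_is_a(text):
    """Yield (hyponym, hypernym) pairs found by the 'such as' pattern."""
    for match in SUCH_AS.finditer(text):
        hypernym = match.group(1).lower()
        members = re.split(r",\s*|\s+and\s+", match.group(2))
        for hyponym in members:
            yield hyponym.lower(), hypernym

# Invented sample corpus, standing in for text crawled at internet scale.
corpus = (
    "Databases such as Oracle, MySQL and PostgreSQL store structured data. "
    "Frameworks such as Hadoop and Spark scale across clusters."
)

# Rank candidate relations by frequency of occurrence, in line with the
# chapter's cues: occurrence, word sequencing, and patterns.
counts = Counter(extract_is_a(corpus))
for (hypo, hyper), n in counts.most_common():
    print(f"{hypo} IS-A {hyper} (seen {n}x)")
```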
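For linked data mining, the following sketch assumes the `rdflib` library is installed and mines `rdf:type` assertions in a small, invented Turtle graph: whenever the instance set of one class is a proper subset of another's, it proposes an `rdfs:subClassOf` candidate. This set-inclusion heuristic is one simple form of implicit reasoning, not the only one:

```python
from collections import defaultdict
from rdflib import Graph, RDF

# Invented linked data snippet; "a" is Turtle shorthand for rdf:type.
DATA = """
@prefix ex: <http://example.org/> .
ex:fido  a ex:Dog, ex:Animal .
ex:rex   a ex:Dog, ex:Animal .
ex:tom   a ex:Cat, ex:Animal .
"""

g = Graph()
g.parse(data=DATA, format="turtle")

# Collect the instance set of every class asserted in the graph.
instances = defaultdict(set)
for subject, _, cls in g.triples((None, RDF.type, None)):
    instances[cls].add(subject)

# If instances(A) is a proper subset of instances(B), propose
# A rdfs:subClassOf B as a derived Ontology axiom.
for a, ia in instances.items():
    for b, ib in instances.items():
        if a != b and ia < ib:
            print(f"candidate: {a} rdfs:subClassOf {b}")
```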
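Concept learning from OWL is usually done with dedicated description logic learners; as a loose stand-in, this sketch (again with `rdflib`) loads the class labels of a hypothetical source Ontology and attaches unseen terms from a new domain to their closest existing concept by string similarity. Everything here, from the ontology snippet to the term list, is an illustrative assumption:

```python
from difflib import get_close_matches
from rdflib import Graph, RDFS

# Hypothetical existing OWL Ontology with labeled classes.
ONTOLOGY = """
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix ex:   <http://example.org/> .
ex:Vehicle    a owl:Class ; rdfs:label "vehicle" .
ex:Motorcycle a owl:Class ; rdfs:label "motorcycle" .
"""

g = Graph()
g.parse(data=ONTOLOGY, format="turtle")

# Map each rdfs:label back to its class URI.
labels = {str(label): cls for cls, _, label in g.triples((None, RDFS.label, None))}

# Terms supposedly discovered in a fresh domain corpus (illustrative).
for term in ["motorcycles", "vehicles"]:
    match = get_close_matches(term, list(labels), n=1, cutoff=0.6)
    if match:
        print(f"'{term}' -> existing concept {labels[match[0]]}")
```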
Here are some of the challenges of Ontology learning:
- Dealing with heterogeneous data sources: The data sources on the internet, and within application data stores, differ in their forms and representations. Ontology learning faces the challenge of extracting knowledge and consistent meaning from these heterogeneous sources.
- Uncertainty and lack of accuracy: Due to the inconsistency of data sources, when Ontology learning attempts to define Ontology structures, there is a level of uncertainty about the intent and representation of entities and attributes. This lowers accuracy and requires intervention from human domain experts for realignment.
- Scalability: One of the primary sources for Ontology learning is the internet, which is an ever-growing knowledge repository. The internet is also largely an unstructured data source, which makes it difficult to scale the Ontology learning process to cover the breadth of a domain from large text extracts. One way to address scalability is to leverage open source distributed computing frameworks such as Hadoop (see the sketch after this list).
- Need for post-processing: While Ontology learning is intended to be an automated process, a level of post-processing is required to overcome quality issues. This post-processing needs to be planned and governed in detail in order to optimize the speed and accuracy of new Ontology definitions.
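
As a sketch of the distributed route, the following single-file Python script follows the Hadoop Streaming convention: the mapper emits tab-separated `term 1` pairs and the reducer aggregates the sorted stream. The file name, the `map`/`reduce` command-line switch, and the four-letter token filter are illustrative assumptions, not a prescribed setup:

```python
#!/usr/bin/env python3
# Minimal Hadoop Streaming sketch for term extraction at scale, assuming
# the cluster invokes this file as both stages, e.g.:
#   hadoop jar hadoop-streaming.jar \
#     -mapper "terms.py map" -reducer "terms.py reduce" ...
import re
import sys

def map_phase():
    # Emit each candidate term with a count of 1: "term\t1".
    for line in sys.stdin:
        for token in re.findall(r"[a-z]{4,}", line.lower()):
            print(f"{token}\t1")

def reduce_phase():
    # Hadoop sorts mapper output by key, so counts for a term are adjacent.
    current, total = None, 0
    for line in sys.stdin:
        term, count = line.rstrip("\n").split("\t")
        if term != current:
            if current is not None:
                print(f"{current}\t{total}")
            current, total = term, 0
        total += int(count)
    if current is not None:
        print(f"{current}\t{total}")

if __name__ == "__main__":
    map_phase() if sys.argv[1] == "map" else reduce_phase()
```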