
Introduction

Indexing data is one of the most crucial parts of any Lucene and Solr deployment. When your data is not indexed properly, your search results will be poor, and when the search results are poor, users are almost certain to be dissatisfied with the application that uses Solr. This is why we need our data to be prepared and indexed as promptly and correctly as possible.

On the other hand, preparing data is not an easy task. The volume of data keeps growing, and we need to index multiple formats of data from multiple sources. Do we need to parse the data manually and convert it to XML ourselves? The answer is no; we can let Solr do this for us. This chapter concentrates on the indexing process and data preparation. It starts with how to index binary PDF files, moves on to using the Data Import Handler to fetch data from a database and index it with Apache Solr, and describes how to detect the document language during indexing. We will also learn how to modify data during indexing so that we don't have to prepare everything up front.
