Apache Solr Installation

   Solr Installation and New core Configuration:


Apache Solr achieves fast search responses because, instead of searching the text directly, it searches an index. This is like finding the pages in a book that mention a word by scanning the index at the back of the book, as opposed to reading every word of every page. This type of index is called an inverted index, because it inverts a page-centric data structure (page->words) into a keyword-centric data structure (word->pages). Solr stores this index in a directory called index inside the data directory.
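As a rough illustration of the page->words to word->pages inversion, here is a toy sketch in shell using awk. It is not how Lucene actually stores its index; the two sample "pages" and the function name build_index are made up for this example.

```shell
#!/bin/sh
# Toy inverted index. Each input line is "<page-number> <words...>".
# The sample pages and the function name build_index are hypothetical.
build_index() {
  awk '
    # For every word on the line, append the page number (field 1).
    { for (i = 2; i <= NF; i++) pages[$i] = pages[$i] " " $1 }
    # Print one "word: pages" line per distinct word.
    END { for (w in pages) print w ":" pages[w] }
  ' | sort
}

printf '%s\n' \
  '1 solr is a search server' \
  '2 lucene is a search library' | build_index
```

In the output, the word search is listed with pages 1 and 2, so a lookup for that word never has to scan the page text itself.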

Under the hood, Apache Solr is powered by Lucene, a powerful open-source full-text search library. The relationship between Solr and Lucene is like that between a car and its engine.

In Solr, a Document is the unit of search and index. An index consists of one or more Documents, and a Document consists of one or more Fields.



1. Installation-
Download a Solr release and extract the solr-version archive.
Go inside solr-version/example/ and run the command: java -jar start.jar
The Solr admin screen should then be available in the browser at the following URL:
http://localhost:8983/solr/


2. New core Configuration-

Go inside solr-version/example/solr/ and run: mkdir coll1
Go inside coll1 and run: mkdir conf
Inside conf, create schema.xml with the following content:

<?xml version="1.0" encoding="UTF-8" ?>
<schema version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="addr_from" type="string" indexed="true" stored="true" required="true"/>
    <field name="addr_to" type="string" indexed="true" stored="true" required="true"/>
    <field name="subject" type="string" indexed="true" stored="true" required="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
  <types>
    <fieldType name="string" class="solr.StrField" />
  </types>
</schema>

Inside conf, create solrconfig.xml with the following content:

<?xml version="1.0" encoding="UTF-8" ?>
<config>
  <luceneMatchVersion>LUCENE_43</luceneMatchVersion>
  <requestDispatcher handleSelect="false">
    <httpCaching never304="true" />
  </requestDispatcher>
  <requestHandler name="/select" class="solr.SearchHandler" />
  <requestHandler name="/update" class="solr.UpdateRequestHandler" />
  <requestHandler name="/admin" class="solr.admin.AdminHandlers" />
  <requestHandler name="/analysis/field" class="solr.FieldAnalysisRequestHandler" startup="lazy" />
</config>

Go back to the coll1 directory and create core.properties there with the following content:

#Written by CorePropertiesLocator
#Wed May 29 16:26:21 IST 2014
name=coll1
config=solrconfig.xml
schema=schema.xml
dataDir=data

Also create core.properties.unloaded in the coll1 directory with the following content:
coll1
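
The directory layout and core.properties from this step can also be created in one go. This is a minimal sketch, assuming the Solr example directory sits at solr-version/example relative to the current directory; SOLR_EXAMPLE is a made-up variable name for this example, not something Solr itself reads. schema.xml and solrconfig.xml still have to be written into conf as shown above.

```shell
#!/bin/sh
# Assumption: adjust SOLR_EXAMPLE to wherever Solr was extracted.
SOLR_EXAMPLE="${SOLR_EXAMPLE:-solr-version/example}"
CORE_DIR="$SOLR_EXAMPLE/solr/coll1"

# Create the core directory together with its conf subdirectory.
mkdir -p "$CORE_DIR/conf"

# Write core.properties with the same keys used in this post.
cat > "$CORE_DIR/core.properties" <<'EOF'
name=coll1
config=solrconfig.xml
schema=schema.xml
dataDir=data
EOF

echo "Created $CORE_DIR"
```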

Then start Solr: go inside solr-version/example/ and run the command: java -jar start.jar
The Solr admin screen should again be available in the browser at the following URL:
http://localhost:8983/solr/

If a data directory has been created automatically beside conf in your coll1 directory, your core has been created.

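Import a CSV file into Solr:
Create an input1.csv file inside /solr-version/example/exampledocs/, where post.jar is also located. The command below posts to the update URL of a core named collection1; to load the file into the newly created core, use coll1 in the URL instead. Then run the following command: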

[user@localhost exampledocs]$ java -Dauto -Durl=http://localhost:8983/solr/collection1/update -jar post.jar input1.csv

You will see output like this:
SimplePostTool version 1.5
Posting files to base url http://localhost:8983/solr/collection1/update..
Entering auto mode. File endings considered are xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file input1.csv (text/csv)
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/collection1/update..
Time spent: 0:00:00.450
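
The contents of input1.csv are not shown in this post. As a hypothetical sample that matches the schema.xml fields defined for coll1 (the addresses and subjects are invented), such a file could be created as follows. Solr's CSV loader reads the field names from the first line by default, so the header row has to use the field names from the schema.

```shell
#!/bin/sh
# Hypothetical sample data; the header row names the fields from schema.xml.
cat > input1.csv <<'EOF'
id,addr_from,addr_to,subject
1,alice@example.com,bob@example.com,Meeting notes
2,bob@example.com,alice@example.com,Re: Meeting notes
EOF

# Show how many data rows (excluding the header) were written.
echo "rows: $(($(wc -l < input1.csv) - 1))"
```

Because these field names come from the coll1 schema, a file like this would need to be posted to the coll1 update URL.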
