Apache Solr Installation
Solr Installation and New core Configuration:
Apache Solr achieves fast search responses because, instead of scanning the text directly, it searches an index. This is like finding the pages that mention a word by looking it up in the index at the back of a book, rather than reading every word on every page. This type of index is called an inverted index because it inverts a page-centric data structure (page->words) into a keyword-centric one (word->pages). Solr stores this index in a directory called index inside the data directory.
Under the hood, Apache Solr is powered by Lucene, an open-source full-text search library. The relationship between Solr and Lucene is like that between a car and its engine.
In Solr, a Document is the unit of search and index. An index consists of one or more Documents, and a Document consists of one or more Fields.
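To make the word->pages idea concrete, here is a minimal Python sketch of an inverted index built over Documents made of Fields. It is only a toy illustration: the sample documents and the build_inverted_index helper are hypothetical, and this is not how Lucene actually stores its index on disk.

```python
from collections import defaultdict

# Hypothetical sample documents; each Document is a set of named Fields.
documents = [
    {"id": "1", "subject": "solr installation guide"},
    {"id": "2", "subject": "lucene index internals"},
    {"id": "3", "subject": "solr core configuration guide"},
]


def build_inverted_index(docs, field):
    """Map each word found in `field` to the sorted ids of the documents containing it."""
    index = defaultdict(set)
    for doc in docs:
        # Naive tokenization: lowercase and split on whitespace.
        for word in doc[field].lower().split():
            index[word].add(doc["id"])
    return {word: sorted(ids) for word, ids in index.items()}


index = build_inverted_index(documents, "subject")

# A lookup is now a single dictionary access instead of a scan over every document.
print(index["solr"])
print(index["lucene"])
```

Looking up "solr" returns the ids of the documents whose subject contains that word, without reading the other documents at all.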
1. Installation-
Download a Solr release and extract the archive. The extracted directory is referred to below as solr-version.
Go into solr-version/example/ and run the command: java -jar start.jar
Once Solr has started, the Solr admin screen is available in a browser at the following URL:
http://localhost:8983/solr/
2. New core Configuration-
Go into solr-version/example/solr/ and run: mkdir coll1
Go into coll1 and run: mkdir conf
Inside conf, create schema.xml with the following content:
<?xml version="1.0" encoding="UTF-8" ?>
<schema version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="addr_from" type="string" indexed="true" stored="true" required="true"/>
    <field name="addr_to" type="string" indexed="true" stored="true" required="true"/>
    <field name="subject" type="string" indexed="true" stored="true" required="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
  <types>
    <fieldType name="string" class="solr.StrField" />
  </types>
</schema>
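Because every field in this schema is marked required="true", a document that lacks any of them should be rejected at index time. As a rough client-side illustration (not how Solr itself validates documents), the following Python sketch reads the required field names out of the schema above and reports which ones a document is missing. The required_fields and missing_fields helpers and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# The schema.xml content from above, embedded as bytes for the sketch.
SCHEMA_XML = b"""<?xml version="1.0" encoding="UTF-8" ?>
<schema version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="addr_from" type="string" indexed="true" stored="true" required="true"/>
    <field name="addr_to" type="string" indexed="true" stored="true" required="true"/>
    <field name="subject" type="string" indexed="true" stored="true" required="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
  <types>
    <fieldType name="string" class="solr.StrField" />
  </types>
</schema>"""


def required_fields(schema_xml):
    """Return the names of all <field> elements marked required="true"."""
    root = ET.fromstring(schema_xml)
    return [f.get("name") for f in root.iter("field") if f.get("required") == "true"]


def missing_fields(doc, schema_xml):
    """Return the required field names that are absent or empty in `doc`."""
    return [name for name in required_fields(schema_xml) if not doc.get(name)]


# Hypothetical document that forgets addr_to.
doc = {"id": "1", "addr_from": "alice@example.com", "subject": "Hello"}
print(required_fields(SCHEMA_XML))
print(missing_fields(doc, SCHEMA_XML))
```

A check like this can catch incomplete rows before they are posted to Solr.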
Inside conf, also create solrconfig.xml with the following content:
<?xml version="1.0" encoding="UTF-8" ?>
<config>
  <luceneMatchVersion>LUCENE_43</luceneMatchVersion>
  <requestDispatcher handleSelect="false">
    <httpCaching never304="true" />
  </requestDispatcher>
  <requestHandler name="/select" class="solr.SearchHandler" />
  <requestHandler name="/update" class="solr.UpdateRequestHandler" />
  <requestHandler name="/admin" class="solr.admin.AdminHandlers" />
  <requestHandler name="/analysis/field" class="solr.FieldAnalysisRequestHandler" startup="lazy" />
</config>
Go back to the coll1 directory and create core.properties there with the following content:
#Written by CorePropertiesLocator
#Wed May 29 16:26:21 IST 2014
name=coll1
config=solrconfig.xml
schema=schema.xml
dataDir=data
Also create core.properties.unloaded there with the following content:
coll1
Then start Solr (stop it first if it is still running from step 1): go into solr-version/example/ and run the command: java -jar start.jar
Open the Solr admin screen in a browser at the following URL:
http://localhost:8983/solr/
If a data directory has been created automatically inside coll1, next to conf, the core has been created.
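The manual directory steps above can also be scripted. The following Python sketch creates the coll1/conf layout and writes core.properties with the same keys as shown above. The create_core_skeleton helper is hypothetical, and the sketch runs against a temporary directory; for real use you would pass the path to solr-version/example/solr instead. It does not write schema.xml, solrconfig.xml, or core.properties.unloaded, which still have to be added as described.

```python
from pathlib import Path
import tempfile


def create_core_skeleton(solr_home, core_name):
    """Create <solr_home>/<core_name>/conf and a core.properties file; return the core directory."""
    core_dir = Path(solr_home) / core_name
    (core_dir / "conf").mkdir(parents=True, exist_ok=True)
    # Same keys as the core.properties shown above.
    props = "\n".join([
        f"name={core_name}",
        "config=solrconfig.xml",
        "schema=schema.xml",
        "dataDir=data",
    ]) + "\n"
    (core_dir / "core.properties").write_text(props, encoding="utf-8")
    return core_dir


# Demonstration in a throwaway directory (an assumption for the sketch).
with tempfile.TemporaryDirectory() as tmp:
    core = create_core_skeleton(tmp, "coll1")
    print(sorted(p.name for p in core.iterdir()))
```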
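Importing a CSV file into a core in Solr:
Create an input1.csv file inside solr-version/example/exampledocs/, where post.jar is also located, then run the following command. Note that the URL below targets the default collection1 core; to post to the newly created core, use coll1 in the URL instead of collection1.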
[user@localhost exampledocs]$ java -Dauto -Durl=http://localhost:8983/solr/collection1/update -jar post.jar input1.csv
You will see output like this:
SimplePostTool version 1.5
Posting files to base url http://localhost:8983/solr/collection1/update..
Entering auto mode. File endings considered are xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file input1.csv (text/csv)
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/collection1/update..
Time spent: 0:00:00.450
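The post does not show the contents of input1.csv. As a sketch, assuming the file should match the fields of the schema.xml defined for coll1 and that the first line names the fields (which is how Solr's CSV loading treats the header row by default), the following Python snippet produces suitable CSV text. The sample rows and the to_solr_csv helper are hypothetical.

```python
import csv
import io

# Field names taken from the schema.xml defined for coll1.
FIELDS = ["id", "addr_from", "addr_to", "subject"]

# Hypothetical sample rows; replace with real data.
ROWS = [
    {"id": "1", "addr_from": "alice@example.com", "addr_to": "bob@example.com", "subject": "Hello"},
    {"id": "2", "addr_from": "bob@example.com", "addr_to": "alice@example.com", "subject": "Re: Hello"},
]


def to_solr_csv(fields, rows):
    """Return CSV text whose header row lists the field names, followed by one line per row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_solr_csv(FIELDS, ROWS)
print(csv_text)
```

Saving the printed text as input1.csv in the exampledocs directory gives a file that can be posted with the command shown earlier.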