
Showing posts from 2015

Run a Linux shell script every few minutes/hours/months ...

Q: How can I run a Linux shell script every n minutes/hours/months?

A: By using a while loop in the script, or cron. Cron is the best choice.

Cron is a daemon found on most Unix/Linux systems that runs scheduled commands at the specified intervals. You can add a script to the schedule by copying it to the folder of your choice; these folders are typically found in /etc:

cron.hourly
cron.daily
cron.weekly
cron.monthly

Or just type the command below on the console and an editor will open:

$ crontab -e

In it there is a line like:

# * * * * * command

Remove the # from the beginning of the line and set the schedule you want.

Every 10 minutes:

*/10 * * * * ./script

Every 2 hours:

0 */2 * * * ./script

Here ./script may be a Linux command or a script. This way we can run Hadoop/Hive/Pig scripts at whatever interval we want.

=================xxxxxxxxxxxxxxxx============================
My Scripts
=============================================================
script.sh file in Home d
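The while-loop alternative mentioned above can be sketched as follows. This is a minimal illustration, not from the original post: the interval, the run count, and the date command are placeholders, and the loop is bounded here only for demonstration (in practice you would use while true):

```shell
#!/bin/sh
# Poor man's scheduler: re-run a command every INTERVAL seconds.
# Unlike cron, this loop dies with the shell session, which is why
# the post recommends cron for real schedules.
INTERVAL=1    # placeholder; use 600 for every 10 minutes
RUNS=3        # bounded for demonstration; use 'while true' in practice

i=0
while [ "$i" -lt "$RUNS" ]; do
    date      # replace with ./script or any Hadoop/Hive/Pig command
    i=$((i + 1))
    sleep "$INTERVAL"
done
```

A loop like this is handy for quick tests, but it does not survive logouts or reboots the way a crontab entry does.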

MongoDB Replication Configuration

MongoDB Replication Configuration:

If replication in MongoDB shows an error like "not master", run the rs.slaveOk() command on the secondary node side; it will solve the issue.

====================================================
Replication - MongoDB

1. Start by creating a data directory for each replica set member:

mkdir /data/node1
mkdir /data/node2
mkdir /data/arbiter

2. Start a mongod for each member:

mongod --replSet myapp --dbpath /data/node1 --port 40000
mongod --replSet myapp --dbpath /data/node2 --port 40001
mongod --replSet myapp --dbpath /data/arbiter --port 40002

3. Run mongo hostname:40000 to run the client on the primary, and then run the rs.initiate() command:

> rs.initiate()
{
  "info2" : "no configuration explicitly specified -- making one",
  "me" : "arete:40000",
  "info" : "Config now saved locally. Should come online in about a minute.",
  "ok" : 1
}

4. You can now add the other two membe
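As a hypothetical continuation (not part of the original post), adding the remaining members is usually done with rs.add() and rs.addArb() from a shell connected to the primary. The hostname arete is taken from the rs.initiate() output above; this is a sketch under that assumption:

```shell
# Hypothetical commands: add the second data node and the arbiter,
# then check the replica set state. Requires the mongod processes
# started in step 2 to be running.
mongo arete:40000 --eval 'rs.add("arete:40001")'
mongo arete:40000 --eval 'rs.addArb("arete:40002")'
mongo arete:40000 --eval 'rs.status()'
```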

What is MongoDB

MongoDB - Document-Oriented NoSQL Database:

MongoDB is one of several database types to arise in the mid-2000s under the NoSQL banner. Instead of using tables and rows as in relational databases, MongoDB is built on an architecture of collections and documents. Documents comprise sets of key-value pairs and are the basic unit of data in MongoDB. Collections contain sets of documents and function as the equivalent of relational database tables. Like other NoSQL databases, MongoDB supports dynamic schema design, allowing the documents in a collection to have different fields and structures. The database uses a document storage and data interchange format called BSON, which provides a binary representation of JSON-like documents. Automatic sharding enables the data in a collection to be distributed across multiple systems for horizontal scalability as data volumes increase, and replication copies data to multiple machines for fault tolerance.
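The dynamic-schema point can be made concrete with a small sketch. The collection name and field values here are illustrative assumptions, and the commands assume a local mongod and the classic mongo shell:

```shell
# Two documents with different fields can live in the same collection;
# no table definition is needed beforehand.
mongo --eval 'db.people.insert({name: "asha", city: "Pune"})'
mongo --eval 'db.people.insert({name: "ravi", skills: ["hive", "pig"], age: 30})'

# Inspect one stored document (returned as JSON-like BSON).
mongo --eval 'printjson(db.people.findOne())'
```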

Hadoop learning blogs | Big Data blogs

100s of blogs related to Big Data:

http://blogs.the451group.com/opensource/
http://abeautifulwww.com
http://timmanns.blogspot.com/
http://www.behind-the-enemy-lines.com/
http://www.acthomas.ca/
http://abbottanalytics.blogspot.com/
http://www.advancednflstats.com
http://patilv.github.io/
http://blog.smola.org/
http://blog.markus-breitenbach.com/
http://allthingsdistributed.com
http://aws.typepad.com/aws
http://www.analyticbridge.com/profiles/blog/list
http://www.analyticsvidhya.com/blog/
http://www.applieddatalabs.com
http://atbrox.com
http://www.bitquill.net/blog
http://radar.oreilly.com/ben/
http://benfry.com/writing/
htt

Read XML file in Hadoop Hive

Process an XML file on Hadoop, OR load XML file data into Hive/HBase.

My program code: here I have an XmlDriver class; inside it I have an xmlRecodReader class, a map class, and a reduce class. It processes the XML data and generates comma-separated columns in HDFS.

Raw data file like:

<current_observation version="1.0" ...>
  <latitude>1</latitude>
  <longitude>2</longitude>
  <pressure_mb>3</pressure_mb>
  <wind_mph>4</wind_mph>
  <wind_dir>5</wind_dir>
  <windchill_c>6</windchill_c>
  <temp_c>7</temp_c>
  <wind_degrees>8</wind_degrees>
</current_observation>

import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs
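Once the job has written comma-separated columns to HDFS, a typical way to query them from Hive is an external table over the output directory. This is a sketch, not from the original post: the jar name, HDFS paths, and column types are assumptions, though the column names follow the XML fields above:

```shell
# Hypothetical usage: run the XmlDriver MapReduce job, then expose its
# CSV output to Hive as an external table (no data copy needed).
hadoop jar xmldriver.jar XmlDriver /user/hadoop/weather_xml /user/hadoop/weather_csv

hive -e "
CREATE EXTERNAL TABLE current_observation (
  latitude DOUBLE, longitude DOUBLE, pressure_mb DOUBLE, wind_mph DOUBLE,
  wind_dir STRING, windchill_c DOUBLE, temp_c DOUBLE, wind_degrees INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/hadoop/weather_csv';"
```

Because the table is external, dropping it in Hive leaves the HDFS files produced by the job untouched.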