Posts

Showing posts from January, 2017

Installation of Presto DB & Client Connection with Presto

As we already discussed, Presto DB is a distributed analytical query engine for running SQL queries against a data warehouse. So let's look at how to install it.

Single node Presto DB Installation: Here we install Presto DB on a single-node Linux machine. https://prestodb.io/docs/current/installation/deployment.html

Multi node Presto DB Installation: Here we install Presto DB on a three-node Linux cluster; the same setup can be installed on an existing Hadoop cluster to run queries on Hive data. https://prestodb.io/docs/current/installation/deployment.html

Client Connection with Presto: The Presto DB command-line client can be downloaded as an executable JAR from Maven Central: https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.161/presto-cli-0.161-executable.jar
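To make the client connection step a little more concrete, here is a minimal Python sketch that builds the CLI download URL for a given version and the command line for starting the client. The URL pattern is taken from the link above; the server address `localhost:8080`, the `hive` catalog, the `default` schema, and the local file name `./presto` are assumptions for illustration and will differ on your cluster.

```python
from typing import List

# Base path of the Presto CLI artifacts on Maven Central (from the link above).
MAVEN_BASE = "https://repo1.maven.org/maven2/com/facebook/presto/presto-cli"


def cli_download_url(version: str) -> str:
    """Return the URL of the executable CLI JAR for the given Presto version."""
    return f"{MAVEN_BASE}/{version}/presto-cli-{version}-executable.jar"


def cli_command(server: str, catalog: str, schema: str,
                jar: str = "./presto") -> List[str]:
    """Return the argument list that starts the CLI against a coordinator.

    `jar` is the assumed local name of the downloaded executable JAR.
    """
    return [jar, "--server", server, "--catalog", catalog, "--schema", schema]


if __name__ == "__main__":
    # Version 0.161 matches the link in this post.
    print(cli_download_url("0.161"))
    # Host, port, catalog and schema below are placeholders, not real values.
    print(" ".join(cli_command("localhost:8080", "hive", "default")))
```

In practice the downloaded JAR is usually renamed to `presto` and made executable with `chmod +x` before being run with the printed arguments.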

PrestoDB: An open source distributed SQL query engine

Presto DB: Powerful Query Engine

Presto DB is an open source distributed query engine for running interactive SQL (analytic queries) on big data, from gigabytes to terabytes or petabytes. Presto was designed for interactive analytics and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook. Presto can query data in Hadoop HDFS, Hive, Cassandra, relational databases, or even proprietary data stores. A single Presto query can combine data from multiple sources. The main goal of Presto is to deliver analytic query results in sub-seconds to minutes on inexpensive hardware such as a Hadoop cluster. It is completely free. Facebook uses Presto for interactive queries against several internal data stores, including their 300 PB data warehouse. I personally tried Presto DB on a 3-node cluster with 1 TB to 3 TB of data residing on Hadoop HDFS and got awesome performance, with sub-second response times for calculations.
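As a rough illustration of one query combining data from multiple sources, the Python sketch below assembles a SQL statement that joins a Hive table with a table in a relational database. Presto addresses tables as `catalog.schema.table`; the catalog, schema, table, and column names used here (`hive.web.orders`, `mysql.crm.users`, `user_id`, `country`) are hypothetical and only show the shape of such a query.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified Presto table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def build_join_query() -> str:
    """Build a query that joins tables living in two different catalogs."""
    orders = qualified("hive", "web", "orders")   # hypothetical Hive table
    users = qualified("mysql", "crm", "users")    # hypothetical MySQL table
    return (
        "SELECT u.country, count(*) AS order_count\n"
        f"FROM {orders} o\n"
        f"JOIN {users} u ON o.user_id = u.id\n"
        "GROUP BY u.country"
    )


if __name__ == "__main__":
    # The printed SQL could be pasted into the Presto CLI.
    print(build_join_query())
```

Running a query like this requires both catalogs to be configured on the cluster.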