How to Install Spark SQL Thrift Server (Hive) and Connect It with Helical Insight

In this article, we will see how to install the Spark SQL Thrift Server (Hive) and how to fetch data from the Spark Thrift Server in Helical Insight.

Prerequisite: Helical Insight should be installed and running.
We will now see how to install Spark and then run the Hive Thrift Server (on 64-bit Windows).

  1. Download Spark from http://d3kbcqa49mib13.cloudfront.net/spark-2.1.0-bin-hadoop2.7.tgz and extract the tgz file to the C: drive.
  2. Hadoop winutils is also needed to run Spark on Windows. Download it from http://rwthaachen.dl.osdn.jp/win-hadoop/62852/hadoop-winutils-2.6.0.zip and extract the zip file to the C: drive.
  3. Set the HADOOP_HOME environment variable:
    1. Open My Computer.
    2. Click Properties.
    3. Click Advanced system settings.
    4. Click Environment Variables.
    5. Click New.
    6. Enter the variable name: HADOOP_HOME
    7. Enter the variable value: the location of Hadoop, for example C:\hadoop
    8. Click OK.
    9. Select the Path variable.
    10. Click Edit.
    11. Add C:\hadoop to the end of the path.
  4. Run the following command to make C:\tmp\hive readable, writable and executable:
  5. Example: winutils.exe chmod -R 777 C:\tmp\hive

  6. Open Command Prompt in administrator mode and change directory to spark-2.1.0-bin-hadoop2.7\spark-2.1.0-bin-hadoop2.7\bin. Administrator mode is required during installation; otherwise you may get an error.
  7. C:\> cd spark-2.1.0-bin-hadoop2.7\spark-2.1.0-bin-hadoop2.7\bin

  8. Execute the following command on the command line: spark-submit --verbose --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --hiveconf hive.server2.thrift.port=10000 --driver-memory 512m --master=local
  9. Replace 10000 in the command with your Hive port.

  10. Once you get the message that HiveServer2 is started, open a second Command Prompt (the server keeps running in the first one) and run the beeline script, which is in the spark-2.1.0-bin-hadoop2.7\bin folder.
  11. Example: C:\spark-2.1.0-bin-hadoop2.7\spark-2.1.0-bin-hadoop2.7\bin>beeline

  12. Connect to the server from Beeline using the JDBC URL of your server:
  13. !connect jdbc:hive2://localhost:10000
    Example: beeline> !connect jdbc:hive2://localhost:10000

  14. Enter your Hive username and password.
  15. Once you see the message Connected to: Spark SQL, you can run queries to create a table and insert values into it.
    Example:
      0: jdbc:hive2://localhost:10000> CREATE DATABASE `sampletraveldata`;
      0: jdbc:hive2://localhost:10000> USE `sampletraveldata`;
      0: jdbc:hive2://localhost:10000> CREATE TABLE `employee_details` (`employee_id` INT, `employee_name` STRING);
      0: jdbc:hive2://localhost:10000> INSERT INTO TABLE `employee_details` VALUES (1, 'Mike Cannon-Brookes');
      0: jdbc:hive2://localhost:10000> SELECT * FROM `employee_details`;
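
Before starting the server, it can help to confirm that the HADOOP_HOME setup from step 3 is in place. The following is a minimal Python sketch under stated assumptions: `check_hadoop_home` is a hypothetical helper, not part of Spark or Helical Insight, and it looks for winutils.exe both directly in HADOOP_HOME and in a bin subfolder, because the layout depends on how the archive was extracted.

```python
import os
from pathlib import Path


def check_hadoop_home(env=None):
    """Return a list of problems found with the HADOOP_HOME setup (empty if none).

    Hypothetical helper: it only inspects the environment and the file system.
    """
    env = os.environ if env is None else env
    home = env.get("HADOOP_HOME")
    if not home:
        return ["HADOOP_HOME is not set"]
    problems = []
    root = Path(home)
    if not root.is_dir():
        problems.append(f"HADOOP_HOME points to a missing folder: {home}")
    elif not any((root / rel).is_file() for rel in ("winutils.exe", "bin/winutils.exe")):
        # Assumption: winutils.exe sits either in the folder itself or in bin.
        problems.append(f"winutils.exe not found under {home}")
    return problems


if __name__ == "__main__":
    for message in check_hadoop_home() or ["HADOOP_HOME looks fine"]:
        print(message)
```

An empty list means the variable is set and winutils.exe was found; anything else points to a step above that should be repeated.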

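The server launch in step 8 and the wait for HiveServer2 in step 10 can also be scripted. This is a minimal Python sketch, not part of the original procedure: `build_thrift_server_command` and `is_port_open` are hypothetical helper names, and the flag values are simply the ones used in the command above. It only assembles the argument list and probes the Thrift port; it does not start Spark by itself.

```python
import socket

THRIFT_SERVER_CLASS = "org.apache.spark.sql.hive.thriftserver.HiveThriftServer2"


def build_thrift_server_command(port=10000, driver_memory="512m", master="local"):
    """Return the spark-submit argument list from step 8 for the given port.

    Hypothetical helper: it mirrors the flags used in this article and nothing more.
    """
    return [
        "spark-submit",
        "--verbose",
        "--class", THRIFT_SERVER_CLASS,
        "--hiveconf", f"hive.server2.thrift.port={port}",
        "--driver-memory", driver_memory,
        f"--master={master}",
    ]


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print(" ".join(build_thrift_server_command(port=10000)))
    print("Thrift port reachable:", is_port_open("localhost", 10000))
```

If the port check reports False after the server has been started, Beeline in step 12 will not be able to connect either, so it is a quick way to tell a startup problem from a credentials problem.
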
We will now see how to fetch data from Spark using the steps below.

Steps to create a datasource from the Spark Hive Thrift Server in the Helical Insight application:

  1. Open the Helical Insight application and click Default User or Default Admin.
  2. Open the Datasource page and look for the Hive datasource (Spark can be connected through the Hive datasource).
  3. Click Create and enter your Hive details as shown in the image below. Click Test Connection and, if the test succeeds, click Save Datasource.
  4. Create metadata from this datasource: click the schema, select the tables, right-click and choose Add to Metadata, then save the metadata.
  5. Open the Reports page and connect to the same metadata: select it, right-click and choose Use This Metadata.
  6. Drag columns from the selected table to the selection area and generate a report with any of the available visualization types.
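
The Hive details entered when creating the datasource typically come down to a JDBC driver class and a JDBC URL. Below is a minimal Python sketch of how that URL is put together, assuming the host, port and database used earlier in this article. `hive_jdbc_url` is a hypothetical helper, and the exact field labels of the Helical Insight form are not reproduced here.

```python
# Driver class of the Hive JDBC client; the Spark Thrift Server accepts the same jdbc:hive2 URLs.
HIVE_JDBC_DRIVER = "org.apache.hive.jdbc.HiveDriver"


def hive_jdbc_url(host="localhost", port=10000, database=""):
    """Build a jdbc:hive2 URL like the one passed to !connect in Beeline."""
    url = f"jdbc:hive2://{host}:{port}"
    if database:
        # The database name is optional; without it the default database is used.
        url += f"/{database}"
    return url


if __name__ == "__main__":
    print(HIVE_JDBC_DRIVER)
    print(hive_jdbc_url())
    print(hive_jdbc_url(database="sampletraveldata"))
```

Testing the same URL in Beeline first is a simple way to confirm that the host and port are correct before entering them in the datasource form.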

In this way, the Helical Insight BI tool can connect to the Spark Thrift Server (Hive) and fetch data from Spark.

Helical Insight’s self-service capabilities are a force to reckon with. You can simply drag and drop columns, add filters, apply aggregate functions if required, and create reports and dashboards on the fly. For advanced users, the self-service component can be extended with JavaScript, HTML, HTML5, CSS, CSS3 and AJAX. These customizations allow you to create dynamic reports and dashboards. You can also add new charts inside the self-service component, add new kinds of aggregate functions and customize it using our APIs.
Through the simple browser-based interface of its Canned Reporting module, Helical Insight also lets you create pixel-perfect, printer-friendly, document-style reports such as invoices, P&L statements and balance sheets.
If you have a product built on any platform such as .NET, Java, PHP or Ruby, you can easily embed Helical Insight within it using iframes or web services, adding value quickly through instant visualization of data.
Because Helical Insight is a 100% browser-based BI tool, you can connect to your database and analyse data from any location and device. There is no need to download or install heavy, memory-consuming developer tools; all you need is a browser. We are battle-tested on most commonly used browsers.
We have organization-level security, where the Superadmin can create, delete and modify roles. Dashboards and reports can be added to that organization. This ensures multitenancy.
A first-of-its-kind open-source BI framework, Helical Insight is completely API-driven. This allows you to add functionality, including but not limited to a new export type, a new datasource type, core functionality extensions and new ad hoc charts, wherever and whenever you wish, using your own in-house developers.
It handles huge volumes of data effectively. Caching, pagination, load balancing and in-memory processing not only provide a smooth experience but also avoid burdening the database server more than required. Effective use of computing power allows complex calculations on big data, even on smaller machines for personal use. Filtering, sorting, cube analysis and inter-panel communication on dashboards all run at lightning speed. These features make it one of the best open-source Business Intelligence solutions on the market.
With an advanced NLP algorithm, business users can simply ask questions such as “show me sales of last quarter” or “average monthly sales of my products”. This gives power to users who have no knowledge of query languages or the underlying data architecture.
Our application is compatible with almost all databases, whether RDBMS, columnar databases or even flat files such as spreadsheets or CSV files. You can even connect to your own custom database via a JDBC connection. Further, the database connection can be switched dynamically based on the logged-in user, their organization or other parameters. So all your clients can use the same reports and dashboards without worrying about any data security breach.
Our application can be installed on an in-house server, where you have full control of your data and its security, or in the cloud, where it is accessible to a larger audience without the overhead of maintaining servers. One solution that works for all.
Different companies have different business processes that existing BI tools do not encompass. Helical Insight lets you design your own workflows and specify which functional module of BI gets triggered.