Approach
Helical Insight (BI) -> Fire SQL query -> Map Cassandra with Hive (one-time activity) -> Fetch data from Cassandra just in time
This approach involves creating mapping tables in Spark's Hive metastore that have a one-to-one mapping to the Cassandra DB tables. Helical Insight can connect to this Hive metastore using Spark's Thrift Server, which accepts JDBC connections. SQL queries can then be fired through this connection to fetch the data required for the reports.
Configure Spark to connect to Cassandra
Open the Spark defaults config file and add the Cassandra connection details given below.
Path: E:\Helical\SPARK\spark-2.0.2-bin-hadoop2.7\conf\spark-defaults.conf
Note: Your Spark installation directory may differ.
Add below lines:
spark.cassandra.connection.host <cassandra host ip>
spark.cassandra.connection.port <cassandra port>
spark.cassandra.auth.username <cassandra login username>
spark.cassandra.auth.password <cassandra login password in plain text>
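For illustration, a filled-in spark-defaults.conf fragment might look like the following. The host, port and credentials below are placeholders, not values from this setup — substitute your own; 9042 is Cassandra's default native transport port:

```
spark.cassandra.connection.host  192.168.1.50
spark.cassandra.connection.port  9042
spark.cassandra.auth.username    cassandra_user
spark.cassandra.auth.password    cassandra_pass
```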
Configure Spark Thrift Server
1. Start the Thrift Server with the DataStax Spark Cassandra Connector on its classpath.
Sample command:
<spark-home>\bin> spark-submit --verbose --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --hiveconf hive.server2.thrift.port=20000 --jars \spark-cassandra-connector-2.0.0-M2-s_2.11.jar --driver-memory 512m
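On Linux installations, the same server can typically be started with the start-thriftserver.sh helper script that ships in Spark's sbin directory, passing the connector jar and Thrift port the same way (the jar path below is a placeholder — use the actual location of your connector jar):

```
./sbin/start-thriftserver.sh --hiveconf hive.server2.thrift.port=20000 --jars /path/to/spark-cassandra-connector-2.0.0-M2-s_2.11.jar --driver-memory 512m
```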
2. Create mapping tables in SPARK’s Hive Metastore
a. Connect to the Spark SQL Thrift Server using beeline or any other JDBC tool, pointing it at the Thrift port configured above:
./bin/beeline -u jdbc:hive2://localhost:20000
b. Create the Hive database by running the commands below.
Here we create the TravelData database and switch to it.
1) Create database TravelData;
2) Use TravelData;
Note: Your database name may differ.
c. Hive Tables Creation and mapping Cassandra tables to Hive Tables
Here we create three tables: employee_details, meeting_details and traveldetails.
employee_details table creation:
create table employee_details using org.apache.spark.sql.cassandra options (cluster 'Test Cluster', keyspace 'TravelData', table 'employee_details');
meeting_details table creation:
create table meeting_details using org.apache.spark.sql.cassandra options (cluster 'Test Cluster', keyspace 'TravelData', table 'meeting_details');
traveldetails table creation:
create table traveldetails using org.apache.spark.sql.cassandra options (cluster 'Test Cluster', keyspace 'TravelData', table 'traveldetails');
Note: Your table names may differ.
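Once the mapping tables exist, they can be queried with plain SQL through the same connection, including joins across Cassandra tables — this is exactly what the report queries will do later. A quick sanity check from beeline might look like the following (the emp_id column is a hypothetical example; use the actual columns of your tables):

```
select count(*) from employee_details;
select e.emp_id, m.* from employee_details e join meeting_details m on e.emp_id = m.emp_id;
```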
Create a data source in Helical Insight
1. Go to the location "C:\Helical Insight\hi-repository\System\Drivers" and add the Hive JDBC4 jar.
Download the Hive jar (click here to download it).
Note: The location may vary based on the HI installation directory.
2. Go to the data source creation page.
3. Select the managed data source.
4. Give the data source a name.
5. Select the driver name as org.apache.hive.jdbc.HiveDriver.
6. Enter the Hive JDBC URL in the URL input box.
7. Enter the username in the username input box.
8. Enter the password in the password input box.
9. Test the connection.
10. If the test is successful, save the data source.
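As an illustration of step 6, Hive JDBC URLs generally follow the pattern jdbc:hive2://<host>:<port>/<database>. With the Thrift port configured earlier and the TravelData database from this example, the URL would look like the following (the host name is a placeholder for your Spark machine):

```
jdbc:hive2://<spark-host>:20000/TravelData
```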
Creating the metadata in Helical Insight
i. Here we have to create the metadata manually. A sample metadata file is provided below.
Click here to download the sample metadata.
ii. Edit this metadata file, find the connectionId, and change it to your data source id.
iii. Change the table names and column names to match your own tables and columns, with the appropriate type and default function for each field type, and delete any unnecessary tables and columns.
iv. Save this metadata in the hi-repository.
Location: C:\Helical Insight\hi-repository\Cassandra Metadata
Note: The location may vary based on the HI installation directory.
v. Now go to the metadata edit page.
vi. Select the metadata we created by double-clicking on it.
vii. You are now on the metadata page, where you can define aliases/joins/views/security conditions.
viii. Save the metadata.
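The connectionId edit in step ii can also be scripted when many metadata files need the same change. The sketch below is a hypothetical helper, assuming the metadata file is XML with a <connectionId>...</connectionId> element — inspect your actual sample file and adapt the pattern if the id is stored differently (for example, as an attribute):

```python
import re

def set_connection_id(metadata_text: str, new_id: str) -> str:
    """Replace the value of a <connectionId> element with new_id.

    Assumes an XML metadata file containing a <connectionId> element;
    this is an illustrative sketch, not the documented metadata schema.
    """
    return re.sub(r"(<connectionId>)[^<]*(</connectionId>)",
                  r"\g<1>" + new_id + r"\g<2>",
                  metadata_text)

# Hypothetical fragment of a metadata file:
sample = "<connection><connectionId>101</connectionId></connection>"
print(set_connection_id(sample, "205"))
# -> <connection><connectionId>205</connectionId></connection>
```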
Creation of the Adhoc report
i. Click on Report and then click on Create Report.
ii. A file browser opens to select the metadata; select it by double-clicking on it.
iii. You are now on the report creation page, where you can drag and drop columns into the selection area to create the report.
iv. Save the report for future use.