To connect to flat files and MongoDB, users must first configure Apache Drill as a middleware. Once the middleware is configured, the icons for flat files appear on the Data Sources page. Apache Drill must be installed and running before it can be configured as a middleware. The configuration steps are as follows:
Step 1: Go to the Management tab on the Homepage.
Step 2: Click Enable. This enables Drill as a middleware through which users can connect to flat files (such as CSV, TSV, Sequence, and Parquet) and MongoDB. Once the option is enabled, the icons for flat files and MongoDB appear on the Data Sources page.
After you click Enable, the middleware configuration screen is displayed.
Step 3: Click the More link above the URL and enter the following connection details for the middleware:
- Host: The IP address of the machine where Drill is running.
- Port: The web console port of Drill, used to access it from a browser or client tool. The default is 8047.
- Database/User Port: The JDBC connection port for Drill. The default is 31010.
- Extra Parameters: Additional configuration to include in the JDBC URL. This field is optional.
- Security Enabled: If your connection to the middleware requires a username and password, toggle this button and refer to Step 4. Otherwise, skip Step 4.
- Distributed Mode: Helical Insight supports the distributed mode of Apache Drill. If you are running Drill in this mode, toggle this button and refer to Step 5.
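The fields above correspond to two addresses: the Drill web console URL and the JDBC connection URL. The following is a minimal sketch of how they fit together, assuming Drill's direct-connection JDBC form `jdbc:drill:drillbit=<host>:<port>` with extra parameters appended as `;key=value` pairs. The function names and the sample host are hypothetical, and Helical Insight may assemble its URL differently.

```python
# Illustrative helpers only; names and the sample host are assumptions,
# not part of Helical Insight or Apache Drill.

def web_console_url(host: str, port: int = 8047) -> str:
    """Address of the Drill web console (Host + Port fields)."""
    return f"http://{host}:{port}"


def jdbc_url(host: str, user_port: int = 31010, extra_params: str = "") -> str:
    """Direct-to-Drillbit JDBC URL (Host + Database/User Port + Extra Parameters)."""
    url = f"jdbc:drill:drillbit={host}:{user_port}"
    if extra_params:
        # Extra parameters are optional; strip a leading ';' to avoid ';;'.
        url += ";" + extra_params.lstrip(";")
    return url


print(web_console_url("192.0.2.10"))
print(jdbc_url("192.0.2.10", extra_params="schema=dfs"))
```

Leaving `extra_params` empty yields the plain URL, which matches the note that the Extra Parameters field is optional.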
Step 4: Click the Security Enabled toggle button and enter the username and password used to connect to the middleware.
- Username: The username required to access the middleware.
- Password: The password corresponding to the username.
- Security Check Type: The endpoint for form authentication. Default value is /j_security_check.
- Security Mode: The type of security mechanism used by Apache Drill (such as plain, SPNEGO, or MapR). The default is plain.
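To show what the Security Check Type endpoint is used for, here is a minimal sketch of a form-authentication request against the Drill web console. It assumes the standard form-login field names `j_username` and `j_password`; the function name, host, and credentials are hypothetical. The request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_login_request(base_url: str, username: str, password: str,
                        security_check: str = "/j_security_check") -> Request:
    """Build (but do not send) a form-login POST for the Drill web console."""
    # Form fields are URL-encoded into the request body.
    body = urlencode({"j_username": username, "j_password": password}).encode("utf-8")
    return Request(
        base_url.rstrip("/") + security_check,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical host and credentials, for illustration only.
req = build_login_request("http://192.0.2.10:8047", "drill_user", "s3cret")
print(req.get_method(), req.full_url)
```

To actually log in, the request would be sent through an opener created with `urllib.request.build_opener(urllib.request.HTTPCookieProcessor())` so that the session cookie is kept for later calls.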
Step 5: If you are using Apache Drill in distributed mode, toggle the Distributed Mode button and enter the ZooKeeper port.
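In distributed mode, clients usually locate Drillbits through ZooKeeper rather than connecting to one directly. The sketch below shows the shape of such a JDBC URL, assuming ZooKeeper's default client port 2181 and Drill's default ZooKeeper root (`drill`) and cluster ID (`drillbits1`). The function name and host are hypothetical; check the values in your own `drill-override.conf` before relying on them.

```python
def jdbc_url_distributed(zk_host: str, zk_port: int = 2181,
                         zk_root: str = "drill",
                         cluster_id: str = "drillbits1") -> str:
    """JDBC URL that resolves Drillbits via ZooKeeper (distributed mode)."""
    # zk_root and cluster_id are assumed defaults; a cluster may override them.
    return f"jdbc:drill:zk={zk_host}:{zk_port}/{zk_root}/{cluster_id}"


print(jdbc_url_distributed("192.0.2.20"))
```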
Step 6: If you use SSL/HTTPS to connect to the middleware, toggle the SSL/HTTPS button to green. The URL is then updated to start with https://.
Step 7: Enter the details of the Storage Implementation. Apache Drill can be used with three storage implementation types:
- hdfs: Use hdfs storage to upload your flat files into the Hadoop ecosystem. Hadoop must be up and running.
  - Hdfs Host: The IP address of the name node server.
  - Hdfs Port: The data node port.
  - Data Warehouse Path: The path that will be created on the Hadoop data node. It must have read and write access.
- sftp: Use sftp when Drill (the middleware) and Helical Insight are installed on separate servers. Files are uploaded to the server where Drill is running. If Drill is installed on a Windows machine, use a Linux-style path for the Data Warehouse Path. Example: /C:/Users/Helical/your/path/to/data warehouse
- standalone: Use standalone when the middleware and Helical Insight are installed on the same machine. The data warehouse path is created inside the System directory of the hi-repository folder, and all uploaded files are saved there.
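For the sftp case with Drill on Windows, the Linux-style path can be derived mechanically from a native Windows path. The following is a small sketch of that conversion; the helper name and sample path are hypothetical, and it simply follows the `/C:/...` pattern shown in the example above.

```python
from pathlib import PureWindowsPath


def to_linux_style(windows_path: str) -> str:
    """Convert a path such as C:\\Users\\Helical\\data to /C:/Users/Helical/data."""
    p = PureWindowsPath(windows_path)
    if not p.drive:
        raise ValueError("expected an absolute Windows path with a drive letter")
    # as_posix() swaps backslashes for forward slashes; the leading '/'
    # produces the Linux-style form used for the Data Warehouse Path.
    return "/" + p.as_posix()


print(to_linux_style(r"C:\Users\Helical\data warehouse"))
```

Because `PureWindowsPath` does not touch the file system, the conversion can be run on any operating system.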
Step 8: Click Save to save the configuration.
For more information, email support@helicalinsight.com.