Starting with Helical Insight Enterprise Edition 5.2.2, an in-built driver is provided to connect to and use flat files such as Excel, CSV, JSON, and Parquet. You can therefore connect to and use CSV files directly within Helical Insight, without needing any middleware (such as Drill).

The steps below explain how to connect to and use CSV files:

1. Log in to your Helical Insight application and go to the “Data Sources” module.

2. In the “Data Sources” module, you will see many options for connecting to different databases. From these, choose “Flatfile csv”. You can also use the search option at the top right to find this driver. A reference image is provided below:

[Image: Usage of Flat files(CSV) in Helical Insight]

3. Once you click on ‘Flatfile csv’ and choose ‘Create’, a popup will open. A reference image is provided below:

[Image: Usage of Flat files(CSV) in Helical Insight]

4. Host: It can be provided in two ways:

a. Upload the file: If you upload a CSV file using the upload option, the file is uploaded to a specific location on the Helical Insight server and the Host field is filled in automatically.

b. Manually provide the CSV file path: In this case, the file must be present on the same server where Helical Insight is installed. Enter the file path in the Host field, as in the example below.

Example of a Linux path: /usr/local/traveldata.csv

We recommend using the file upload option.
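
If you do provide the path manually, it can help to confirm on the server that the file exists and is readable before entering it. The following is a minimal Python sketch and is not part of Helical Insight; the function name check_csv_path is hypothetical, and the demo writes a temporary file with made-up columns that stands in for a real path such as /usr/local/traveldata.csv.

```python
import csv
import os
import tempfile


def check_csv_path(path):
    """Return the first row of the CSV at `path`; raise if it is not a file."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No file at {path}")
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        # An empty file yields an empty list instead of raising StopIteration
        return next(reader, [])


# Demo: a temporary file with hypothetical columns stands in for the real CSV
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "traveldata.csv")
    with open(demo_path, "w", newline="", encoding="utf-8") as handle:
        handle.write("source,destination,cost\nDelhi,Mumbai,4500\n")
    first_row = check_csv_path(demo_path)
    print(first_row)
```

Run the check as the same operating system user that runs the Helical Insight server, since file permissions can differ between users.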

5. In the “Datasource Name” field, provide any name of your choice. The connection will be saved and listed under this name.

6. In the ‘Configuration Editor’, provide the configuration details. These details generally vary with the type of flat file being used. In most cases the connection to a CSV file works as is, without any changes here: you can provide the connection name, test the connection, and save it. You can then “Create Metadata” and continue using Helical Insight.

ADVANCED USE CASES:

Many other configuration options are available for more advanced use cases. In most cases they do not need to be changed. Some of them are described below.

7. Sample configuration details are available via the icon next to the ‘Configuration Editor’. An image is provided below.

[Image: Usage of Flat files(CSV) in Helical Insight]

Clicking this icon opens a popup with sample configurations for the different flat file types.

[Image: Usage of Flat files(CSV) in Helical Insight]

A “Copy to clipboard” option lets you copy the content directly. Make the necessary changes based on your details.

Explanation of configuration options:

1. tableName:

	Value: "mydata"
	Explanation: This specifies the name of the table that will be created or referenced when you create the metadata. In the example above, the table will be created with the name mydata.

2. strategy:

	Value: "in-memory"
	Explanation: Specifies the data processing strategy.
	"in-memory": Data is processed and held temporarily in memory, without being persisted to a physical location (recommended approach).
	"persistent": Data is persisted to a physical database file. This strategy is recommended when the files are very large.

3. persistentLocation:

	Value: "" (empty string)
	Explanation: Indicates the location for persistent storage when the strategy is set to "persistent". For the "in-memory" strategy, this setting can be ignored.
	If the strategy is "persistent", a valid file path must be specified here (e.g., C:\\dbs\\test.duckdb).
	When left empty, no persistent storage is configured.

4. extensions:

	Value: ["excel", "spatial"]
	Explanation: Specifies the supported file types or processing extensions. For CSV, include either "excel" or "spatial".
	"excel" enables the configuration to handle Excel files.
	"spatial" may indicate support for spatial data.

5. config:

This section contains additional configuration details for processing the flat file.

a. layer:

	Value: ["sheet 1", "sheet 2"]
	Explanation: Specifies the sheet(s) in the Excel file to be processed. An Excel file can have multiple sheets, and you can list the names of the sheets to be used, as in the value above. Any sheet whose name is not listed is ignored.
	This option is not needed for CSV.

b. open_options:

	Value: ["HEADERS=FORCE", "FIELD_TYPES=AUTO"]
	Explanation: These are options for interpreting and processing the data.
	"HEADERS=FORCE" ensures that the first row is treated as headers, even if the file does not explicitly mark it as such.
	"FIELD_TYPES=AUTO" enables automatic detection and assignment of field types (e.g., string, integer, date).

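The rules described above for "strategy" and "persistentLocation" can be sketched as a small Python helper that assembles the configuration and prints it as JSON for pasting into the Configuration Editor. This is only an illustration: the helper name build_flatfile_config is hypothetical, and the checks encode nothing beyond what is stated above (a path is required when the strategy is "persistent").

```python
import json


def build_flatfile_config(table_name, strategy="in-memory",
                          persistent_location="", extensions=("spatial",),
                          options=None):
    """Assemble a configuration dict using the keys described above."""
    if strategy not in ("in-memory", "persistent"):
        raise ValueError(f"Unknown strategy: {strategy!r}")
    if strategy == "persistent" and not persistent_location:
        raise ValueError("persistentLocation is required for 'persistent'")
    return {
        "tableName": table_name,
        "strategy": strategy,
        "persistentLocation": persistent_location,
        "extensions": list(extensions),
        "config": dict(options or {}),
    }


# Persistent example using the sample path mentioned above
config = build_flatfile_config(
    "mydata",
    strategy="persistent",
    persistent_location="C:\\dbs\\test.duckdb",
)
config_text = json.dumps(config, indent=2)
print(config_text)
```

Note that json.dumps writes each backslash in the Windows path as a double backslash, which matches the form shown in the persistentLocation example.
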
8. In this example, we have uploaded the “TravelData.csv” file using the ‘Upload’ option and provided the following configuration in the Configuration Editor:

{
  "tableName": "TravelData",
  "strategy": "in-memory",
  "persistentLocation": "",
  "extensions": [
    "spatial"
  ],
  "config": {
    "auto_detect": true,
    "parallel": true
  }
}
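
To get a rough feel for the kind of inference that "auto_detect" refers to, the sketch below uses Python's standard csv.Sniffer on a small in-memory sample. This is not the Helical Insight driver and it may not reach the same conclusions as the driver; the sample rows are made up, and mapping the result onto the "delim" and "header" options is an assumption for illustration.

```python
import csv

# Hypothetical sample: semicolon-separated values with a header row
sample = (
    "city;country;fare\n"
    "Pune;India;1200\n"
    "Paris;France;850\n"
    "Tokyo;Japan;2300\n"
)

sniffer = csv.Sniffer()
# Restrict the candidate delimiters to a few common ones
dialect = sniffer.sniff(sample, delimiters=",;\t|")
has_header = sniffer.has_header(sample)

# Values one might set explicitly if auto detection guesses wrongly
suggested = {"delim": dialect.delimiter, "header": has_header}
print(suggested)
```

If a file is read incorrectly with auto detection enabled, setting such options explicitly inside "config" is one thing to try.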

9. Click on ‘Test Connection’. If there are no issues with the configuration, it shows the message “The connection test is successful”. Then click on ‘Save Datasource’.

[Image: Usage of Flat files(CSV) in Helical Insight]

10. Go to the metadata page and expand the Flatfile csv data source. Then expand the CSV file connection. Expand ‘memory’ and then ‘main’; the table name provided in the data source connection configuration is shown there. Drag the table into the metadata.

[Image: Usage of Flat files(CSV) in Helical Insight]

11. Create a report using the metadata and save it.

[Image: Usage of Flat files(CSV) in Helical Insight]

NOTE: Further options can be added to the configuration. The options listed below can be used for reference.

Name | Description | Type | Default
---- | ----------- | ---- | -------
all_varchar | Option to skip type detection for CSV parsing and assume all columns to be of type VARCHAR. | BOOL | false
allow_quoted_nulls | Option to allow the conversion of quoted values to NULL values. | BOOL | true
auto_detect | Enables auto detection of CSV parameters. | BOOL | true
auto_type_candidates | Allows specifying types for CSV column type detection. VARCHAR is always included as a fallback. | TYPE[] | default types
columns | Specifies the column names and types within the CSV file (e.g., {'col1': 'INTEGER', 'col2': 'VARCHAR'}). Implies no auto detection. | STRUCT | (empty)
compression | The compression type for the file. Auto-detected by default (e.g., t.csv.gz -> gzip, t.csv -> none). | VARCHAR | auto
dateformat | Specifies the date format to use when parsing dates. | VARCHAR | (empty)
decimal_separator | The decimal separator of numbers. | VARCHAR | .
delim or sep | Specifies the character that separates columns within each row. | VARCHAR | ,
escape | Specifies the string used to escape data character sequences matching the quote value. | VARCHAR |
filename | Whether an extra filename column should be included in the result. | BOOL | false
force_not_null | Do not match the specified columns' values against the NULL string. | VARCHAR[] | []
header | Specifies that the file contains a header line with the names of each column. | BOOL | false
hive_partitioning | Whether or not to interpret the path as a Hive partitioned path. | BOOL | false
ignore_errors | Option to ignore any parsing errors encountered and skip the rows with errors. | BOOL | false
max_line_size | The maximum line size in bytes. | BIGINT | 2097152
names | The column names as a list. | VARCHAR[] | (empty)
new_line | Sets the new line character(s) in the file. Options are '\r', '\n', or '\r\n'. | VARCHAR | (empty)
normalize_names | Whether column names should be normalized by removing non-alphanumeric characters. | BOOL | false
null_padding | Pads the remaining columns on the right with null values if a row lacks columns. | BOOL | false
nullstr | Specifies the string or list of strings that represent a NULL value. | VARCHAR or VARCHAR[] | (empty)
parallel | Whether or not the parallel CSV reader is used. | BOOL | true
quote | Specifies the quoting string to be used when a data value is quoted. | VARCHAR |
sample_size | The number of sample rows for auto detection of parameters. | BIGINT | 20480
skip | The number of lines at the top of the file to skip. | BIGINT | 0
timestampformat | Specifies the date format to use when parsing timestamps. | VARCHAR | (empty)
types or dtypes | The column types as either a list (by position) or a struct (by name). | VARCHAR[] or STRUCT | (empty)
union_by_name | Whether the columns of multiple schemas should be unified by name rather than by position. | BOOL | false
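
Since a mistyped option name is easy to miss in a hand-edited configuration, the following Python sketch checks option names against the list above before merging them into the "config" block. It assumes that these options are placed inside "config" in the same way as "auto_detect" and "parallel" in the sample configuration; the helper name merge_csv_options and the chosen override values are hypothetical.

```python
import json

# Option names copied from the list above
KNOWN_CSV_OPTIONS = {
    "all_varchar", "allow_quoted_nulls", "auto_detect", "auto_type_candidates",
    "columns", "compression", "dateformat", "decimal_separator", "delim", "sep",
    "escape", "filename", "force_not_null", "header", "hive_partitioning",
    "ignore_errors", "max_line_size", "names", "new_line", "normalize_names",
    "null_padding", "nullstr", "parallel", "quote", "sample_size", "skip",
    "timestampformat", "types", "dtypes", "union_by_name",
}


def merge_csv_options(base_config, overrides):
    """Return a copy of base_config with overrides merged into "config"."""
    unknown = sorted(set(overrides) - KNOWN_CSV_OPTIONS)
    if unknown:
        raise KeyError(f"Unknown CSV option(s): {', '.join(unknown)}")
    merged = dict(base_config)
    merged["config"] = {**base_config.get("config", {}), **overrides}
    return merged


# The sample configuration from the TravelData example
base = {
    "tableName": "TravelData",
    "strategy": "in-memory",
    "persistentLocation": "",
    "extensions": ["spatial"],
    "config": {"auto_detect": True, "parallel": True},
}

# Hypothetical overrides: semicolon-separated file with a header row
updated = merge_csv_options(base, {"delim": ";", "header": True})
print(json.dumps(updated, indent=2))
```

The printed JSON can then be pasted into the Configuration Editor and verified with Test Connection.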
