We use Trino in many of our use cases, for example to connect to MongoDB or to query multiple databases at once from the open source BI product Helical Insight. A multi-node Trino setup provides scalable, high-performance, fault-tolerant data processing with efficient resource utilization and greater query parallelism.
In this blog we cover how to set up Trino across multiple nodes in a load-balanced mode so that SQL query execution can happen in parallel. This improves performance and adds fault tolerance: if one server goes down, user requests will still be processed.
Setting Up a Multinode Trino Cluster:
Setting up a multinode Trino cluster involves configuring a coordinator node and one or more worker nodes. Here's a step-by-step guide for setting up Trino on two servers: one as the coordinator and the other as the worker.
1. Install Trino on server1
Please refer to the following documentation link for how to install Trino:
https://helicaltech.com/installing-trino-in-linux-server/
2. Configure Trino Coordinator:
Create the following configuration files in the etc directory of your Trino installation.
node.properties:
node.environment=production
node.id=ffffffff-ffff-ffff-ffff-ffffffffffff
node.data-dir=/var/trino/data
Note: The node.id must be unique for every node in the cluster, for example a UUID-style hexadecimal value.
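One way to get a unique node.id for each server is to generate a random UUID. The following is a minimal sketch, not part of the original setup: the helper names new_node_id and render_node_properties are hypothetical, and the environment and data directory are simply the values used above.

```python
import uuid


def new_node_id() -> str:
    """Return a random UUID string that can serve as a unique node.id."""
    return str(uuid.uuid4())


def render_node_properties(environment: str, data_dir: str) -> str:
    """Build node.properties content with a freshly generated node.id."""
    lines = [
        f"node.environment={environment}",
        f"node.id={new_node_id()}",
        f"node.data-dir={data_dir}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Values taken from the example above; run once per server.
    print(render_node_properties("production", "/var/trino/data"), end="")
```

Run it once on each server and save the output as etc/node.properties, so that no two nodes share an id.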
jvm.config:
-server
-Xmx4G
-XX:InitialRAMPercentage=80
-XX:MaxRAMPercentage=80
-XX:G1HeapRegionSize=32M
-XX:+ExplicitGCInvokesConcurrent
-XX:+ExitOnOutOfMemoryError
-XX:+HeapDumpOnOutOfMemoryError
-XX:-OmitStackTraceInFastThrow
-XX:ReservedCodeCacheSize=512M
-XX:PerMethodRecompilationCutoff=10000
-XX:PerBytecodeRecompilationCutoff=10000
-Djdk.attach.allowAttachSelf=true
-Djdk.nio.maxCachedBufferSize=2000000
-XX:+UnlockDiagnosticVMOptions
-XX:+UseAESCTRIntrinsics
config.properties:
coordinator=true
node-scheduler.include-coordinator=true
http-server.https.enabled=false
http-server.http.port=8080
query.max-memory=5GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://:8080
internal-communication.https.required=false
internal-communication.shared-secret=Testing@Helical2024
Note: The shared-secret must be identical in the coordinator and worker configuration files.
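The secret shown above is a fixed, human-readable string. If you would rather use a long random value, the sketch below generates one with Python's standard library. The function name generate_shared_secret and the 512-byte length are assumptions of this example, not requirements taken from the setup above.

```python
import base64
import secrets


def generate_shared_secret(num_bytes: int = 512) -> str:
    """Return a base64-encoded secret built from num_bytes random bytes."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


if __name__ == "__main__":
    # Generate the value once, then paste the same line into
    # config.properties on the coordinator and on every worker.
    print(f"internal-communication.shared-secret={generate_shared_secret()}")
```

Generate the value a single time and copy it to all nodes; generating it separately on each server would give each node a different secret.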
log.properties:
io.trino=INFO
3. Start the coordinator Trino server
Worker Configuration:
1. Install Trino on this server as well.
2. Configure Trino Workers:
Create the following configuration files in the etc directory of your Trino installation on each worker node.
node.properties:
node.environment=production
node.id=9b43ef75-8cf0-4e4f-8aca-27eaf75fa83e
node.data-dir=/var/trino/data
Note: The node.id must be unique for every node in the cluster, so it must differ from the coordinator's node.id.
jvm.config:
-server
-Xmx4G
-XX:InitialRAMPercentage=80
-XX:MaxRAMPercentage=80
-XX:G1HeapRegionSize=32M
-XX:+ExplicitGCInvokesConcurrent
-XX:+ExitOnOutOfMemoryError
-XX:+HeapDumpOnOutOfMemoryError
-XX:-OmitStackTraceInFastThrow
-XX:ReservedCodeCacheSize=512M
-XX:PerMethodRecompilationCutoff=10000
-XX:PerBytecodeRecompilationCutoff=10000
-Djdk.attach.allowAttachSelf=true
-Djdk.nio.maxCachedBufferSize=2000000
-XX:+UnlockDiagnosticVMOptions
-XX:+UseAESCTRIntrinsics
config.properties:
coordinator=false
http-server.http.port=8080
discovery.uri=http://:8080
internal-communication.shared-secret=Testing@Helical2024
Note: The shared-secret should be the same in both the coordinator and worker configuration files.
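Because the coordinator and worker files have to agree on a few values, a quick consistency check can catch copy-paste mistakes before starting the servers. The sketch below is a hypothetical helper, not a Trino tool: parse_properties and check_pair are names invented for this example, and the rule that both files carry the same discovery.uri is an assumption that mirrors the configuration in this guide.

```python
import sys


def parse_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_pair(coordinator_text: str, worker_text: str) -> list[str]:
    """Return a list of problems found between the two config.properties files."""
    coord = parse_properties(coordinator_text)
    worker = parse_properties(worker_text)
    problems: list[str] = []
    if coord.get("coordinator") != "true":
        problems.append("coordinator file should set coordinator=true")
    if worker.get("coordinator") != "false":
        problems.append("worker file should set coordinator=false")
    secret_key = "internal-communication.shared-secret"
    if coord.get(secret_key) != worker.get(secret_key):
        problems.append("shared-secret differs between coordinator and worker")
    # Assumption: as in this guide, both files point at the same discovery.uri.
    if coord.get("discovery.uri") != worker.get("discovery.uri"):
        problems.append("discovery.uri differs between coordinator and worker")
    return problems


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("usage: check_pair.py COORDINATOR_CONFIG WORKER_CONFIG")
    else:
        with open(sys.argv[1]) as c, open(sys.argv[2]) as w:
            for problem in check_pair(c.read(), w.read()) or ["no problems found"]:
                print(problem)
```

Copy both config.properties files to one machine and pass their paths as the two arguments; an empty problem list means the checked values line up.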
log.properties:
io.trino=INFO
3. Start the worker Trino server
We have now configured two Trino instances.
Verification and Management:
Access the Trino web UI at http://:8080 and ensure all workers are listed and active.
Note: The Trino port (8080 in this setup) must be open between the two servers so that they can communicate.
The web UI should show 2 active workers.
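If the web UI does not load or a worker is missing, it can help to test the basics from the command line. The sketch below is a rough aid under a few assumptions: the function names are hypothetical, the host is passed in by you, and it only checks that the port is reachable and that the node's /v1/info endpoint reports it has finished starting. It does not count workers; that still comes from the web UI.

```python
import json
import socket
import sys
import urllib.request


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def fetch_info(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch and decode the JSON document served at <base_url>/v1/info."""
    with urllib.request.urlopen(f"{base_url}/v1/info", timeout=timeout) as resp:
        return json.load(resp)


def is_ready(info: dict) -> bool:
    """A node is treated as ready once it reports starting=false."""
    return info.get("starting") is False


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: check_trino.py HOST")
    elif not port_is_open(sys.argv[1], 8080):
        print("port 8080 is not reachable; check the firewall between the servers")
    else:
        info = fetch_info(f"http://{sys.argv[1]}:8080")
        print("ready" if is_ready(info) else "still starting")
```

Run it from the worker against the coordinator's address and from the coordinator against the worker's address, so the port is tested in both directions.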
Please reach out to support@helicalinsight.com in case of any questions.