Load Balancing in Jaspersoft and Pentaho and Designing a Cluster
Load balancing is commonly used in BI infrastructure to improve scalability and availability. There are several ways to implement it; one is to run the BI servers (JasperReports Server or Pentaho Server) behind a load balancer. An important requirement is that all instances run the same edition (either all Community edition or all Enterprise edition). Preferably, the BI suite version number and the configuration are also identical on every instance.
A cluster is a group of servers. A properly designed cluster can support many users and organizations, avoid downtime, provide failover and load balancing, and leave room for future growth and enhancements. End users are not aware of this architecture; they keep accessing the same URL.
Jaspersoft relies on a sticky session mechanism, i.e. each user session is tied to one server, so if a user is switched from one server to another, the session is lost.
Load balancer: Hardware or software that spreads traffic across the servers in the cluster. End users access the application through a single web URL; the load balancer gauges the load on the different servers and distributes requests accordingly, so that users get the best speed and response time. It is recommended that the Pentaho or Jaspersoft servers run on identical hardware, which lets the load balancer work more effectively.
Different algorithms can be used for load balancing, such as round-robin or load-based distribution.
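To make the two algorithms named above concrete, here is a minimal Python sketch. It is only an illustration, not how any particular load balancer is implemented; the node names and the request counts are made up for the example.

```python
from itertools import cycle

# Hypothetical node names; a real deployment would list its own BI server hosts.
NODES = ["bi-node-1", "bi-node-2", "bi-node-3"]


def round_robin(nodes):
    """Return a function that hands out nodes in a fixed rotating order."""
    rotation = cycle(nodes)
    return lambda: next(rotation)


def least_loaded(active_requests):
    """Pick the node with the fewest active requests (ties go to the first listed)."""
    return min(active_requests, key=active_requests.get)


# Round-robin ignores load and simply rotates through the nodes.
pick = round_robin(NODES)
rr_order = [pick() for _ in range(5)]

# Load-based selection looks at the current (made-up) request counts.
load = {"bi-node-1": 12, "bi-node-2": 3, "bi-node-3": 7}
lb_choice = least_loaded(load)

print(rr_order)
print(lb_choice)
```

Round-robin is simple and works well when the servers have identical hardware, which is one reason identical hardware is recommended; load-based selection adapts when some requests (for example, large reports) are much heavier than others.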
Shared repository database:
It holds all the reports, folders, roles, users, security settings and other resources. Repository operations are thread-safe, which means many operations can run simultaneously. Internal locks also prevent conflicting operations.
The job scheduler is responsible for accessing and executing jobs on a predefined schedule. It also has a locking mechanism to make sure that a job is not triggered simultaneously.
The clocks on all nodes also need to be synchronized.
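The idea behind scheduler locking can be sketched as follows. This is a simplified illustration under our own assumptions, not the actual mechanism used by Jaspersoft or Pentaho: an in-memory SQLite database stands in for the shared repository database, and the table name, job name and node names are hypothetical. Each node tries to insert a claim for a given job and fire time, and the primary key lets only the first claim succeed.

```python
import sqlite3

# In-memory SQLite stands in for the shared repository database (an assumption).
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE job_claims ("
    " job_id TEXT, fire_time TEXT, node TEXT,"
    " PRIMARY KEY (job_id, fire_time))"
)


def try_claim(node, job_id, fire_time):
    """Return True if this node won the right to run the job at fire_time."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO job_claims (job_id, fire_time, node) VALUES (?, ?, ?)",
                (job_id, fire_time, node),
            )
        return True
    except sqlite3.IntegrityError:
        # Another node already claimed this firing, so this node skips it.
        return False


results = [
    try_claim("bi-node-1", "daily-sales-report", "2024-01-01T06:00"),
    try_claim("bi-node-2", "daily-sales-report", "2024-01-01T06:00"),
    try_claim("bi-node-2", "daily-sales-report", "2024-01-02T06:00"),
]
print(results)
```

Because the claim is keyed on the scheduled fire time, the nodes must agree on what time it is, which is why the clocks need to be synchronized.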
Session Management and Failover
Client session information is stored in memory. After a user logs in or a web services client sends a request with credentials, the session contains the user profile, such as organization and role membership, for use in enforcing permissions. For browser users, the session also stores information about the state of the web interface, for example the last folder viewed in the repository, the last search term, or the data entered in a wizard. In commercial editions with the Ad Hoc Editor and Dashboard Designer, the user session also stores the on-screen state of the report or dashboard that the browser user creates interactively.
There are two types of session management, and each handles failover in a different way:
– Replicated or persistent sessions: Session information is continuously stored at a shared location. Whenever a failure happens, the load balancer automatically redirects the user to another server node. Since that node also has access to all the session information, the switch happens seamlessly and the end-user experience is not affected.
– Sticky or pinned sessions: Each session is managed and accessed by a single server only, so when that server fails the session is lost. After the load balancer connects the user to another server, a new session is started.
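The difference between the two types can be illustrated with a small Python simulation. It is a toy model under our own assumptions, not server code: the Node class, the session fields and the node names are invented for the example. A dictionary shared by both nodes models replicated sessions, while a private dictionary per node models sticky sessions.

```python
class Node:
    """A toy server node holding session data in a dictionary."""

    def __init__(self, name, store=None):
        self.name = name
        # A shared dict models replicated sessions; a private dict models sticky ones.
        self.sessions = store if store is not None else {}

    def handle(self, session_id, user):
        """Return (session, is_new); create the session if this node has none."""
        is_new = session_id not in self.sessions
        if is_new:
            self.sessions[session_id] = {"user": user, "last_folder": None}
        return self.sessions[session_id], is_new


def failover_demo(shared):
    """Simulate node-a failing after the user has browsed to a folder."""
    store = {} if shared else None
    node_a, node_b = Node("node-a", store), Node("node-b", store)

    session, _ = node_a.handle("s1", "alice")
    session["last_folder"] = "/reports/sales"

    # node-a fails here; the load balancer sends the next request to node-b.
    session, is_new = node_b.handle("s1", "alice")
    return session["last_folder"], is_new


replicated = failover_demo(shared=True)
sticky = failover_demo(shared=False)
print(replicated)
print(sticky)
```

With the shared store, node-b finds the existing session and the last viewed folder survives the failover; with private stores, node-b has to start a new, empty session.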