Pentaho BI and Ctools – Introduction

Pentaho Overview:
Pentaho is a suite of open source Business Intelligence (BI) products. Its business analytics offering provides data integration, OLAP services, reporting, dashboarding, data mining, ETL capabilities and a BI platform. Like Jaspersoft, it is a web application and runs on a Tomcat server.

Pentaho BI Platform:
The Pentaho BI Platform supports Pentaho's end-to-end business intelligence capabilities and provides central access to your business information, with back-end security, integration, scheduling, auditing and more. The platform has been designed so that it can scale to meet the needs of an organization of any size.

OLAP stands for Online Analytical Processing. OLAP is an approach to answering multi-dimensional analytical queries swiftly.
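To make "multi-dimensional query" concrete, here is a sketch of an MDX query, the query language used by OLAP engines such as Mondrian. The cube, measure and member names below are purely illustrative, not from any real schema:

```mdx
SELECT
  [Measures].[Sales] ON COLUMNS,
  [Time].[Years].Members ON ROWS
FROM [SalesCube]
WHERE [Region].[Europe]
```

A single statement like this slices a measure (Sales) across one dimension (Time) while filtering on another (Region), which is the kind of question OLAP is designed to answer quickly.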

Dashboarding:-
Pentaho Dashboards give business users the critical information they need to understand and improve organizational performance. They provide immediate insight into individual, departmental, or enterprise performance by delivering key metrics in an attractive and intuitive visual interface.

ETL (Extract Transform Load):-
ETL systems are commonly used to integrate data from multiple applications, typically developed and supported by different vendors or hosted on separate computer hardware.

Data integration:
Data integration involves combining data residing in different sources and providing users with a unified view of these data.

Pentaho includes many tools; the main ones are elaborated below.
Pentaho Tools -> Pentaho's main client tools are:

1) PRD – Pentaho Report Designer (Reporting)
Transform all your data into meaningful information tailored to your audience with Pentaho Reporting, a suite of open source tools that lets you create pixel-perfect reports of your data in PDF, Excel, HTML, text, RTF, XML and CSV formats. These computer-generated reports easily refine data from various sources into a human-readable form.

2) PSW – Pentaho Schema Workbench or Mondrian Schema Workbench
The Mondrian Schema Workbench is a designer interface that allows you to create and test Mondrian OLAP cube schemas visually. The Mondrian engine processes MDX requests with the ROLAP (Relational OLAP) schemas. These schema files are XML metadata models that are created in a specific structure used by the Mondrian engine. These XML models can be considered cube-like structures which utilize existing FACT and DIMENSION tables found in your RDBMS. It does not require that an actual physical cube is built or maintained; only that the metadata model is created.
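To make the idea of such an XML metadata model concrete, here is a minimal sketch of a Mondrian schema mapping a fact table and one dimension table to a cube. All table, column and cube names here are hypothetical:

```xml
<Schema name="SalesSchema">
  <Cube name="Sales">
    <!-- FACT table in the RDBMS -->
    <Table name="fact_sales"/>
    <!-- DIMENSION table joined via a foreign key -->
    <Dimension name="Region" foreignKey="region_id">
      <Hierarchy hasAll="true" primaryKey="region_id">
        <Table name="dim_region"/>
        <Level name="Country" column="country"/>
        <Level name="City" column="city"/>
      </Hierarchy>
    </Dimension>
    <!-- Measure aggregated from a fact-table column -->
    <Measure name="Sales Amount" column="amount" aggregator="sum"/>
  </Cube>
</Schema>
```

The Mondrian engine reads this file at query time and translates MDX requests into SQL against the underlying tables; no physical cube is ever materialized.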

3) PME – Pentaho Metadata Editor
Pentaho Metadata Editor (PME) is a tool that allows you to build Pentaho metadata domains and relational data models. A Pentaho Metadata Model maps the physical structure of your database into a logical business model.

4) PDI – Pentaho Data Integration
Pentaho Data Integration prepares and blends data to create a complete picture of your business that drives actionable insights.

Inside the Server -> The Pentaho server also hosts several tools:
1) C-Tools -> CDA, CDF, CDE
C-Tools -> Used to create dashboards.
CDA : Community Data Access
CDA (Community Data Access) is a community project that allows you to fetch data in various formats from the Pentaho BI platform. It is accessed through simple URL calls and supports the following data sources:
1)      SQL
2)      MDX
3)      Metadata
4)      Kettle
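For example, a CDA data source is defined in a .cda file on the server. The sketch below shows what a minimal descriptor with a SQL query over a JNDI connection might look like; the connection name, query and id are illustrative assumptions, not from any real project:

```xml
<CDADescriptor>
  <DataSources>
    <!-- JNDI connection defined on the Pentaho server -->
    <Connection id="1" type="sql.jndi">
      <Jndi>SampleData</Jndi>
    </Connection>
  </DataSources>
  <!-- A named query exposed by this CDA file -->
  <DataAccess id="salesQuery" connection="1" type="sql" access="public">
    <Query>SELECT customer, SUM(total) FROM orders GROUP BY customer</Query>
  </DataAccess>
</CDADescriptor>
```

The result of such a query can then be fetched over HTTP by pointing CDA's doQuery endpoint at the file's path and the dataAccessId (the exact URL form varies by Pentaho version).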

CDF : Community Dashboard Framework
Community Dashboard Framework (CDF) is a project that allows you to create friendly, powerful and fully featured Dashboards on top of the Pentaho Business Intelligence server.
CDE : Community Dashboard Editor
CDE and the technologies underneath it (CDF, CDA and CCC) allow the development and deployment of Pentaho dashboards in a fast and effective way.

2) Saiku Analytics :- Used to explore and visualize data interactively.


Please get in touch with us for any Pentaho-related queries, Helical IT Solutions.

How to untar / Extract a TAR file using Java


In this post we will explain how to extract the contents of a TAR file in Java. To decompress the TAR file we will use the Apache Commons Compress library, so make sure you have a copy of this library (commons-compress-1.4.1.jar) on your classpath. You will also need the Apache Commons IO library (commons-io-2.4.jar) on your classpath, as we will use it to write each file extracted from the TAR archive to disk.

1) Use TarArchiveInputStream to read a TAR file as an InputStream. Once you have the TAR file wrapped in an object of this type, you can easily start processing the entries in the file.


import java.io.*;
import org.apache.commons.compress.archivers.tar.*;
import org.apache.commons.compress.compressors.gzip.*;


File tarFile = new File("c:/test.tar");
File dest = new File("c:/temp/");
dest.mkdirs();

// Wrap the archive in a gzip decompressor and a TAR archive reader.
TarArchiveInputStream tarIn = new TarArchiveInputStream(
        new GzipCompressorInputStream(
                new BufferedInputStream(
                        new FileInputStream(tarFile))));

// tarIn is a TarArchiveInputStream
TarArchiveEntry tarEntry = tarIn.getNextTarEntry();
while (tarEntry != null) {
    // create a file with the same name as the tarEntry
    File destPath = new File(dest, tarEntry.getName());
    System.out.println("working: " + destPath.getCanonicalPath());
    if (tarEntry.isDirectory()) {
        destPath.mkdirs();
    } else {
        destPath.getParentFile().mkdirs();
        byte[] btoRead = new byte[2048];
        BufferedOutputStream bout =
                new BufferedOutputStream(new FileOutputStream(destPath));
        int len;
        // copy the entry's bytes to disk in 2 KB chunks
        while ((len = tarIn.read(btoRead)) != -1) {
            bout.write(btoRead, 0, len);
        }
        bout.close();
        btoRead = null;
    }
    tarEntry = tarIn.getNextTarEntry();
}
tarIn.close();


Finally, you close all the output streams and files you opened, and that completes the program.

Thanks & Regards,

Vishwanth S

Senior ETL Developer.