Beginner’s Guide to OLTP and OLAP

Beginner’s Guide to OLTP and OLAP

In this blog we will be looking into the introduction of OLTP and OLAP. How it is defined and how we can relate it to real-time

Firstly, OLTP stands for On-line Transaction Processing and OLAP stands for On-line Analytical Processing

But to define OLTP and OLAP in the Simpliest of language is
OLTP is defined as “What is Happening” ie it deals with various transactions as it happens
and OLAP is defined as “What has happened” ie it deals with Stored Data.

OLTP is more of INSERT, DELETE, UPDATE, MODIFY i.e perform Day to Day Operations
But OLAP on the other hand is more of Fetching Data i.e Perform Analysis.

Take for Example a Supermarket Counter : the Counter receives payment for a particular product thus the event happening is a transaction which should be inputted into the Database
Hence it’s a OLTP.

But the OLAP is once the Data has been stored and the only Operation performed is Fetching the Data then it is said to be OLAP which can be used for
Business decision making and planning.

OLTP is a Relational Database and it’s 2-Dimensional in nature i.e Mathematically it is in the form of x and y axis
but OLAP on the other hand is a Multi-dimensional Database i.e Mathematically it is in the form of x, y, z, nth axis.

Now a question arises how is OLAP and OLTP represented? OLTP is represented in a Normalized Form and OLAP in a Star Schema or even at times in a Snow Flake Schema

OLAP Data- Structure is Designed in a Tree Format making it easy to Traverse.

Thanks,

Sohail

Pentaho Analysis: Workflow

OLAP Analysis: Workflow

 

To prepare data for use with the Pentaho/Jaspersoft OLAP Analysis (and Reporting, to a certain extent) client tools, you should follow this basic workflow :

Design a Star or Snowflake Schema

The entire process starts with a data warehouse.The end result should be data model in the star or snowflake schema pattern. You don’t have to worry too much about getting the model exactly right on your first try. Just cover all of your anticipated business needs; part of the process is coming back to the data warehouse design step and making changes to your initial data model after you’ve discovered what your operational needs are.

Populate the Star/Snowflake Schema

Once your data model is designed, the next step is to populate it with actual data, thereby creating your data warehouse. We can use  Pentaho Data Integration or Talend for ETL.

Build a Mondrian Schema

Now that your initial data warehouse project is complete, you must build a Mondrian schema to organize and describe it in terms that Pentaho/Jasper Analysis can understand. This is also accomplished through Pentaho Data Integration by using the Agile BI plugin. Just connect to your data warehouse and auto-populate your schema with the Modeler graphical interface. Alternatively, you can use Schema Workbench to create an analysis schema through a manual process.

Initial Testing

At this point you should have a multi-dimensional data structure with an appropriate metadata layer. You can now start using Pentaho Analyzer, Jasper Ad-hoc and JPivot to drill down into your data and see if your first attempt at data modeling was successful. In all likelihood, it will need some adjustment, so take note of all of the schema limitations that you’re unhappy with during this initial testing phase.

Do not be concerned with performance issues at this time – just concentrate on the completeness and comprehensiveness of the data model.

Adjust and Repeat Until Satisfied

Use the notes you took during the testing phase to redesign your data warehouse and Mondrian schema appropriately. Adjust hierarchies and relational measure aggregation methods. Create virtual cubes for analyzing multiple fact tables by conforming dimensions. Re-test the new implementation and continue to refine the data model until it matches your business needs perfectly.

Test for Performance

Once you’re satisfied with the design and implementation of your data model, you should try to find performance problems and address them by tuning your data warehouse database, and by creating aggregation tables. The testing can only be reasonably done by hand, using Pentaho Analyzer, Jaspersoft Ad-hoc or JPivot. Take note of all of the measures that take an unreasonably long time to calculate. Also, enable SQL logging and locate slow-performing queries, and build indexes for optimizing query performance.

Create Aggregation Tables

Using your notes as a guide, create aggregation tables using Pentaho Aggregation Designer or talend to store frequently computed analysis views. Re-test and create new aggregation tables as necessary.

If you are working with a relatively small data warehouse or a limited number of dimensions, you may not have a real need for aggregation tables. However, be aware of the possibility that performance issues may come up in the future.

Check in with your users occasionally to see if they have any particular concerns about the speed of their BI content.

Deploy to Production

Your data warehouse and Mondrian schema have been created, tested, and refined. You’re now ready to put it all into production. OLAP reports help users to do multi-dimensional and cross functional analysis, slice and dice, drill-up and drill-through the data. It gives power in the hands of business users to generate ad-hoc reports and get better insight of their business and and make better decisions based on OLAP analysis.

CUBES

 Cubes

The data structures used in the OLAP are multidimensional data cubes or OLAP cubes:

image1

An OLAP cube is a multidimensional database that is optimized for data warehouse and online analytical processing (OLAP) applications.

An OLAP cube is an array of data understood in terms of its 0 or more dimensions. OLAP is an acronym for online analytical processing. OLAP is a computer-based technique for analyzing business data in the search for business intelligence.

So, in simple it is the data presentation in multiple dimensions.

Example: A company needs survey of their products all over the world.

So for this it requires answer of some questions like

–          Sell in USA?

–          Sell in USA of different products?

–          Sell in USA of different products in last 6 month?

–          Sell in Other country and total sell?

–          What type of consumers attract by this product?

–          Product details, with sell?

…etc

In this case we need one master table/Fact table where we store values commonly used (like last month sell) and also the relation of other table by using foreign key (Customer detail , USA sell). In such a manner that data retrieve much faster as possible.

 

The OLAP cube consists of facts, also called measures, categorized by dimensions (it can be much more than 3 Dimensions; dimensions referred from Fact Table by “foreign keys”). Measures are derived from the records in the Fact Table and Dimensions are derived from the dimension tables, where each column represents one attribute (also called dictionary; dimension can have many attributes)

Facts and Measures

Fact is most detailed information that can be measured.

In simple it works like central control, which has information of whole cube like where we find which thing.

image2

OLAP Common operations include slice and dice, drill down, roll up, and pivot:

 

Slice:

A slice is a subset of a multi-dimensional array corresponding to a single value for one or more members of the dimensions not in the subset.

image3

 

Dice:

The dice operation is a slice on more than two dimensions of a data cube (or more than two consecutive slices).

image4

 

Drill Down/Up:

Drilling down or up is a specific analytical technique whereby the user navigates among levels of data ranging from the most summarized (up) to the most detailed (down).

image5

 

Roll-up:

(Aggregate, Consolidate) A roll-up involves computing all of the data relationships for one or more dimensions. To do this, a computational relationship or formula might be defined.

image6

 

Pivot:

This operation is also called rotate operation. It rotates the data in order to provide an alternative presentation of data – the report or page display takes a different dimensional orientation.

image7

 

Summary

OLAP cube is really helpful, when we are talking about billions of data. It is very easy and straightforward to translate business questions into multidimensional query.

By which we can really increase data performance as well as understandability of data in a better manner.

Reusability is also possible in cube, which make this more scalable.

Thanks for reading : Pushpraj Kumar ( Helical IT Solutions )

Understanding and Usage of Mondrian

This blog will be talking about the different layers of Mondrian engine, its components, introduction to ROLAP etc.

Mondrian Architecture

 

Layers of a Mondrian system

A Mondrian OLAP System consists of four layers; these are as follows:

  1. Presentation layer

The presentation layer determines what the end-user sees on his or her monitor, and how he or she can interact to ask new questions. There are many ways to present multidimensional datasets, including pivot tables, pie, line and bar charts, and advanced visualization tools such as clickable maps and dynamic graphics. These might be written in Swing or JSP, charts rendered in JPEG or GIF format, or transmitted to a remote application via XML. What all of these forms of presentation have in common is the multidimensional ‘grammar’ of dimensions, measures and cells in which the presentation layer asks the question is asked, and OLAP server returns the answer.

  1. Dimensional layer

The second layer is the dimensional layer. The dimensional layer parses, validates and executes MDX queries. A query is evaluated in multiple phases. The axes are computed first, then the values of the cells within the axes. For efficiency, the dimensional layer sends cell-requests to the aggregation layer in batches. A query transformer allows the application to manipulate existing queries, rather than building an MDX statement from scratch for each request. And metadata describes the dimensional model, and how it maps onto the relational model.

  1. Star layer

The third layer is the star layer, and is responsible for maintaining an aggregate cache. An aggregation is a set of measure values (‘cells’) in memory, qualified by a set of dimension column values. The dimensional layer sends requests for sets of cells. If the requested cells are not in the cache, or derivable by rolling up an aggregation in the cache, the aggregation manager sends a request to the storage layer.

  1. Storage layer.

The storage layer is an RDBMS. It is responsible for providing aggregated cell data, and members from dimension tables.

These components can all exist on the same machine, or can be distributed between machines. Layers 2 and 3, which comprise the Mondrian server, must be on the same machine. The storage layer could be on another machine, accessed via remote JDBC connection. In a multi-user system, the presentation layer would exist on each end-user’s machine (except in the case of JSP pages generated on the server).

ROLAP Defined

Pentaho Analysis is built on the Mondrian relational online analytical processing (ROLAP) engine. ROLAP relies on a multidimensional data model that, when queried, returns a dataset that resembles a grid. The rows and columns that describe and bring meaning to the data in that grid are dimensions, and the hard numerical values in each cell are the measures or facts. In Pentaho Analyzer, dimensions are shown in yellow and measures are in blue.

ROLAP requires a properly prepared data source in the form of a star or snowflake schema that defines a logical multidimensional database and maps it to a physical database model. Once you have your initial data structure in place, you must design a descriptive layer for it in the form of a Mondrian schema, which consists of one or more cubes, hierarchies, and members. Only when you have a tested and optimized Mondrian schema is your data prepared on a basic level for end-user tools like Pentaho Analyzer and JPivot.

 

Workflow Overview

To prepare data for use with the Pentaho Analysis (and Reporting, to a certain extent) client tools, you should follow this basic workflow:

Design a Star or Snowflake Schema

The entire process starts with a data warehouse.The end result should be data model in the star or snowflake schema pattern. You don’t have to worry too much about getting the model exactly right on your first try. Just cover all of your anticipated business needs; part of the process is coming back to the data warehouse design step and making changes to your initial data model after you’ve discovered what your operational needs are.

Populate the Star/Snowflake Schema

Once your data model is designed, the next step is to populate it with actual data, thereby creating your data warehouse. The best tool for this job is Pentaho Data Integration, an enterprise-grade extract, transform, and load (ETL) application.

Build a Mondrian Schema

Now that your initial data warehouse project is complete, you must build a Mondrian schema to organize and describe it in terms that Pentaho Analysis can understand. This is also accomplished through Pentaho Data Integration by using the Agile BI plugin. Just connect to your data warehouse and auto-populate your schema with the Modeler graphical interface. Alternatively, you can use Pentaho Schema Workbench to create an analysis schema through a manual process.

Initial Testing

At this point you should have a multi-dimensional data structure with an appropriate metadata layer. You can now start using Pentaho Analyzer and JPivot to drill down into your data and see if your first attempt at data modeling was successful. In all likelihood, it will need some adjustment, so take note of all of the schema limitations that you’re unhappy with during this initial testing phase.

Do not be concerned with performance issues at this time — just concentrate on the completeness and comprehensiveness of the data model.

Adjust and Repeat Until Satisfied

Use the notes you took during the testing phase to redesign your data warehouse and Mondrian schema appropriately. Adjust hierarchies and relational measure aggregation methods. Create virtual cubes for analyzing multiple fact tables by conforming dimensions. Re-test the new implementation and continue to refine the data model until it matches your business needs perfectly.

Test for Performance

Once you’re satisfied with the design and implementation of your data model, you should try to find performance problems and address them by tuning your data warehouse database, and by creating aggregation tables. The testing can only be reasonably done by hand, using Pentaho Analyzer and/or JPivot. Take note of all of the measures that take an unreasonably long time to calculate. Also, enable SQL logging and locate slow-performing queries, and build indexes for optimizing query performance.

Create Aggregation Tables

Using your notes as a guide, create aggregation tables in Pentaho Aggregation Designer to store frequently computed analysis views. Re-test and create new aggregation tables as necessary.

If you are working with a relatively small data warehouse or a limited number of dimensions, you may not have a real need for aggregation tables. However, be aware of the possibility that performance issues may come up in the future.

Check in with your users occasionally to see if they have any particular concerns about the speed of their BI content.

Deploy to Production

Your data warehouse and Mondrian schema have been created, tested, and refined. You’re now ready to put it all into production. You may need to personally train or purchase Pentaho training for anyone in your organization who needs to create traditional reports, dashboards, or Analyzer reports with Pentaho’s client tools.

Conditional Formatting in jaspersoft’s OLAP view (mondrian jPivot)

 Conditional Formatting in jaspersoft’s OLAP view (mondrian jPivot)

  1. Open {js-install}\apache-tomcat\webapps\jasperserver-pro\jpivot\table\mdxtable.css and {js-install}\apache-tomcat\webapps\jasperserver-pro\j\mdxtable.css.
  2. In these files find the following section

/* data cells */

td.cell-even,

td.cell-odd,

td.cell-red,

td.cell-yellow,

td.cell-green,

td.cell-rot,

td.cell-gelb,

td.cell-gruen,

td.property-value,

td.property-span {

font-family: Arial, Helvetica, Tahoma, sans-serif;

font-size: 80%;

/*color : Black;*/

color : #033669;

text-align: right;

/*border-bottom: solid 1px #000033;

border-right: solid 1px #000033;

border-top: solid 1px #000033;

border-left: solid 1px #000033;*/

}

  1. Add 3 styles for 3 different colors as shown below

/* data cells */

td.cell-even,

td.cell-odd,

td.cell-red,

td.cell-yellow,

td.cell-green,

td.cell-rot,

td.cell-gelb,

td.cell-gruen,

td.cell-below_avg_red,

td.cell-avg_orange,

td.cell-above_avg_green,

td.property-value,

td.property-span {

font-family: Arial, Helvetica, Tahoma, sans-serif;

font-size: 80%;

/*color : Black;*/

color : #033669;

text-align: right;

/*border-bottom: solid 1px #000033;

border-right: solid 1px #000033;

border-top: solid 1px #000033;

border-left: solid 1px #000033;*/

}

  1. Then a little down you will see some blocks that look like this. This one in particular is for green.td.cell-gruen,
    td.cell-green,
    span.cell-gruen,
    span.cell-green
    {
    background-color: #66FF66;
    }

    So, now just copy this one, and make it for 3 new styles added like this:

 td.cell-below_avg_red,

span.cell-below_avg_red

{

background-color: #FF0000;

}

 

td.cell-avg_orange,

span.cell-avg_orange

{

background-color: #FFBF00;

}

 td.cell-above_avg_green,

span.cell-above_avg_green

{

background-color: #04B404;

}

  1. Now, within the MDX query you can reference these styles as shown below:

 a)      with member [Measures].[Profit] as ‘([Measures].[Store Sales] – [Measures].[Store Cost])’, FORMAT_STRING = CASE WHEN (([Measures].[Profit] <= 300000.0) AND ([Measures].[Profit] > 100000.0)) THEN “|#|style=above_avg_green” WHEN (([Measures].[Profit] <= 100000.0) AND ([Measures].[Profit] > 50000.0)) THEN “|#|style=avg_orange” ELSE “|#|style=below_avg_red” END select {[Measures].[Store Sales], [Measures].[Profit]} ON COLUMNS, {[Product].CurrentMember.Children} ON ROWS from [Sales]

2

b)      with member [Measures].[Profit] as ‘([Measures].[Store Sales] – [Measures].[Store Cost])’, FORMAT_STRING = CASE WHEN (([Measures].[Profit] <= 101.0) AND ([Measures].[Profit] > 75.0)) THEN “|#|style=above_avg_green” WHEN (([Measures].[Profit] <= 75.0) AND ([Measures].[Profit] > 50.0)) THEN “|#|style=avg_orange” ELSE “|#|style=below_avg_red” END

select {([Time].[2012].[Q1], [Measures].[Profit]), ([Time].[2012].[Q2], [Measures].[Profit]), ([Time].[2012].[Q3], [Measures].[Profit]), ([Time].[2012].[Q4], [Measures].[Profit])} ON COLUMNS,

{[Product].[Drink].[Beverages].[Carbonated Beverages].[Soda].[Excellent].[Excellent Cola], [Product].[Drink].[Beverages].[Carbonated Beverages].[Soda].[Excellent].[Excellent Cream Soda], [Product].[Drink].[Beverages].[Carbonated Beverages].[Soda].[Excellent].[Excellent Diet Cola], [Product].[Drink].[Beverages].[Carbonated Beverages].[Soda].[Excellent].[Excellent Diet Soda], [Product].[Drink].[Beverages].[Pure Juice Beverages].[Juice].[Excellent].[Excellent Apple Juice], [Product].[Drink].[Beverages].[Pure Juice Beverages].[Juice].[Excellent].[Excellent Berry Juice], [Product].[Drink].[Beverages].[Pure Juice Beverages].[Juice].[Excellent].[Excellent Cranberry Juice], [Product].[Drink].[Beverages].[Pure Juice Beverages].[Juice].[Excellent].[Excellent Orange Juice]} ON ROWS

from [Sales]

1

Steps to migrate oracle with pentaho

Step 1:-

Run script as DB admin.

Script is available in location:- biserver-ce\data\oracle10g.

Modify configuration file:-

  1. applicationContext-spring-security-hibernate.properties.

Location:-

pentaho-solutions\system\applicationContext-spring-security-hibernate.properties.

original code:-

jdbc.driver=org.hsqldb.jdbcDriver

jdbc.url=jdbc:hsqldb:hsql://localhost:9001/hibernate

jdbc.username=hibuser

jdbc.password=password

hibernate.dialect=org.hibernate.dialect.HSQLDialect

Modified code:-

dbc.driver=oracle.jdbc.OracleDriver

jdbc.url=jdbc:oracle:thin:@localhost:1521/sysdba

jdbc.username=hibuser

jdbc.password=password

hibernate.dialect=org.hibernate.dialect.Oracle10gDialect

  1. hibernate-settings.xml

Location:- pentaho-solutions\system\hibernate\hibernate-settings.xml.

Original code

<config-file>system/hibernate/hsql.hibernate.cfg.xml</config-file>

Modified code:-

<config-file>system/hibernate/oracle10g.hibernate.cfg.xml</config-file>

oracle10g.hibernate.cfg.xml:-

Location:-

pentaho-solutions\system\hibernate\ oracle10g.hibernate.cfg.xml

Do not need to change any code in this file.. Just need to check everything is perfect or not

 

   <!– Oracle 10g Configuration –>

<property name=”connection.driver_class”>oracle.jdbc.driver.OracleDriver</property>

<property name=”connection.url”>jdbc:oracle:thin:@localhost:1521/sysdba

</property>

<property name=”dialect”>org.hibernate.dialect.Oracle10gDialect</property>

<property name=”connection.username”>hibuser</property>

<property name=”connection.password”>password</property>

<property name=”connection.pool_size”>10</property>

<property name=”show_sql”>false</property>

<property name=”hibernate.jdbc.use_streams_for_binary”>true</property>

<!– replaces DefinitionVersionManager –>

<property name=”hibernate.hbm2ddl.auto”>update</property>

<!– load resource from classpath –>

<mapping resource=”hibernate/oracle10g.hbm.xml” />

 

  1. quartz.properties:-

Location:-

pentaho-solutions\system\quartz\quartz.properties

Original Code

org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate

Modified Code:-

org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.oracle.OracleDelegate

4.context.xml:-

Location:-

tomcat\webapps\pentaho\META-INF\context.xml

Original Code

<Resource name=”jdbc/Hibernate” auth=”Container” type=”javax.sql.DataSource”

factory=”org.apache.commons.dbcp.BasicDataSourceFactory” maxActive=”20″ maxIdle=”5″

maxWait=”10000″ username=”hibuser” password=”password”

driverClassName=”org.hsqldb.jdbcDriver” url=”jdbc:hsqldb:hsql://localhost/hibernate

validationQuery=”select count(*) from INFORMATION_SCHEMA.SYSTEM_SEQUENCES” />

 

<Resource name=”jdbc/Quartz” auth=”Container” type=”javax.sql.DataSource”

factory=”org.apache.commons.dbcp.BasicDataSourceFactory” maxActive=”20″ maxIdle=”5″

maxWait=”10000″ username=”pentaho_user” password=”password”

driverClassName=”org.hsqldb.jdbcDriver” url=”jdbc:hsqldb:hsql://localhost/quartz

validationQuery=”select count(*) from INFORMATION_SCHEMA.SYSTEM_SEQUENCES”/>

 

Modified Code:-

<Resource validationQuery=”select 1 from dual”

url=” jdbc:oracle:thin:@localhost:1521/sysdba

driverClassName=”oracle.jdbc.OracleDriver” password=”password”

username=”hibuser” maxWait=”10000″ maxIdle=”5″ maxActive=”20″

factory=”org.apache.commons.dbcp.BasicDataSourceFactory”

type=”javax.sql.DataSource” auth=”Container” name=”jdbc/Hibernate”/>

                       

<Resource validationQuery=”select 1 from dual”

url=” jdbc:oracle:thin:@localhost:1521/sysdba

driverClassName=”oracle.jdbc.OracleDriver” password=”password”

username=”quartz” maxWait=”10000″ maxIdle=”5″ maxActive=”20″

factory=”org.apache.commons.dbcp.BasicDataSourceFactory”

type=”javax.sql.DataSource” auth=”Container” name=”jdbc/Quartz”/>

6. repository.xml

Location of the file: pentaho-solutions\system\jackrabbit\repository.xml.

Comment this code means (<! – – every thing here – -> )

Active means: Remove comment

i)                    FileSystem part

Comment this code

<FileSystem class=”org.apache.jackrabbit.core.fs.local.LocalFileSystem”>

     <param name=”path” value=”${rep.home}/repository”/>

</FileSystem>

Active this code:-

<FileSystem class=”org.apache.jackrabbit.core.fs.db.OracleFileSystem”>

   <param name=”url” value=”jdbc:oracle:thin:@localhost:1521“/>

   <param name=”user” value=”jcr_user”/>

   <param name=”password” value=”password”/>

   <param name=”schemaObjectPrefix” value=”fs_repos_”/>

   <param name=”tablespace” value=”pentaho_tablespace”/>

</FileSystem>

ii)                  DataStore part

Comment this code

<DataStore class=”org.apache.jackrabbit.core.data.FileDataStore”/>

Active this code:-

<DataStore class=”org.apache.jackrabbit.core.data.db.DbDataStore”>

<param name=”url” value=”jdbc:oracle:thin:@localhost:1521/sysdba”/>

<param name=”driver” value=”oracle.jdbc.OracleDriver”/>

<param name=”user” value=”jcr_user”/>

<param name=”password” value=”password”/>

<param name=”databaseType” value=”oracle”/>

<param name=”minRecordLength” value=”1024″/>

<param name=”maxConnections” value=”3″/>

<param name=”copyWhenReading” value=”true”/>

<param name=”tablePrefix” value=””/>

<param name=”schemaObjectPrefix” value=”ds_repos_”/>

</DataStore>

iii)                Security part in the FileSystem Workspace part

Comment this code:-

<FileSystem class=”org.apache.jackrabbit.core.fs.local.LocalFileSystem”>

<param name=”path” value=”${wsp.home}”/>

</FileSystem>

Active this code:-

<FileSystem class=”org.apache.jackrabbit.core.fs.db.OracleFileSystem”>

<param name=”url” value=”jdbc:oracle:[email protected]:1521/sysdba”/>

<param name=”user” value=”jcr_user”/>

<param name=”password” value=”password”/>

<param name=”schemaObjectPrefix” value=”fs_ws_”/>

<param name=”tablespace” value=”pentaho_tablespace”/>

</FileSystem>

iv)       PersistenceManager part

Comment this code:-

<PersistenceManager class=”org.apache.jackrabbit.core.persistence.pool.H2PersistenceManager”>

<param name=”url” value=”jdbc:h2:${wsp.home}/db”/>

<param name=”schemaObjectPrefix” value=”${wsp.name}_”/>

</PersistenceManager>

Active This Code:-

<PersistenceManager class=”org.apache.jackrabbit.core.persistence.bundle.OraclePersistenceManager”>

<param name=”url” value=”jdbc:oracle:thin:@localhost:1521/sysdba”/>

<param name=”driver” value=”oracle.jdbc.OracleDriver”/>

<param name=”user” value=”jcr_user”/>

<param name=”password” value=”password”/>

<param name=”schema” value=”oracle”/>

<param name=”schemaObjectPrefix” value=”${wsp.name}_pm_ws_”/>

<param name=”tablespace” value=”pentaho_tablespace”/>

</PersistenceManager>

v)       FileSystem Versioning part

Comment This Code:-

<FileSystem class=”org.apache.jackrabbit.core.fs.local.LocalFileSystem”>

<param name=”path” value=”${rep.home}/version” />

</FileSystem>

Active This Code:-

<PersistenceManager class=”org.apache.jackrabbit.core.persistence.bundle.OraclePersistenceManager”>

<param name=”url” value=”jdbc:oracle:thin:@localhost:1521/sysdba”/>

<param name=”driver” value=”oracle.jdbc.OracleDriver”/>

<param name=”user” value=”jcr_user”/>

<param name=”password” value=”password”/>

<param name=”schema” value=”oracle”/>

<param name=”schemaObjectPrefix” value=”pm_ver_”/>

<param name=”tablespace” value=”pentaho_tablespace”/>

</PersistenceManager>

 

 

Stopping HSQL db start up

In web.xml file

Comment or delete this code (Commenting is preferable)

<!– [BEGIN HSQLDB DATABASES] –>

<context-param>

<param-name>hsqldb-databases</param-name>

<param-value>[email protected]/../data/hsqldb/sampledata,[email protected]/../data/hsqldb/hibernate,[email protected]/../data/hsqldb/quartz</param-value>

</context-param>

<!– [END HSQLDB DATABASES] –>

 

Also comment this code

<!– [BEGIN HSQLDB STARTER] –>

<listener>

<listener-class>org.pentaho.platform.web.http.context.HsqldbStartupListener</listener-class>

</listener>

<!– [END HSQLDB STARTER] –>

You have done with integrating pentaho 5.0.1 CE with Oracle

Now login to the Pentaho server .

URL:  http://localhost:8080/pentaho

Username/Password : Admin/password

Date Calculation using Joda Time Library In iReport

Dear readers, this blog talks about Date Calculation using Joda Time Library In iReport

By using Joda Time Library , we can give default value expression for any datetype parameter .

Steps for using joda time library in iReport  :

  1. Download joda time library ,

2. Put its jar file “joda-time.jar” in the path of iReport : Tools/Options/Classpath/.

joda ireport

3. In the main report section properties , set import property to “org.joda.time.DateTime

Without  double quotes.

joda ireport 2

4. Now with the above configuration , we will be able to do any type of date calculation

 

5. Example :

   a. Start Day Of Current Month :

Take two parameters : now and startday respectively

For ‘now’ parameter give its class name as ‘java.util.Date’ and default value expression as new DateTime.now().dayOfMonth().withMinimumValue().withTimeAtStartOfDay()

 

b. Last day of previous Month :

Take two parameters : now and endday respectively

For ‘now’ parameter give its class name as ‘java.util.Date’ and default value expression as new DateTime.now().minusMonths(1).dayOfMonth().withMaximumValue()

In the similar way we can do any type of date calculation for default value expression.

Rupam Bhardwaj

Helical IT Solutions

HOW TO CREATE LOCK FILE IN C

 

File locking provides a very simple yet incredibly useful mechanism for coordinating file accesses. Before I begin to lay out the details, let me fill you in on some file locking secrets:-

There are two types of locking mechanisms: mandatory and advisory.

– Mandatory systems will actually prevent read() and write() to file.

– There are two types of (advisory!) locks: read locks and write locks (also referred to as shared locks and exclusive locks, respectively.)

 

Here I am demonstrating READ-LOCK …

In  my  case , I want  to reduce CPU usage which  takes very  high CPU usage % in case of accessing Sqlite3  by  two  application simultaneously.

To solve  this problem  I  applied  lock  file  concept  to synchronizes  applications ,

I applied it for three different applications.

 

Steps followed while creating lock file:-

 

– Finalize  File Name (xyz.lock)

– Finalize  lock file  location

– Before running  application , check lock file is  present or not

if(getlock(lockfile)== 1) :: Lock file is not present

– If lock file is not present  then create  lock file

int lockstatus = setlock(lockfile);

– After  completion  of job  delete lock file

 

Removelock (lockfile)

– If  lock file is  present  then sleep  for  sometime then again check for lock file

– While creating lock file, handle unexpected application closing mechanism to avoid trouble when next time application needs to be run.

 

Note: Extension of lock file is (abc.lock)

Note: Remove lock file when application is forcefully terminating (ctrl + C)

 

(Example of lock file in c using semaphores : –  )

Reference: http://www.c-program-example.com/2012/04/c-program-to-lock-file-using-semaphores.html

Differences between Reporting and Analysis – Concepts

Differences between Reporting and Analysis – Concepts

Definition

       1.The process of orga­niz­ing data into infor­ma­tional sum­maries in order to mon­i­tor how dif­fer­ent areas of a busi­ness are per­form­ing.

Definition

        1.The process of explor­ing data and reports in order to extract mean­ing­ful insights, which can be used to bet­ter under­stand and improve busi­ness performance.

       2.Report­ing trans­lates raw data into infor­ma­tion

        2.Analy­sis trans­forms data and infor­ma­tion into insights.

       3.Good report­ing should raise ques­tions about the busi­ness from its end users

        3.The goal of analy­sis is to answer ques­tions by inter­pret­ing the data at a deeper level and pro­vid­ing action­able rec­om­men­da­tions

      4.Report­ing shows you what is hap­pen­ing

      4.Analy­sis focuses on explain­ing why it is hap­pen­ing and what you can do about it.

Tasks

       5.Reporting: Build­ing, con­fig­ur­ing, con­sol­i­dat­ing, orga­niz­ing, for­mat­ting, and sum­ma­riz­ing.

        5.Analy­sis:  focuses on dif­fer­ent tasks such as ques­tion­ing, exam­in­ing, inter­pret­ing, com­par­ing, and con­firm­ing.

Outputs/Approach

       6.Report­ing fol­lows a push approach

         6.Analy­sis fol­lows a pull approach

       7.Reports are pushed to users who are then expected to extract mean­ing­ful insights and take   appro­pri­ate actions for them­selves (i.e.,     self-serve).


8.Types

Dashboard: These custom-made reports com­bine dif­fer­ent KPIs and reports to pro­vide a com­pre­hen­sive, high-level view of busi­ness per­for­mance for spe­cific audi­ences. Dash­boards may include data from var­i­ous data sources and are also usu­ally fairly static.

7.Par­tic­u­lar data is pulled by an ana­lyst in order to answer spe­cific busi­ness ques­tions.

8.Types
Ad hoc responses
Analy­sis pre­sen­ta­tions

Approach of Reporting and Analysis

image

Pentaho 5.0.1 CE integration with MySQL 5.0.1 CE (Windows or Linux family)

Pentaho 5.0.1 CE integration with MySQL 5.0.1 CE (Windows or Linux )

Parts

  1. Creating databases
  2. Modifying configuration files
  3. Stopping HSQL db start up

Creating databases

Command to execute the scripting files

mysql>source  D:\ biserver-ce\data\mysql5\create_jcr_mysql.sql

Similarly execute the remaining .sql files(i.e, execute create_quartz_mysql.sql and create_repository_mysql.sql)

  1. Check the databases created using show databases command on MySQL command prompt.

 

Modifying configuration files

1. applicationContext-spring-security-hibernate.properties.

Edit the file pentaho-solutions\system\applicationContext-spring-security-hibernate.properties.

Original code

jdbc.driver=org.hsqldb.jdbcDriver

jdbc.url=jdbc:hsqldb:hsql://localhost:9001/hibernate

jdbc.username=hibuser

jdbc.password=password

hibernate.dialect=org.hibernate.dialect.HSQLDialect

Modified code

jdbc.driver=com.mysql.jdbc.Driver

jdbc.url=jdbc:mysql://localhost:3306/hibernate

jdbc.username=hibuser

jdbc.password=password

hibernate.dialect=org.hibernate.dialect.MySQLDialect

  2. hibernate-settings.xml

Edit the file pentaho-solutions\system\hibernate\hibernate-settings.xml.

Original code

<config-file>system/hibernate/hsql.hibernate.cfg.xml</config-file>

Modified code

<config-file>system/hibernate/mysql5.hibernate.cfg.xml</config-file>

 

3. mysql5.hibernate.cfg.xml

Location of the file: pentaho-solutions\system\hibernate\mysql5.hibernate.cfg.xml

Do need to change any code in this file.. Just need to check everything is perfect or not

<property name="connection.driver_class">com.mysql.jdbc.Driver</property>

<property name="connection.url">jdbc:mysql://localhost:3306/hibernate</property>

<property name="dialect">org.hibernate.dialect.MySQL5InnoDBDialect</property>

<property name="connection.username">hibuser</property>

<property name="connection.password">password</property>

4. quartz.properties

Location of the file: pentaho-solutions\system\quartz\quartz.properties

Original Code

org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate

Modified Code

org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate

5. context.xml

Location of the file: tomcat\webapps\pentaho\META-INF\context.xml

Original Code

<Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"

factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"

maxWait="10000" username="hibuser" password="password"

driverClassName="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:hsql://localhost/hibernate"

validationQuery="select count(*) from INFORMATION_SCHEMA.SYSTEM_SEQUENCES" />

 

<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"

factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"

maxWait="10000" username="pentaho_user" password="password"

driverClassName="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:hsql://localhost/quartz"

validationQuery="select count(*) from INFORMATION_SCHEMA.SYSTEM_SEQUENCES"/>

Modified Code

<Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"

factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"

maxWait="10000" username="hibuser" password="password"

driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/hibernate"

validationQuery="select 1" />

 

<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"

factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"

maxWait="10000" username="pentaho_user" password="password"

driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/quartz"

validationQuery="select 1"/>

 Imp Note:

Delete pentaho.xml file in below location

tomcat\conf\Catalina\localhost\pentaho.xml

Reason:

Pentaho creates  on startup pentaho.xml as a copy of context.xml

6. repository.xml

Location of the file: pentaho-solutions\system\jackrabbit\repository.xml.

Comment this code means (<! – – every thing here – -> )

Active means: Remove comment

i)                    FileSystem part

Comment this code

<FileSystem>

<param name=”path” value=”${rep.home}/repository”/>

</FileSystem>

Active this code

<FileSystem>

<param name="driver" value="com.mysql.jdbc.Driver"/>

<param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>

<param name="user" value="jcr_user"/>

<param name="password" value="password"/>

<param name="schema" value="mysql"/>

<param name="schemaObjectPrefix" value="fs_repos_"/>

</FileSystem>

ii)                  DataStore part

Comment this code

<DataStore/>

Active this code

<DataStore class="org.apache.jackrabbit.core.data.db.DbDataStore">

   <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>

   <param name="user" value="jcr_user"/>

   <param name="password" value="password"/>

   <param name="databaseType" value="mysql"/>

   <param name="driver" value="com.mysql.jdbc.Driver"/>

   <param name="minRecordLength" value="1024"/>

   <param name="maxConnections" value="3"/>

   <param name="copyWhenReading" value="true"/>

   <param name="tablePrefix" value=""/>

   <param name="schemaObjectPrefix" value="ds_repos_"/>

 </DataStore>

iii)                Security part in the FileSystem Workspace part

Comment this code

<FileSystem>

<param name=”path” value=”${wsp.home}”/>

</FileSystem>

   Active this code

<FileSystem>

<param name="driver" value="com.mysql.jdbc.Driver"/>

<param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>

<param name="user" value="jcr_user"/>

<param name="password" value="password"/>

<param name="schema" value="mysql"/>

<param name="schemaObjectPrefix" value="fs_ws_"/>

</FileSystem>

iv)                PersistenceManager part

Comment this code

<PersistenceManager>

<param name=”url” value=”jdbc:h2:${wsp.home}/db”/>

<param name=”schemaObjectPrefix” value=”${wsp.name}_”/>

</PersistenceManager>

Active this code

<PersistenceManager>

<param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>

<param name="user" value="jcr_user" />

<param name="password" value="password" />

<param name="schema" value="mysql"/>

<param name="schemaObjectPrefix" value="${wsp.name}_pm_ws_"/>

</PersistenceManager>

v)                  FileSystem Versioning part

Comment this code

<FileSystem>

<param name=”path” value=”${rep.home}/version” />

</FileSystem>

 

Active this code

<FileSystem>

<param name="driver" value="com.mysql.jdbc.Driver"/>

<param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>

<param name="user" value="jcr_user"/>

<param name="password" value="password"/>

<param name="schema" value="mysql"/>

<param name="schemaObjectPrefix" value="fs_ver_"/>

</FileSystem>

vi)                PersistenceManager Versioning part

 

Comment this code:

 

<PersistenceManager>

<param name=”url” value=”jdbc:h2:${rep.home}/version/db”/>

<param name=”schemaObjectPrefix” value=”version_”/>

</PersistenceManager>

Active this code:

<PersistenceManager>

<param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>

<param name="user" value="jcr_user" />

<param name="password" value="password" />

<param name="schema" value="mysql"/>

<param name="schemaObjectPrefix" value="pm_ver_"/>

</PersistenceManager>

Stopping HSQL db start up

In web.xml file

Comment or delete this code (Commenting is preferable)

<!– [BEGIN HSQLDB DATABASES] –>

<context-param>

<param-name>hsqldb-databases</param-name>

<param-value>[email protected]/../data/hsqldb/sampledata,[email protected]/../data/hsqldb/hibernate,[email protected]/../data/hsqldb/quartz</param-value>

</context-param>

<!– [END HSQLDB DATABASES] –>

 

Also comment this code

<!– [BEGIN HSQLDB STARTER] –>

<listener>

<listener-class>org.pentaho.platform.web.http.context.HsqldbStartupListener</listener-class>

</listener>

<!– [END HSQLDB STARTER] –>

 

 

You have done with integrating pentaho 5.0.1 CE with Mysql 5.5

Now login to the Pentaho server .

URL:  http://localhost:8080/pentaho

Username/Password : Admin/password

NOTE:

  • You will not find any sample working. B’z you have not installed sample data.
  • Example available in pentaho are developed on sample data so you need to execute and give the new sample data base connections(for this you need to execute .sql file of sample data).

Helical IT Solutions