Map Reduce In MongoDB


Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results.

For map-reduce operations, MongoDB provides the mapReduce database command.

The mapReduce command allows you to run map-reduce aggregation operations over a collection. The mapReduce command has the following prototype form:


db.runCommand(
     {
               mapReduce: <collection>,
               map: <function>,
               reduce: <function>,
               finalize: <function>,
               out: <output>,
               query: <document>,
               sort: <document>,
               limit: <number>,
               scope: <document>,
               verbose: <boolean>
     }
)

 

Pass the name of the collection (i.e. <collection>) to the mapReduce command; it is used as the source of documents for the map-reduce operation.

The command also accepts the following parameters:

  • mapReduce: The name of the collection on which you want to perform map-reduce. This collection will be filtered using query before being processed by the map function.
  • map: A JavaScript function that associates or "maps" a value with a key and emits the key and value pair.
  • reduce: A JavaScript function that "reduces" to a single object all the values associated with a particular key.
  • out: Specifies where to output the result of the map-reduce operation. You can either output to a collection or return the result inline.
  • query: Optional. Specifies the selection criteria, using query operators, for determining the documents input to the map function.
  • sort: Optional. Sorts the input documents. This option is useful for optimization. For example, specify the sort key to be the same as the emit key so that there are fewer reduce operations. The sort key must be in an existing index for this collection.
  • limit: Optional. Specifies a maximum number of documents for the input into the map function.
  • finalize: Optional. Follows the reduce method and modifies the output.
  • scope: Optional. Specifies global variables that are accessible in the map, reduce and finalize functions.
  • verbose: Optional. Specifies whether to include timing information in the result. verbose defaults to true, which includes the timing information.

 

The following is a prototype usage of the mapReduce command:

var mapFunction = function() { ... };
var reduceFunction = function(key, values) { ... };
db.runCommand(
{
      mapReduce: <input-collection>,
      map: mapFunction,
      reduce: reduceFunction,
      out: { merge: <output-collection> },
      query: <query>
}
)
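
To make the prototype concrete, here is a minimal sketch assuming a hypothetical orders collection with cust_id, status and amount fields (these names are illustrative and not part of the original example):

var mapFunction = function () {
    // emit one (customer, amount) pair per matching order
    emit(this.cust_id, this.amount);
};

var reduceFunction = function (key, values) {
    // sum all amounts emitted for the same customer (Array.sum is a mongo shell helper)
    return Array.sum(values);
};

db.runCommand({
    mapReduce: "orders",
    map: mapFunction,
    reduce: reduceFunction,
    query: { status: "A" },          // only documents matching this filter reach the map function
    out: { merge: "order_totals" }   // merge the results into the order_totals collection
});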

Requirements for the Map Function
The map function is responsible for transforming each input document into zero or more documents. It can access the variables defined in the scope parameter, and has the following prototype:

function(){
     ...
     emit(key,value);
}

The map function has the following requirements:

  • In the map function, reference the current document as this within the function.
  • The map function should not access the database for any reason.
  • The map function should be pure, i.e. have no impact outside of the function (no side effects).
  • A single emit can only hold half of MongoDB's maximum BSON document size.
  • The map function may optionally call emit(key, value) any number of times to create an output document associating key with value.

The following map function will call emit(key,value) either 0 or 1 times depending on the value of the input document’s status field:

function(){   
   if(this.status=='A')       
      emit(this.cust_id,1);
}

The following map function may call emit(key,value) multiple times depending on the number of elements in the input document’s items field:

function() {
    this.items.forEach(function(item) { emit(item.sku, 1); });
}

Requirements for the Reduce Function
The reduce function has the following prototype:


     function(key,values){
         ...
         return result;
}

The reduce function exhibits the following behaviors:

  • The reduce function should not access the database, even to perform read operations.
  • The reduce function should not affect the outside system.
  • MongoDB will not call the reduce function for a key that has only a single value. The values argument is an array whose elements are the value objects that are "mapped" to the key.
  • MongoDB can invoke the reduce function more than once for the same key. In this case, the previous output from the reduce function for that key will become one of the input values to the next reduce function invocation for that key.
  • The reduce function can access the variables defined in the scope parameter.
  • The inputs to reduce must not be larger than half of MongoDB's maximum BSON document size. This requirement may be violated when large documents are returned and then joined together in subsequent reduce steps.

Because it is possible to invoke the reduce function more than once for the same key, the following properties need to be true:

  • the type of the return object must be identical to the type of the value emitted by the map function.
  • the reduce function must be associative. The following statement must be true:
    reduce(key, [C, reduce(key, [A, B])]) == reduce(key, [C, A, B])
  • the reduce function must be idempotent. Ensure that the following statement is true:
    reduce(key, [reduce(key, valuesArray)]) == reduce(key, valuesArray)
  • the reduce function should be commutative: that is, the order of the elements in the valuesArray should not affect the output of the reduce function, so that the following statement is true:
    reduce(key, [A, B]) == reduce(key, [B, A])
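
For instance, a reduce function that simply sums the emitted values (as sketched earlier) satisfies all three properties. A quick mongo shell check, assuming numeric values:

var reduceFunction = function (key, values) { return Array.sum(values); };

// associative:  reduce(k, [C, reduce(k, [A, B])]) == reduce(k, [C, A, B])
reduceFunction("k", [3, reduceFunction("k", [1, 2])]) == reduceFunction("k", [3, 1, 2]);   // true
// idempotent:   reduce(k, [reduce(k, values)]) == reduce(k, values)
reduceFunction("k", [reduceFunction("k", [1, 2, 3])]) == reduceFunction("k", [1, 2, 3]);   // true
// commutative:  reduce(k, [A, B]) == reduce(k, [B, A])
reduceFunction("k", [1, 2]) == reduceFunction("k", [2, 1]);                                // true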

Requirements for the finalize Function

The finalize function has the following prototype:

 function(key,reducedValue){
          ...
          return modifiedObject;
}

The finalize function receives as its arguments a key value and the reducedValue from the reduce function. Be aware that:

  • The finalize function should not access the database for any reason.
  • The finalize function should be pure, i.e. have no impact outside of the function (no side effects).
  • The finalize function can access the variables defined in the scope parameter.
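
As an illustration (not taken from the original post), finalize is often used to derive a value that cannot be computed incrementally, such as an average. The sketch below assumes the map and reduce functions emit and return objects of the form { count: ..., qty: ... }:

var finalizeFunction = function (key, reducedValue) {
    // reducedValue is assumed to look like { count: <n>, qty: <total> }
    reducedValue.avg = reducedValue.qty / reducedValue.count;
    return reducedValue;
};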

out Options

You can specify the following options for the out parameter:

Output to a Collection

This option outputs to a new collection, and is not available on secondary members of replica sets.

out:<collectionName>
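
For comparison, here is a minimal sketch of writing the result to a named collection versus returning it inline, reusing the hypothetical orders collection and map/reduce functions sketched earlier (again assumptions, not part of the original post):

// Output to (and replace) the collection order_totals.
db.runCommand({ mapReduce: "orders", map: mapFunction, reduce: reduceFunction, out: "order_totals" });

// Return the result set inline; the whole result must fit within the BSON document size limit.
db.runCommand({ mapReduce: "orders", map: mapFunction, reduce: reduceFunction, out: { inline: 1 } });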

Map-Reduce Examples:

Consider two collections (tables) named:

  • Employee
  • Department

Now, to create the collections in MongoDB, use the queries below:

db.createCollection("Employee")
db.createCollection("Department")

To insert data in the Employee collection:

db.Employee.insert({"name": {"first": "John", "last": "Backus"}, "city": "Hyd", "department": 1})

db.Employee.insert({"name": {"first": "Merry", "last": "Desuja"}, "city": "Pune", "department": 2})

To insert data in the Department collection:

db.Department.insert({"_id": 1, "department": "Manager"})

db.Department.insert({"_id": 2, "department": "Accountant"})

Now the requirement is to display FirstName, LastName and DepartmentName.

For this, we need to use map-reduce.

Create two map functions, one for each collection.

//map function for Employee
var mapEmployee = function () {
    var output = {departmentid: this.department, firstname: this.name.first, lastname: this.name.last, department: null};
    emit(this.department, output);
};

//map function for Department
var mapDepartment = function () {
    var output = {departmentid: this._id, firstname: null, lastname: null, department: this.department};
    emit(this._id, output);
};

Write Reduce Logic to display the required fields :



var reduceF = function(key, values) {
    var outs = {firstname: null, lastname: null, department: null};
    values.forEach(function(v) {
        if (outs.firstname == null) { outs.firstname = v.firstname; }
        if (outs.lastname == null) { outs.lastname = v.lastname; }
        if (outs.department == null) { outs.department = v.department; }
    });
    return outs;
};

Store the result into a different collection called emp_dept_test


result = db.Employee.mapReduce(mapEmployee, reduceF, {out: {reduce: 'emp_dept_test'}})
result = db.Department.mapReduce(mapDepartment, reduceF, {out: {reduce: 'emp_dept_test'}})

Write the following command to get the combined result:

db.emp_dept_test.find()

The output of the query gives the combined result:


{
    "_id" : 1,
    "value" : {
        "firstname" : "John",
        "lastname" : "Backus",
        "department" : "Manager"
    }
}

{
    "_id" : 2,
    "value" : {
        "firstname" : "Merry",
        "lastname" : "Desuja",
        "department" : "Accountant"
    }
}

 

-By
Nitin Uttarwar
Helical IT Solutions

Map Reduce in MongoDB


This blog will teach you how to write map-reduce in MongoDB.

Map-reduce is a concept that processes large volumes of data into aggregated results.

To use the map-reduce concept in MongoDB, use the command called "mapReduce".

The mapReduce() function fetches data from a collection (table) and produces the result set in (key, value) format.

Then the reduce() function takes the (key, value) pairs and reduces all the data (documents) with the same key.

E.g.: Let's say I have two collections (tables) named:

  1. Emp_test
  2. Dept_Test

Now, to create the collections in MongoDB, use the queries below:

db.createCollection("Emp_test")

db.createCollection("Dept_Test")

To insert data in Emp_test Collection:

db.Emp_test.insert({"name": {"first": "ABC", "last": "DEF"}, "city": "Hyd", "department": 1})

db.Emp_test.insert({"name": {"first": "GHI", "last": "JKL"}, "city": "Pune", "department": 2})

To insert data in Dept_Test Collection:

db.Dept_Test.insert({"_id": 1, "department": "SALESMAN"})

db.Dept_Test.insert({"_id": 2, "department": "CLERK"})

Now the requirement is to display FirstName, LastName and DepartmentName.

For this, we need to use map-reduce:

# 1 : Create two map functions, one for each collection.

var mapEmp_test = function () {
    var output = {departmentid: this.department, firstname: this.name.first, lastname: this.name.last, department: null};
    emit(this.department, output);
};

var mapDept_Test = function () {
    var output = {departmentid: this._id, firstname: null, lastname: null, department: this.department};
    emit(this._id, output);
};

# 2 : Write the reduce logic to display the required fields:

var reduceF = function(key, values) {
    var outs = {firstname: null, lastname: null, department: null};
    values.forEach(function(v) {
        if (outs.firstname == null) { outs.firstname = v.firstname; }
        if (outs.lastname == null) { outs.lastname = v.lastname; }
        if (outs.department == null) { outs.department = v.department; }
    });
    return outs;
};

# 3 : Store the result into a different collection called emp_dept_test

result = db.Emp_test.mapReduce(mapEmp_test, reduceF, {out: {reduce: 'emp_dept_test'}})
result = db.Dept_Test.mapReduce(mapDept_Test, reduceF, {out: {reduce: 'emp_dept_test'}})

# 4 : Write the following command to get the combined result:

db.emp_dept_test.find()

Best Practices when designing & using iReport /Jaspersoft

This blog talks about the best practices which should be followed when creating reports using iReport or Jasper studio, deploying the same on Jaspersoft server, nomenclature to be used etc.

 

1) Report Margins:

When you develop reports for dashboards, it is advisable to keep all the margins at 0 pixels.

By default margins will be
Left margin         20
Right margin       20
Top margin         20
Bottom margin    20

Change the values to 0

Left margin         0
Right margin       0
Top margin         0
Bottom margin  0

Why?
Because, when set to 0, the report panels fit well when designing dashboards.

 

2) Bands to keep the components

Do not keep table or crosstab components in the Detail band. Keep all such components either in the Title band or in the Summary band as per the requirement. It is advisable to create custom bands to hold the different charts if you need to develop a report with multiple charts.

Why it is not recommended to keep the components in Detail band?

The Detail band loops until the end of the rows/data for the fields; hence, if you keep any other component there, it will also run inside that loop and give unexpected behaviour in iReport with bad output.

3) Parameter Naming conventions

It is advisable to use good naming conventions for parameters. For example, a parameter name could be param_paramName or p_paramName.

Eg : 1)  p_startDate 2) p_endDate

Other Naming conventions
You can apply the same convention when you create input controls, data source names, custom band names and data set names in iReport and the Jasper repository respectively.

Why ?

It makes it easy to differentiate the variables, parameters, group names, etc.

4) Remove the other bands which you are not going to use in iReport.

5) Variables and Parameter usage in iReport

Make use of internal parameters for the report; for the summation of columns it is recommended to use variables.

6) Jasper Project Folder Structure

Project Name

      Archive (take a backup of JRXMLs in this folder, with a version number, if you are going to update/modify them)
      Resources
      Input Controls (all your parameter names for the project/various reports)
      Data Sources (this folder is useful when you have multiple databases to use in your project)
      Files (keep all your data source files here, e.g. Excel, CSV, XML, etc.)
      JRXMLs (whatever JRXMLs you are creating, you can keep all of them in this folder)
      Sub Reports (keep all your sub reports in this folder and refer to them from here wherever you want)
      Images (keep all your images in this folder, for easy understanding)
      Reports (keep all your reports in this folder)
      Dashboards (save all your dashboards here)
      Temp (for temporary files)
      Test (experiment in this folder at the time of report development)

Note that if your project has a lot of reports belonging to different sections/departments, it is advisable to subdivide the Reports folder into further folders.
For example:
Reports
    A.Department-1
         1.Report Name
         2.Report Name
    B.Department-2
        1.Report Name
         2.Report Name

NOTE:

When you upload a JRXML to the repository, it is recommended to write a description of the report. By reading it, everyone can easily understand the purpose of the report/visualization.

7) Export / Import Utility

Command line utility to import/export/update folders/reports from the jasper server is given below.

Importing
js-import --input-zip <Filename>
Ex: js-import --input-zip "E:\Work Space\Unified\Unified Reports\<file name>"

Updating
js-import --input-zip "E:\Work Space\Unified\Unified Reports\<file name>" --update

Exporting
js-export <location of the folder in jasper server> --output-zip <location of exporting folder>/<exporting_filename.zip>

8) Bands

Title band:

  • Every report must have a name; give the name of the report in this band.

  • A blue background with white font is preferable for titles.

  • It is recommended to place the company logo on the left side of the Title band, under the title of the report.

Page Header:

  • The page header holds page numbers and date-type information. It is recommended to give page header information for long reports with heavy text involved.

Column Header:

  • This band is used for giving column headers for the fields. You can change the font style and size, give borders, background colours, etc.

Detail band:

  • The Detail band is used to display the output of the report using the fields fetched by the query.

  • You need to drag and drop the required fields into the Detail band to create the report and format them accordingly.

  • The Detail band runs in a loop, so we should keep only fields in this band rather than any other component like table, crosstab or chart components.

Column footer:

  • This band is used to find the total, max or min of the columns from the Detail band.

  • You need to create variables for this and drag those variables under the column where you want to see the sum, max or min.

Page footer:

  • The page footer is used to place page numbers, confidential-type text for the company, etc.

Summary:

  • The summary of the report is placed in the Summary band.

  • Generally we keep the chart component, table component or crosstab component here to summarize the report.

9) Why should we keep input controls and data sources in the resources folder?

  

Input controls in repository:

Create all your input controls in the resources folder so that you do not need to create the same input controls again for every report. You just need to link the existing input control from the repository folder.

Data sources in repository:
It is considered a best practice to create data source connections in a folder called resources and use these data sources for the reports. It will reduce report development time. You do not need to import database connections from iReport once you create the connection in the repository.

For any Jaspersoft, ireport, jasper studio, jasperserver or Open source DWBI requirement, please get in touch : [email protected], www.helicaltech.com

Pentaho 5.0.1 CE integration with MySQL 5.5 (Windows or Linux)

Parts

  1. Creating databases
  2. Modifying configuration files
  3. Stopping HSQL db start up

Creating databases

Command to execute the scripting files

mysql> source D:\biserver-ce\data\mysql5\create_jcr_mysql.sql

Similarly, execute the remaining .sql files (i.e. create_quartz_mysql.sql and create_repository_mysql.sql).

Check that the databases were created using the show databases command at the MySQL command prompt.

 

Modifying configuration files

1. applicationContext-spring-security-hibernate.properties.

Edit the file pentaho-solutions\system\applicationContext-spring-security-hibernate.properties.

Original code

jdbc.driver=org.hsqldb.jdbcDriver

jdbc.url=jdbc:hsqldb:hsql://localhost:9001/hibernate

jdbc.username=hibuser

jdbc.password=password

hibernate.dialect=org.hibernate.dialect.HSQLDialect

Modified code

jdbc.driver=com.mysql.jdbc.Driver

jdbc.url=jdbc:mysql://localhost:3306/hibernate

jdbc.username=hibuser

jdbc.password=password

hibernate.dialect=org.hibernate.dialect.MySQLDialect

  2. hibernate-settings.xml

Edit the file pentaho-solutions\system\hibernate\hibernate-settings.xml.

Original code

<config-file>system/hibernate/hsql.hibernate.cfg.xml</config-file>

Modified code

<config-file>system/hibernate/mysql5.hibernate.cfg.xml</config-file>

 

3. mysql5.hibernate.cfg.xml

Location of the file: pentaho-solutions\system\hibernate\mysql5.hibernate.cfg.xml

You do not need to change any code in this file; just check that everything is correct:

<property name="connection.driver_class">com.mysql.jdbc.Driver</property>

<property name="connection.url">jdbc:mysql://localhost:3306/hibernate</property>

<property name="dialect">org.hibernate.dialect.MySQL5InnoDBDialect</property>

<property name="connection.username">hibuser</property>

<property name="connection.password">password</property>

4. quartz.properties

Location of the file: pentaho-solutions\system\quartz\quartz.properties

Original Code

org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.PostgreSQLDelegate

Modified Code

org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate

5. context.xml

Location of the file: tomcat\webapps\pentaho\META-INF\context.xml

Original Code

<Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"

factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"

maxWait="10000" username="hibuser" password="password"

driverClassName="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:hsql://localhost/hibernate"

validationQuery="select count(*) from INFORMATION_SCHEMA.SYSTEM_SEQUENCES" />

 

<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"

factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"

maxWait="10000" username="pentaho_user" password="password"

driverClassName="org.hsqldb.jdbcDriver" url="jdbc:hsqldb:hsql://localhost/quartz"

validationQuery="select count(*) from INFORMATION_SCHEMA.SYSTEM_SEQUENCES"/>

Modified Code

<Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"

factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"

maxWait="10000" username="hibuser" password="password"

driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/hibernate"

validationQuery="select 1" />

 

<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"

factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"

maxWait="10000" username="pentaho_user" password="password"

driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/quartz"

validationQuery="select 1"/>

Important Note:

Delete pentaho.xml file in below location

tomcat\conf\Catalina\localhost\pentaho.xml

Reason:

On startup, Pentaho creates pentaho.xml as a copy of context.xml.

6. repository.xml

Location of the file: pentaho-solutions\system\jackrabbit\repository.xml.

"Comment this code" means wrapping it in comment markers (<!-- everything here -->).

"Activate this code" means removing the comment markers.

i) FileSystem part

Comment this code

<FileSystem>

<param name="path" value="${rep.home}/repository"/>

</FileSystem>

Active this code

<FileSystem>

<param name="driver" value="com.mysql.jdbc.Driver"/>

<param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>

<param name="user" value="jcr_user"/>

<param name="password" value="password"/>

<param name="schema" value="mysql"/>

<param name="schemaObjectPrefix" value="fs_repos_"/>

</FileSystem>

ii) DataStore part

Comment this code

<DataStore/>

Active this code

<DataStore class="org.apache.jackrabbit.core.data.db.DbDataStore">

   <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>

   <param name="user" value="jcr_user"/>

   <param name="password" value="password"/>

   <param name="databaseType" value="mysql"/>

   <param name="driver" value="com.mysql.jdbc.Driver"/>

   <param name="minRecordLength" value="1024"/>

   <param name="maxConnections" value="3"/>

   <param name="copyWhenReading" value="true"/>

   <param name="tablePrefix" value=""/>

   <param name="schemaObjectPrefix" value="ds_repos_"/>

 </DataStore>

iii) Security part in the FileSystem Workspace part

Comment this code

<FileSystem>

<param name="path" value="${wsp.home}"/>

</FileSystem>

   Active this code

<FileSystem>

<param name="driver" value="com.mysql.jdbc.Driver"/>

<param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>

<param name="user" value="jcr_user"/>

<param name="password" value="password"/>

<param name="schema" value="mysql"/>

<param name="schemaObjectPrefix" value="fs_ws_"/>

</FileSystem>

iv) PersistenceManager part

Comment this code

<PersistenceManager>

<param name="url" value="jdbc:h2:${wsp.home}/db"/>

<param name="schemaObjectPrefix" value="${wsp.name}_"/>

</PersistenceManager>

Active this code

<PersistenceManager>

<param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>

<param name="user" value="jcr_user" />

<param name="password" value="password" />

<param name="schema" value="mysql"/>

<param name="schemaObjectPrefix" value="${wsp.name}_pm_ws_"/>

</PersistenceManager>

v) FileSystem Versioning part

Comment this code

<FileSystem>

<param name="path" value="${rep.home}/version" />

</FileSystem>

 

Active this code

<FileSystem>

<param name="driver" value="com.mysql.jdbc.Driver"/>

<param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>

<param name="user" value="jcr_user"/>

<param name="password" value="password"/>

<param name="schema" value="mysql"/>

<param name="schemaObjectPrefix" value="fs_ver_"/>

</FileSystem>

vi) PersistenceManager Versioning part

 

Comment this code:

 

<PersistenceManager>

<param name="url" value="jdbc:h2:${rep.home}/version/db"/>

<param name="schemaObjectPrefix" value="version_"/>

</PersistenceManager>

Active this code:

<PersistenceManager>

<param name="url" value="jdbc:mysql://localhost:3306/jackrabbit"/>

<param name="user" value="jcr_user" />

<param name="password" value="password" />

<param name="schema" value="mysql"/>

<param name="schemaObjectPrefix" value="pm_ver_"/>

</PersistenceManager>

Stopping HSQL db start up

In web.xml file

Comment or delete this code (Commenting is preferable)

<!-- [BEGIN HSQLDB DATABASES] -->

<context-param>

<param-name>hsqldb-databases</param-name>

<param-value>sampledata@../../data/hsqldb/sampledata,hibernate@../../data/hsqldb/hibernate,quartz@../../data/hsqldb/quartz</param-value>

</context-param>

<!-- [END HSQLDB DATABASES] -->

 

Also comment this code

<!-- [BEGIN HSQLDB STARTER] -->

<listener>

<listener-class>org.pentaho.platform.web.http.context.HsqldbStartupListener</listener-class>

</listener>

<!-- [END HSQLDB STARTER] -->

 

 

You are now done with integrating Pentaho 5.0.1 CE with MySQL 5.5.

Now log in to the Pentaho server.

URL:  http://localhost:8080/pentaho

Username/Password : Admin/password

NOTE:

  • You will not find any samples working, because you have not installed the sample data.
  • The examples available in Pentaho are developed on the sample data, so you need to execute the sample data .sql file and create the sample database connections.

Helical IT Solutions

Inter panel communication in pentaho CDE


In this post you will learn how to communicate with the other panels in the same dashboard.

Example Scenario :
Panel 1 : BarChart is placed in panel-1
Panel 2: Table data displayed in panel-2

Functionality: The expected functionality is that when we click on any of the bars (panel 1), panel 2 should get updated and the corresponding data should be displayed there. To achieve this, make use of parameters, the clickAction property and listeners.

Example Developed on :
Pentaho 5.0 CE server
Pentaho C-Tools version : 13.09
Database : postgresql-foodmart( jasper food mart database)

Steps:
1) Create a parameter (let's say param1_position_title)
2) Add it to listeners & parameters in your table component.
3) Add it to parameters in your data source for table query
Example:
SELECT
employee_id,
full_name,
position_id,
department_id
FROM
employee
WHERE
position_title=${param1_position_title}

4) Add to parameters in your chart(bar chart).

5)  BarChart query example:

SELECT DISTINCT
position_title AS position,
sum(salary) AS salary
FROM employee
GROUP BY position_title

6) In the chart properties
clickable = True
clickAction
function fun()
{
Dashboards.fireChange('param1_position_title', this.scene.atoms.category.value);
}

Get in touch with us for any Pentaho related consultation, query

Pentaho C-Tools manual installation in CE 4.8 and 5.0 – jar files installation

Hi guys,
Sometimes we cannot access the Pentaho Marketplace because of proxy settings. I was trying it and was unable to fix it, and will have a deeper look into it, but alternatively I installed Pentaho C-Tools manually.

Find the bug link below.

http://forums.pentaho.com/showthread.php?153500-Issue-Pentaho-Market-Place-in-4-8-server-Not-connecting-to-server-amp-5-0-install

There was some problem with proxy internet access; I could not figure it out exactly, so I thought of installing C-Tools manually.
Hence I gave it a try on Pentaho CE 4.8 and installed it successfully.

Manual installation of C-Tools in 4.8 CE
Steps :
1) Download C-Tools from the Webdetails site

CDA link : http://www.webdetails.pt/ctools/cda.html
CDE link : http://www.webdetails.pt/ctools/cde.html
CDF link : http://www.webdetails.pt/ctools/cdf.html

2) These are executable jar files; run them just as you would generally install Windows-based software.

3) Importantly, you need to locate the installation path.
Generally, give your pentaho-solutions path.

4) Now, start the server and find the CDE icon on the toolbar of Pentaho.

Note that you may not get all the CDE examples with this way of installation; I need to figure out how the samples should be installed.

Manual installation of C-Tools in 5.0 CE
I have not tried this installation yet, but you can refer to Pedro's blog to install the C-Tools manually in the BA server.

http://pedroalves-bi.blogspot.pt/2013/11/ctools-for-pentaho-50-is-available-cdf.html

Extract year, quarter, month & day from a date input control in Pentaho CDE using JavaScript – MDX Query Scenario

Hello guys…!!

Sometimes you need to extract the parts (year, quarter, month, day) of a date for some specific use.
For example:
Assume you are creating a report with an MDX query which has a dimension called "Date" having levels "Year", "Quarter", "Month" & "Day".

(Note: Assume your schema has
Year: yyyy, Quarter: 1, 2, 3 or 4, Month: 1, 2, 3, ..., 12, Day: 1, 2, 3, ..., 31)

Also assume you do not have a direct date dimension in your schema (i.e. you do not have a dimension which takes a 'yyyy-MMM-dd' column).

But you need to display a date (yyyy-QQ-MMM-dd) or (yyyy-MMM-dd) on the X-axis of a chart. Remember, you do not have any direct date in your schema, but you have "Date" with Year, Quarter, Month & Day as levels.

From your start_date (or end_date) input control you can extract the individual parts using the following JavaScript in CDE and use them in the date-range part of your MDX.

This should be done in the "Pre Execution" section of the chart component:

function extract_function() {

    var q = ['Q1', 'Q2', 'Q3', 'Q4'];
    var m = ['JAN', 'FEB', 'MAR', 'APR', 'MAY', 'JUN', 'JUL', 'AUG', 'SEP', 'OCT', 'NOV', 'DEC'];

    tmp_date = new Date(param_start_date);
    param_start_year    = tmp_date.getFullYear();
    param_start_quarter = q[Math.floor((tmp_date.getMonth() + 3) / 3) - 1];
    param_start_month   = m[tmp_date.getMonth()];
    param_start_day     = tmp_date.getDate();

    tmp_date = new Date(param_end_date);
    param_end_year    = tmp_date.getFullYear();
    param_end_quarter = q[Math.floor((tmp_date.getMonth() + 3) / 3) - 1];
    param_end_month   = m[tmp_date.getMonth()];
    param_end_day     = tmp_date.getDate();

}

NOTE:
* The q and m variables are arrays holding the default quarter and month labels.
* The month number is mapped through the array, so 1 becomes JAN, 2 becomes FEB, and so on.
* The quarter number is mapped the same way, so 1 becomes Q1, 2 becomes Q2, 3 becomes Q3 & 4 becomes Q4.
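
As a hypothetical illustration (the level names and the way the parameters are consumed are assumptions, not from the original post), the extracted parts can then be concatenated into MDX member strings and spliced into the date-range portion of the query:

// Build MDX members such as [Date].[2014].[Q1].[JAN].[15] from the extracted parts.
var mdxStart = "[Date].[" + param_start_year + "].[" + param_start_quarter + "].["
             + param_start_month + "].[" + param_start_day + "]";
var mdxEnd   = "[Date].[" + param_end_year + "].[" + param_end_quarter + "].["
             + param_end_month + "].[" + param_end_day + "]";
// The MDX range can then be written as { mdxStart : mdxEnd } inside the query string.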

Forget about your problems …!!!! and Meet us @   http://www.helicaltech.com/contact.php

Sadakar

BI developer

 

Pentaho BI Server Community 4.8 installation in an existing Tomcat with PostgreSQL on Linux/Ubuntu

This post teaches you how to install Pentaho BI Server Community 4.8 in an existing Tomcat with PostgreSQL on Linux/Ubuntu.

I've gone through many posts but could not find all the stuff in a single place, so I worked it out and am sharing the experience.
If you find any difficulty with the steps below, feel free to drop a mail to [email protected] for help.

Prerequisites :
1. Pentaho BI server CE 4.8.0 stable
2. Tomcat 6 server
3. PostgreSQL
4. PuTTY/WinSCP

1)  Download the biserver-stable-4.8.0 using the following command in some folder.
Syntax :
wget URLOfTheDownloadLocation
Example:
wget http://sourceforge.net/projects/pentaho/files/Business%20Intelligence%20Server/4.8.0-stable/biserver-ce-4.8.0-stable.zip

2) After the download completes, unzip it using the unzip command.
Syntax:
unzip zipFileName
Example:
unzip biserver-ce-4.8.0-stable.zip
After unzipping you can find two folders. They are i) administration-console & ii) biserver-ce

3) Install the Tomcat server externally (archive-based installation) in your favourite location.
Example:
I'm taking the JasperServer Tomcat to install the Pentaho server.
[email protected]:/opt/jasperreports-server-cp-5.0.0/apache-tomcat#

4) Executing .sql files in PostgreSQL
* You need to build two databases: i) hibernate & ii) quartz
* Reason: you are going to install the Pentaho BI server with PostgreSQL (not with the HSQL that comes directly with the download), hence you need to build these two databases for the Pentaho server to work properly.
* Where can you find the .sql scripting files?
Check in the location :

/biserver-ce/data/postgresql
( biserver-ce is the folder where you unzipped in step-2)

Scripting file names:

create_quartz_postgresql.sql
create_repository_postgresql.sql
create_sample_datasource_postgresql.sql
migrate_quartz_postgresql.sql
migration.sql
Commands to run the .sql files from putty :

[email protected]:/opt/jasperreports-server-cp-5.0.0/postgresql/bin# ./psql -U postgres -p 5432 -a -f /home/sadakar/softwares/pentaho/biserver-ce/data/postgresql/
create_quartz_postgresql.sql

In a similar way, execute the remaining scripting files; you just need to change the file name in the above command.

Important points to note when you run the script files:
* You need to go to the "bin" folder of the PostgreSQL installation and run the above command.
* In my case I'm using the PostgreSQL that was installed with JasperServer.
* In the above command, -U is the user name and -p is the port number of PostgreSQL.
* You must specify -a -f in the command, otherwise the script will not run.
* When you run the script it will ask you for the PostgreSQL password: give the password as "password". If you use any other password for postgres, give that password.
* When you run the script it will ask for database user names. Open the script files in your favourite editor and find this line:

CREATE USER pentaho_user PASSWORD 'password';

This means that for the quartz database the password is "password", and the same applies to the other scripting files while executing them.

NOTE:
* Once you execute all the scripting files, check in PostgreSQL whether the "hibernate" and "quartz" databases were created or not.
* If you do not find the databases, you might have done something wrong somewhere; cross-check the steps again.
* You should find 12 tables in the "quartz" database and 1 table in the "hibernate" database.

Hmm... You are not actually done with the databases, because you do not yet have all the tables in the "hibernate" database; the scripting files do not contain all the tables & data.

I'll give you the links to run the scripting files to get the tables.
At present do not think about it; you will find this in the following steps!

5. Changes in the context.xml file of the Tomcat server
* You need to add the following code to the context.xml file.
* Location of the file: tomcat/conf/context.xml
* In my case the location is:
[email protected]:/opt/jasperreports-server-cp-5.0.0/apache-tomcat/conf#
<Context>
<WatchedResource>WEB-INF/web.xml</WatchedResource>

<Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="hibuser" password="password"
driverClassName="org.postgresql.Driver" url="jdbc:postgresql://localhost:5432/hibernate"
validationQuery="select 1" />

<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="pentaho_user" password="password"
driverClassName="org.postgresql.Driver" url="jdbc:postgresql://localhost:5432/quartz"
validationQuery="select 1"/>
</Context>
6. Adding postgresql-driver in the lib folder of tomcat
* You need to copy the postgresql-driver in the lib folder of tomcat
* location of the lib folder for tomcat is :  tomcat/lib
* In my example it is there at
[email protected]:/opt/jasperreports-server-cp-5.0.0/apache-tomcat/lib# 
* You can directly download the PostgreSQL driver using the following command, or copy and paste it into the lib folder if you already have it somewhere else on your machine.
* Command is :
wget jdbc.postgresql.org/download/postgresql-9.2-1003.jdbc4.jar
 
7. Changes needed inside the pentaho-solutions folder

* This is quite an interesting part to work on.
* Before you modify the pentaho-solutions folder, copy it next to the Tomcat installation location (you can keep this folder anywhere you want).
* For example: I have copied this folder from the biserver-ce folder (from step 2 of this article) to the same location where Tomcat is installed.
i.e.,  At [email protected]:/opt/jasperreports-server-cp-5.0.0# ls

apache-ant     common        installation.log  license.txt        properties.ini    scripts                  uninstall.dat
apache-tomcat  ctlscript.sh  java              pentaho-solutions  releaseNotes.txt  Third-Party-Notices.pdf
buildomatic    docs          licenses          postgresql         samples           uninstall

* You need to configure the settings for PostgreSQL in the applicationContext-spring-security-jdbc.xml file.
* Location of this file: pentaho-solutions/system/applicationContext-spring-security-jdbc.xml

<bean id="dataSource">
<property name="driverClassName" value="org.postgresql.Driver" />
<property name="url" value="jdbc:postgresql://localhost:5432/hibernate" />
<property name="username" value="hibuser" />
<property name="password" value="password" />
</bean>

* Next, you need to configure the settings in applicationContext-spring-security-hibernate.properties.
* Location of this file: pentaho-solutions/system/applicationContext-spring-security-hibernate.properties

jdbc.driver=org.postgresql.Driver
jdbc.url=jdbc:postgresql://localhost:5432/hibernate
jdbc.username=hibuser
jdbc.password=password
hibernate.dialect=org.hibernate.dialect.PostgreSQLDialect

8. Changes needed in the hibernate folder
Navigate to the "hibernate" folder inside the "system" folder of the same "pentaho-solutions" folder.
* You'll find different .xml files for different databases.
* You need to touch
i) hibernate-settings.xml and
ii) postgresql.hibernate.cfg.xml, i.e. you need to make some modifications in these two files.
Changes in :
i) hibernate-settings.xml file
Comment this line
<config-file>system/hibernate/hsql.hibernate.cfg.xml</config-file>

Enable this line
<config-file>system/hibernate/postgresql.hibernate.cfg.xml</config-file>

ii) postgresql.hibernate.cfg.xml

* You do not need to make any modifications in this file, but keep an eye on it:
if your PostgreSQL port number is different from 5432, give yours, and give the appropriate host name if you use a different host.

9. Changes in context.xml file of META-INF folder of tomcat
* You need to modify the “context.xml” file located in the tomcat/webapps/pentaho/META-INF folder.
* In my example: It is located at

[email protected]:/opt/jasperreports-server-cp-5.0.0/apache-tomcat/webapps/pentaho/META-INF#

<Context path="/pentaho" docBase="webapps/pentaho/">
<Resource name="jdbc/Hibernate" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="hibuser" password="password"
driverClassName="org.postgresql.Driver" url="jdbc:postgresql://localhost:5432/hibernate"
validationQuery="select 1" />

<Resource name="jdbc/Quartz" auth="Container" type="javax.sql.DataSource"
factory="org.apache.commons.dbcp.BasicDataSourceFactory" maxActive="20" maxIdle="5"
maxWait="10000" username="pentaho_user" password="password"
driverClassName="org.postgresql.Driver" url="jdbc:postgresql://localhost:5432/quartz"
validationQuery="select 1"/>
</Context>

NOTE: We deployed the "pentaho" and "pentaho-style" folders in the webapps folder of the Tomcat server.

10. Changes in the web.xml file of the WEB-INF folder of Tomcat
You need to modify the web.xml file of the WEB-INF folder of the Tomcat server, i.e. tomcat/webapps/pentaho/WEB-INF/web.xml
* In my example the location of the file is :
[email protected]:/opt/jasperreports-server-cp-5.0.0/apache-tomcat/webapps/pentaho/WEB-INF#

<context-param>
<param-name>solution-path</param-name>
<param-value>/opt/jasperreports-server-cp-5.0.0/pentaho-solutions</param-value>
</context-param>

NOTE: give the path to the "pentaho-solutions" folder between the <param-value> and </param-value> tags.

* You also need to check the port number & URL for the pentaho server in the same web.xml file.
<context-param>
<param-name>fully-qualified-server-url</param-name>
<param-value>http://localhost:9090/pentaho/</param-value>
</context-param>

NOTE: if you use some other port number for tomcat other than 8080 , you must specify the port number as shown above.

11. Tomcat server shutdown & startup
* Go to the bin folder of the Tomcat server and shut down the server if it is already running.
* Start the Tomcat server.
* Commands :
Shutdown: ./shutdown.sh
Startup :   ./startup.sh

12. Type the Pentaho server URL in a browser
* Open the URL in any browser (Mozilla Firefox is preferable as it has the Firebug facility to track errors if you get any).

Meet us if you have a business @ http://www.helicaltech.com/contact.php

Sadakar(BI developer)