Tuesday, September 21, 2010


PMML model format for deployment with Pentaho Data Integration

Model deployment is an important step in data mining. In supervised learning, it consists of making predictions by applying a model to unlabeled observations. We have described the procedure several times for different tools (e.g. Tanagra, Sipina, Spad, or R). What they have in common is that the same software is used both to build the model and to deploy it.

This new tutorial differs from the earlier ones in that we use third-party software to classify new observations. It follows a remark made to me by LUCELLE Loïc (thank you very much, Loïc, for your valuable information), which made me realize two things. First, deployment shows its full value when it is carried out with a tool dedicated to data management; we take PDI-CE (Pentaho Data Integration Community Edition, also known as Kettle) as an example. Second, we reach a certain universality when models are described with a standard recognized by most software, namely the PMML description standard.

I have already written about PMML several times. Until now, however, I did not see much value in it without a downstream tool able to read it in a generic way. In this tutorial we will see that it is possible to build a decision tree with different tools (Sipina, Knime, and RapidMiner), export it in the PMML format, and deploy it on unlabeled observations with PDI-CE, whichever tool produced it. Adopting a standard model description becomes particularly attractive in this case.

As an aside, we also describe alternative deployment solutions in this tutorial. We will see that Knime has its own PMML interpreter. It can apply a model to new data whatever tool was used to build the model, provided the PMML standard is respected. In this sense, Knime can substitute for PDI-CE. Another possible route is Weka: it is part of the "Pentaho Community Edition" suite, and its proprietary model format is recognized directly by PDI-CE.
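To make the idea of a generic PMML interpreter concrete, here is a minimal sketch in Python of what any deployment tool has to do: read the tree from the PMML file, then walk it for each unlabeled observation. It is only an illustration, not how PDI or Knime are implemented. The PMML fragment is hand-written for the example, the field names (`chest_pain`, `max_rate`, `disease`) and thresholds are hypothetical and not taken from heart-pmml.zip, and only `True` and `SimplePredicate` tests are handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PMML fragment (not exported by any tool named above).
PMML = """<PMML version="4.0" xmlns="http://www.dmg.org/PMML-4_0">
  <DataDictionary numberOfFields="3">
    <DataField name="max_rate" optype="continuous" dataType="double"/>
    <DataField name="chest_pain" optype="categorical" dataType="string"/>
    <DataField name="disease" optype="categorical" dataType="string"/>
  </DataDictionary>
  <TreeModel functionName="classification">
    <MiningSchema>
      <MiningField name="max_rate"/>
      <MiningField name="chest_pain"/>
      <MiningField name="disease" usageType="predicted"/>
    </MiningSchema>
    <Node score="absence">
      <True/>
      <Node score="presence">
        <SimplePredicate field="chest_pain" operator="equal" value="asympt"/>
        <Node score="presence">
          <SimplePredicate field="max_rate" operator="lessOrEqual" value="150"/>
        </Node>
        <Node score="absence">
          <SimplePredicate field="max_rate" operator="greaterThan" value="150"/>
        </Node>
      </Node>
      <Node score="absence">
        <SimplePredicate field="chest_pain" operator="notEqual" value="asympt"/>
      </Node>
    </Node>
  </TreeModel>
</PMML>"""

# Comparison operators supported by this sketch.
OPS = {
    "equal": lambda a, b: a == b,
    "notEqual": lambda a, b: a != b,
    "lessThan": lambda a, b: a < b,
    "lessOrEqual": lambda a, b: a <= b,
    "greaterThan": lambda a, b: a > b,
    "greaterOrEqual": lambda a, b: a >= b,
}


def local(tag):
    """Drop the XML namespace so the code does not depend on the PMML version."""
    return tag.rsplit("}", 1)[-1]


def matches(node, record):
    """Evaluate the predicate attached to a tree node for one observation."""
    for child in node:
        name = local(child.tag)
        if name == "True":
            return True
        if name == "SimplePredicate":
            value = record[child.get("field")]
            ref = child.get("value")
            if isinstance(value, (int, float)):
                ref = float(ref)  # compare numeric fields as numbers
            return OPS[child.get("operator")](value, ref)
    raise ValueError("predicate type not supported by this sketch")


def score(node, record):
    """Descend into the first child whose predicate holds.

    Assumption: when no child matches, the current node's score is returned.
    """
    for child in node:
        if local(child.tag) == "Node" and matches(child, record):
            return score(child, record)
    return node.get("score")


root = ET.fromstring(PMML)
tree = next(e for e in root.iter() if local(e.tag) == "TreeModel")
top = next(c for c in tree if local(c.tag) == "Node")

# Unlabeled observations to classify (hypothetical values).
new_obs = [
    {"chest_pain": "asympt", "max_rate": 140},
    {"chest_pain": "asympt", "max_rate": 170},
    {"chest_pain": "atyp_angina", "max_rate": 120},
]
predictions = [score(top, obs) for obs in new_obs]
print(predictions)  # ['presence', 'absence', 'absence']
```

Nothing in this scoring code depends on the software that built the tree. That is precisely what makes a standard description attractive: the same interpreter works for any file that respects the PMML standard.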

Keywords: deployment, PMML, decision trees, RapidMiner 5.0.10, Weka 3.7.2, Knime 2.1.1, Sipina 3.4
Tutorial: fr_Tanagra_PDI_Model_Deployment.pdf
Data: heart-pmml.zip
References:
Data Mining Group, "PMML standard"
Pentaho, "Pentaho Kettle Project"
Pentaho, "Using the Weka Scoring Plugin"
