Thursday, December 23, 2010

Hoodies You Can See Through

PCA with FactoMineR and dynGraph

There are two ways of looking at the graphical representation of data in data mining. The first is to consider it as a tool for presenting results. The graph supports the text and tables to highlight the information produced by the analysis. For example, the text states that sales of caps increase in winter, and a small curve showing sales peaks at the end and the beginning of the year confirms it.

The second seeks to integrate the graph into the exploratory process itself. Here, it becomes an additional tool for detecting patterns, peculiarities and relationships that may exist in the data. In this regard, modern software, with increasingly powerful graphics capabilities, opens up incredible opportunities. As I often say, a well-chosen graph is worth much more than a series of ratios that are confusing to interpret or poorly understood.

In this tutorial, we conduct a principal component analysis with the R software. We had already done so previously with the princomp() procedure. Here we repeat the study with the PCA() procedure of the FactoMineR package. Many indicators on the elements (variables, individuals), whether active or illustrative, are now provided directly, which greatly eases the practitioner's task. It is no longer necessary to compute them afterwards using more or less complex formulas, as we did in the previous document. Subsequently, on the basis of the indicators delivered by PCA(), we conduct a graphical exploration using the dynGraph tool from the package of the same name. We find that the opportunities for interactive analysis are numerous.

Keywords: R software, principal component analysis, PCA, correlation circle, illustrative variables, FactoMineR, dynGraph, interactive graphical analysis
Components: PCA, dynGraph
Link: acp_avec_factominer_dyngraph.pdf
Data: acp_avec_factominer_dyngraph.zip
References:
G. Saporta, "Probability, Data Analysis and Statistics", Dunod, 2006, pages 155 to 179.
Tanagra tutorial, "PCA - Description of vehicles"
F. Husson, J. Josse, S. Le, J. Pages, "FactoMineR", a package for R; http://factominer.free.fr/
S. Le, J. Durand, "dynGraph", a package for R; http://dyngraph.free.fr/

Sunday, December 19, 2010

Honeywell Chronotherm Iv Plus Energy Plus Manual

Languages and tools for application development

A slightly different tutorial this time: I discuss tools and programming languages for developing data mining applications.

Starting a discussion about "the best programming language" is an excellent way to liven up an evening among computer scientists. The underlying question is "which language produces the most efficient, fastest applications...?".

Good-natured at first, the atmosphere quickly becomes stormy, even toxic. Some people, very charming for the most part, become heated, even impassioned, and get on their high horse (tagada, tagada), asserting arguments that are sometimes completely irrational. I know whereof I speak: I am the same when I let myself go. Yet, in the end, settling this kind of debate would be fairly easy. It suffices to characterize the problems we want to solve, write equivalent code in different languages, and study the behavior of the generated executable. That is what we do in this tutorial, placing ourselves in two situations commonly encountered when programming data mining algorithms. We will see that the result is not at all the one expected (if one was expecting any; oh dear, I can already see some readers jumping up), far from it.

First, let us correct a misuse of language (so to speak): performance is not a matter of language, but rather of technology and compiler. As we will see, the same source code, compiled with different tools, can lead to executables with very different behaviors. We study in this document: C# with Microsoft Visual C# Express Edition; Pascal with Borland Delphi 6.0; Pascal with the Free Pascal Compiler 2.2.4 (Lazarus 0.9.28); C++ with Borland C++ Builder 4; C++ with Dev-C++ (G++ compiler); Java executed via JRE 1.6.0_19 on Windows (I used Eclipse as the development tool). All these tools, except Borland C++ Builder 4, are freely available on the net. For all of them, I selected the compilation options that optimize for speed.

Performance is evaluated by measuring the computation time of the executables, launched from the shell outside the IDE (Integrated Development Environment) to avoid interference. My machine is multi-core; user time and CPU time are almost the same, so we settle for the former. Each program is run 10 times, and we compute the average.
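The measurement protocol described above (launch the executable outside the IDE, repeat 10 times, average) can be sketched as a small harness. This is only an illustration in Python, not the procedure used in the tutorial: the function names `time_command` and `mean_time` are hypothetical, the timing is wall-clock elapsed time, and the demo command is a stand-in for one of the compiled benchmark executables.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=10):
    """Run `cmd` `runs` times; return the elapsed wall-clock time of each run (seconds)."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded so that console printing does not distort the timing.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def mean_time(cmd, runs=10):
    """Average elapsed time of `cmd` over `runs` launches."""
    return statistics.mean(time_command(cmd, runs))


if __name__ == "__main__":
    # Hypothetical stand-in for a benchmark executable: a short Python one-liner.
    demo_cmd = [sys.executable, "-c", "sum(i * i for i in range(100000))"]
    print(f"mean over 10 runs: {mean_time(demo_cmd):.4f} s")
```

Replacing `demo_cmd` with the path to each compiled program gives comparable averages, provided every program is launched on the same machine under the same load.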

Keywords: programming language, C++, C#, Delphi, Pascal, Java
Tutorial: fr_Tanagra_Programming_Language.pdf
Source Code: programming_language.zip

Wednesday, December 15, 2010

Cineplex Brampton Ticket Prices

Association Rules - Transactional Data

Association rule mining is one of the flagship applications of data mining. The idea is to uncover regularities, in the form of co-occurrences, in the database. The emblematic example is the analysis of supermarket receipts: the goal is to discover behavioral rules such as "if the customer bought diapers and wipes, then he will also buy growing-up milk." In that case it may be appropriate to place the corresponding shelves in the same area of the store (this is the case in the supermarket I usually go to). The "if" part of the rule is called the "antecedent", the "then" part the "consequent".

It is possible to find co-occurrences in the individuals × variables tables handled by the usual data mining software. But often, especially for the induction of association rules, the data comes in the form of a transactional database. If we take the example of receipt analysis, we have a list of products per cart.

This data representation is quite natural given the problem we want to address. It also has the advantage of being more compact, since only the products actually observed in each cart are listed. We need not concern ourselves with the products that are absent, especially since they can be very numerous given the number of items a supermarket chain can offer.
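To make the transactional representation concrete, here is a minimal Python sketch on invented toy carts (the data and the function names `support` and `rule_metrics` are assumptions for illustration, not taken from the tutorial). It evaluates one rule with the usual support, confidence and lift measures.

```python
def support(transactions, items):
    """Fraction of carts that contain every product in `items`."""
    items = set(items)
    return sum(1 for cart in transactions if items <= cart) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) of the rule antecedent -> consequent.

    Assumes the antecedent and the consequent each appear in at least one cart.
    """
    sup_rule = support(transactions, set(antecedent) | set(consequent))
    confidence = sup_rule / support(transactions, antecedent)
    lift = confidence / support(transactions, consequent)
    return sup_rule, confidence, lift


# Toy transactional data: each cart lists only the products it actually contains.
carts = [
    {"diapers", "wipes", "milk"},
    {"diapers", "wipes", "milk", "bread"},
    {"diapers", "wipes"},
    {"bread", "milk"},
    {"bread"},
]

sup, conf, lift = rule_metrics(carts, {"diapers", "wipes"}, {"milk"})
print(f"support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
# prints: support=0.40 confidence=0.67 lift=1.11
```

A lift above 1 means the consequent is bought more often in carts containing the antecedent than in carts overall.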

As natural as this mode of description is, it turns out that many programs cannot handle it directly. Curiously, we observe a real divide between commercial tools and those from academia. Most of the former can handle this file type. This is the case for SPAD 7.3 and SAS Enterprise Miner 4.3, which we study in this tutorial. The latter, however, require a prior transformation of the data. We use a VBA macro running in Excel to transform our transactional database into an "individuals × variables" binary table suitable for processing with Tanagra 1.4.37 and Knime 2.2.2. Note that we must respect the original specifications, i.e. focus only on rules indicating the simultaneous presence of products in shopping carts. There is no question, following a poorly controlled "present/absent" coding, of producing rules highlighting the simultaneous absence of certain products. This may be interesting in some cases, but it is not the purpose of our analysis.
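The transformation from transactional data to a binary "individuals × variables" table can be sketched as follows. The tutorial uses a VBA macro in Excel; this Python version is only a stand-in under assumptions: the input is taken to be a list of (cart identifier, product) pairs, and `transactions_to_binary` is a hypothetical name.

```python
def transactions_to_binary(rows):
    """Convert (cart_id, product) pairs into a 0/1 table.

    Returns the sorted list of products (the columns) and a dict mapping
    each cart_id to its row of 0/1 indicators, 1 meaning "product present".
    """
    carts = {}
    for cart_id, product in rows:
        carts.setdefault(cart_id, set()).add(product)

    products = sorted({p for items in carts.values() for p in items})
    table = {
        cart_id: [1 if p in items else 0 for p in products]
        for cart_id, items in carts.items()
    }
    return products, table


# Invented toy data in transactional form: one line per (cart, product).
rows = [
    (1, "diapers"), (1, "wipes"), (1, "milk"),
    (2, "bread"), (2, "milk"),
    (3, "diapers"),
]

products, table = transactions_to_binary(rows)
print(products)  # ['bread', 'diapers', 'milk', 'wipes']
for cart_id, indicators in table.items():
    print(cart_id, indicators)
# 1 [0, 1, 1, 1]
# 2 [1, 0, 1, 0]
# 3 [0, 1, 0, 0]
```

When mining rules from such a table, only the value 1 should be treated as an item, so that no rule is built on the simultaneous absence of products.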

Keywords: association rule, association rules, SPAD 7.3, SAS EM 4.3, Knime 2.2.2, filtering rules, lift
Components: A PRIORI
Tutorial: fr_Tanagra_Assoc_Rule_Transactions.pdf
Data: assoc_rule_transactions.zip
References:
Wikipedia, "Association rule learning"

Tuesday, December 14, 2010

» Tom Et Lola «

EPISODE 14 SEASON 6 (VO)

14/14 EPISODE SEASON 6 (VO): approx. 90 min
"CARPE DIEM (SEGUNDA PARTE)"




SEASON FINALE


Friday, December 10, 2010

Husband Has Dry Heaves In The Morning

Decision trees on large files (update)

In a rather old post ("Processing large volumes - Software comparison", September 2008), I compared the behavior of several programs when building decision trees on a relatively large file.

I described, among other things, the behavior of Tanagra 1.4.27, released in August 2008. Since then, my development machine has changed, Tanagra itself has changed (we are now at version 1.4.37), and Sipina has also been modified (version 3.5), with the introduction of multithreading for certain tree induction techniques. I thought it was time to review the performance by repeating the experiments under the same conditions.

For Sipina and Tanagra, the only programs I re-analyzed in this new study, the improvement in processing time is obvious. We must then discern what is attributable to the change of machine and what comes from changes in the implementations. We suggest some leads in our document.

The new results were added in the last section (Section 5) of the PDF.

Link: fr_Tanagra_Perfs_Comp_Decision_Tree.pdf