Monday, August 30, 2010

Sipina / Excel Connection via OLE [XL-SIPINA]

The connection between data mining software and Excel (and spreadsheets more generally) is a major issue. We have addressed it repeatedly in our tutorials. Over time, the solution based on add-ins has prevailed for both SIPINA and TANAGRA. It is simple, reliable and efficient. It does not require developing specific versions: the connection with Excel is simply an additional feature of the standard distribution.

Before arriving at this solution, we explored several other avenues. In this tutorial, we present the XL-SIPINA solution, based on Microsoft's OLE technology. Unlike the add-in approach, this version of SIPINA embeds Excel inside the data mining software. The setup works rather well. Nevertheless, it was eventually abandoned for two reasons: (1) we were required to develop and compile special versions that only work if Excel is installed on the user's machine; (2) the transfer time from the Excel object to SIPINA via OLE becomes prohibitive as the size of the database increases.
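To give a feel for the second reason, here is a rough sketch of how the cost of a cell-by-cell transfer grows with the size of the dataset. It assumes one cross-process call per cell, which is only a guess about how such a transfer might be organized, and the per-cell cost is a hypothetical figure, not a measurement of XL-SIPINA. The function name and parameters are made up for illustration.

```python
def ole_transfer_estimate(n_rows, n_cols, microseconds_per_cell):
    """Estimate the cost of copying a table one cell at a time.

    Returns (number of calls, total time in seconds).
    Assumption: each cell costs one call across the process boundary.
    """
    n_calls = n_rows * n_cols
    total_seconds = n_calls * microseconds_per_cell / 1_000_000
    return n_calls, total_seconds


# Hypothetical per-cell cost, chosen only to show the trend.
PER_CELL_US = 500

for n_rows in (1_000, 10_000, 100_000):
    calls, seconds = ole_transfer_estimate(n_rows, 22, PER_CELL_US)
    print(f"{n_rows:>7} rows x 22 columns: {calls:>9} calls, about {seconds:.0f} s")
```

Whatever the real per-cell cost is, the number of calls grows with rows times columns, so a transfer that is unnoticeable on a small table can become very slow on a large one.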

XL-SIPINA should therefore be taken as an exercise in style. There is always a bit of nostalgia when I look back on avenues that I explored and finally abandoned. Perhaps, too, I did not push things as far as I could have.

One last remark. The original application was developed with Office 97. I note that it is still relevant today: it works fine with Office 2010.

Keywords: excel, spreadsheet, SIPINA, xls, xlsx, xl-sipina, decision trees
Software: XL-SIPINA
Tutorial: fr_xls_sipina.pdf
Data: cars

Friday, August 27, 2010

The Tanagra add-in for Excel 2007 and 2010

The "tanagra.xla" add-in contributes greatly to the spread of the Tanagra software. The principle is simple: it adds a Tanagra menu to Excel, so the user can launch statistical computations without leaving the spreadsheet. Simple as it is, this feature makes the data miner's work much easier. The spreadsheet is one of the most widely used tools for data preparation (see KDNuggets Polls: Tools / Languages for Data Cleaning - 2008). By integrating the data mining software into this environment, the practitioner avoids repetitive and tedious manipulations: importing, exporting, checking format compatibility, etc.

Installing the add-in under Office XP (the procedure is valid from Office 97 to Office 2003) is described in one of our tutorials. That procedure no longer applies to Office 2007 and Office 2010, because the Excel menus were reorganized. Yet the macro itself still works. It would be a shame if users could not take advantage of it.

In this tutorial, we detail the steps for integrating the Tanagra macro into the newer versions of Excel. We focus on Office 2007 first, then show that the procedure is also valid for Office 2010. The move to the newer versions of Excel is far from insignificant. Compared with previous versions, they can handle a much larger number of rows and columns: we can process up to 1,048,575 observations (the first row holds the variable names) and 16,384 variables.

For our part, we process a dataset of 100,000 observations and 22 variables. It is a version of the "waveform" file, well known to computer scientists. Note that, because of its number of rows, this file cannot be handled by earlier versions of Excel.
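The size argument can be checked with a few lines of Python. This is only an illustrative sketch: the function and dictionary names are made up, and the limits for the older .xls format (65,536 rows and 256 columns) come from Excel's published specifications rather than from this post. One row is reserved for the variable names, as in the text.

```python
# Worksheet capacity by file format: (rows, columns).
SHEET_LIMITS = {
    "xls": (65_536, 256),         # Excel 97 to 2003
    "xlsx": (1_048_576, 16_384),  # Excel 2007 and 2010
}


def fits_in_sheet(n_observations, n_variables, file_format):
    """Return True if the dataset fits in one worksheet.

    The first row is reserved for the variable names, so a dataset
    needs n_observations + 1 rows.
    """
    max_rows, max_cols = SHEET_LIMITS[file_format]
    return n_observations + 1 <= max_rows and n_variables <= max_cols


# The dataset used in this tutorial: 100,000 observations, 22 variables.
for fmt in ("xls", "xlsx"):
    print(fmt, fits_in_sheet(100_000, 22, fmt))
```

With these limits, the 100,000-row file exceeds the capacity of the older format but fits easily in a worksheet of Excel 2007 or 2010.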

The procedure described in this document also applies to the add-in associated with the SIPINA software (sipina.xla).

Keywords: data import, excel file, add-in, add-on, xls, xlsx
Components: VIEW DATASET
Tutorial: fr_Tanagra_Add_In_Excel_2007_2010.pdf
Data: wave100k.xlsx
References:
Tanagra, "Import XLS (Excel) - Add-."
Tanagra, "Open Office Calc Connection"
Tanagra, "Open Office Calc Connection on Linux"
Tanagra, "Excel Connection - Sipina"