RapidMiner (formerly YALE, "Yet Another Learning Environment") is an open source environment for machine learning, data mining, text mining, predictive analytics, and business analytics. Data mining processes in RapidMiner are built from arbitrarily nestable operators and described in XML files, which are created with RapidMiner's graphical user interface. RapidMiner is used for research, education, training, rapid prototyping, application development, and industrial deployments. In a poll by the data-mining newspaper KDnuggets, RapidMiner ranked second among data mining/analytic tools used for real projects in 2009[1] and first in 2010.[2]
The RapidMiner open source project, formerly named YALE, was initiated by Ralf Klinkenberg, Ingo Mierswa, and Simon Fischer. The initial version was developed by the Artificial Intelligence Unit of the University of Dortmund, starting in 2001. It is distributed under the AGPL license and has been hosted by SourceForge since 2004. In 2006, Ingo Mierswa and Ralf Klinkenberg founded the company Rapid-I, which now supports the further development of RapidMiner as its main contributor. In addition, more than 30 developers worldwide contribute improvements and extensions to the software.
RapidMiner provides more than 600 operators for all main data mining and machine learning procedures, including import, export, data loading and transformation (ETL), data preprocessing and visualization, modelling, evaluation, and deployment. RapidMiner is written in the Java programming language and therefore runs on all popular operating systems. It also integrates learning schemes and attribute evaluators of the Weka machine learning environment and statistical modelling schemes of the R-Project. According to SourceForge, RapidMiner is used in more than 60 countries world-wide.
The Community Edition of RapidMiner (formerly "Yale") is an open source toolkit for data mining. Its strengths lie partly in the ease with which analytical steps can be defined (especially when compared with R), and in generating graphs more easily[citation needed] than, for example, R, or more effectively[citation needed] than MS Excel.
RapidMiner is well suited[citation needed] for analyzing data generated by high-throughput instruments, e.g., genotyping, proteomics, and mass spectrometry.
Example applications:
Notable selected features of RapidMiner:
RapidMiner provides a GUI to design an analytical pipeline (the "operator tree" in RapidMiner parlance). The GUI generates an XML (eXtensible Markup Language) file that defines the analytical processes the user wishes to apply to the data. This file is then read by RapidMiner to run the analyses automatically.
While these are running, the GUI can also be used to interactively control and inspect running processes.
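The nested structure of such a process file can be illustrated with a small sketch. The XML below is a made-up example: the `operator` element and its `name` and `class` attributes, as well as the operator names themselves, are assumptions chosen to mirror the description above and have not been checked against RapidMiner's actual file format. The code uses only the XML parser from the Java standard library, not RapidMiner's own API, and simply lists the operators with indentation that reflects their nesting.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class OperatorTreeWalker {

    // Hypothetical process description: element, attribute, and operator
    // names are illustrative assumptions, not RapidMiner's verified schema.
    static final String SAMPLE_PROCESS =
          "<operator name=\"Root\" class=\"Process\">"
        + "  <operator name=\"Input\" class=\"ExampleSource\"/>"
        + "  <operator name=\"Validation\" class=\"XValidation\">"
        + "    <operator name=\"Learner\" class=\"DecisionTree\"/>"
        + "  </operator>"
        + "</operator>";

    // Parses the XML and returns one line per operator, indented by depth.
    static List<String> listOperators(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        List<String> lines = new ArrayList<>();
        collect(doc.getDocumentElement(), 0, lines);
        return lines;
    }

    // Depth-first walk: record this operator, then recurse into nested ones.
    private static void collect(Element op, int depth, List<String> out) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        line.append(op.getAttribute("name"))
            .append(" (")
            .append(op.getAttribute("class"))
            .append(")");
        out.add(line.toString());
        for (Node child = op.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            // Skip whitespace text nodes and any non-operator elements.
            if (child instanceof Element
                    && "operator".equals(child.getNodeName())) {
                collect((Element) child, depth + 1, out);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        for (String line : listOperators(SAMPLE_PROCESS)) {
            System.out.println(line);
        }
    }
}
```

Running `main` prints the four operators of the sample, with the learner nested two levels below the root. A real process file would also carry parameters for each operator, which this sketch ignores.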
RapidMiner can also be called from other programs, e.g., a Perl program. The Java application programming interface ("API") provides clear interfaces for applying operators individually (i.e., without creating an operator tree), making it possible to bypass the GUI and control analytical processes directly.
Last, one can also call individual RapidMiner functions directly from the command line.
Because RapidMiner runs on Java, it can be installed on any computer on which Java runs. It can run as a GUI application or as a command-line tool on a server.
Although the core of RapidMiner is open source and is offered free of charge as a "Community Edition", there is also an "Enterprise Edition", which, according to the site, is "Community Edition + More Features + Services + Guarantees".[1] The RapidMiner source is also offered under a proprietary commercial license to allow integration into closed-source solutions.
RapidMiner's flexibility allows its use in text mining, multimedia mining, feature engineering, data stream mining and tracking of drifting concepts, development of ensemble methods, and distributed data mining. RapidMiner is used in the electronics, energy, automobile, commerce, aviation, telecommunications, banking and insurance, production, IT, market research, and pharmaceutical industries, as well as in universities and miscellaneous other organizations (e.g., sports teams, train stations, and police stations). Specific examples for each business area can be found in [2].
Some properties of RapidMiner are:
RapidMiner can be easily extended with additional plugins.
RapidMiner currently has more than 15 extensions, which extend its applicability to text mining, image processing, time series processing, web mining, statistics, visualization, semantics, parallelization of computation, automatic process design (PaREn Automatic System Construction Wizard), and other areas.
Several of the extensions can be found directly in the application, in the so-called Extension Manager. The other extensions can be downloaded from the websites of their respective developers.