White Paper
Data Mining with XpertRule® Miner
A White Paper by Attar Software



Organisations are increasingly storing large amounts of data generated by their operating activities. Such historical data has buried within it patterns relating to the effectiveness of the various business processes. Data mining can discover such patterns in data and is now considered a catalyst for enhancing business processes through avoiding failure patterns and exploiting success patterns.

The potential for discovering knowledge buried in data has created the need for better management of corporate historic data. This has led to the concept of data warehousing, whereby operational data is maintained in a database dedicated to providing business users with online data for business analysis. A data warehouse can be a large corporate database, a departmental database (data mart) or a local database on a single client PC. The quality of knowledge that can be discovered from data is not dependent on the scale and architecture of the data warehouse. Quality is dependent on having the right data and the appropriate data mining tools and development methodology.

The business benefits of data mining have created a scramble by software suppliers to position their products as data mining tools. Anything from simple query and reporting products to the most advanced pattern discovery products has been put forward as a "data mining" tool. This has caused confusion among business users as to what data mining actually means. There are three technologies for the discovery of patterns in data:

Discovering patterns in data (also known as Knowledge Discovery in Databases) combines all of the above technologies, since it requires hypothesis, exploration and automatic discovery. It follows that these technologies are complementary. In addition to automatic pattern generation, XpertRule Miner supports querying and reporting on the data, and visualising and exploring it, in conjunction with the discovered patterns.


Important considerations when deploying Data Mining
Data mining is emerging as a mature technology which is being incorporated into mainstream business applications. It has evolved beyond the point where the algorithms are the main criterion for assessing the technology. The important considerations when deploying data mining in an organisation are:


Graphical Support for a Data Mining Process
The effectiveness of data mining as a business intelligence tool has been demonstrated by a large number of successful applications. However, to give data mining a wider appeal, it has become apparent that a methodology or process is required to allow non-specialists to achieve the same degree of success as seasoned practitioners. Such a systematic and repeatable process will allow data mining to be successfully deployed by many people across organisations. There are a number of initiatives and projects to develop such a process, two of which are partly funded by the European Commission. Attar Software has been involved directly in one of these (CRITIKAL) and is a member of the Special Interest Group set up in conjunction with the second (CRISP-DM). It is reassuring to see a common data mining process (methodology) starting to emerge. There is broad agreement on the main tasks within such a process: data preparation, data exploration, pattern discovery, pattern validation and pattern deployment.

XpertRule Miner provides a graphical environment that supports all the stages of the data mining process. The click, drag and drop environment allows non-programmers to carry out complex data preparation, mining and deployment processes.


Data Sources
XpertRule Miner uses data drivers known as CAF servers to read from and write to data sources. The standard ODBC CAF server supports all ODBC-compliant data sources. The open architecture of the CAF drivers allows additional CAFs to be developed using the API of non-ODBC data sources. CAFs for client-server architectures are also available, for example the TCP/IP STUB CAF.


Data Preparation & Transformation
It is now accepted by most data mining practitioners that between 50% and 80% of the total life cycle of a data mining project can be taken up by the data preparation stage. The objective of this stage is to cleanse the data and to transform it into a format suitable for the application of pattern discovery techniques.

XpertRule Miner allows non-programmers to carry out complex data transformations using an intuitive drag and drop graphical interface. It can process data tables with millions of records. The data transformation operations supported are:


Pattern Discovery
In order to address industry wide data mining needs, XpertRule Miner supports a basket of knowledge discovery techniques:

Tree Induction: This is goal-driven discovery and is the most widely used technique. It involves the induction of patterns (trees) relating to a business event (goal), such as mortgage arrears, customer attrition, energy consumption, insurance claims, etc.

Interactive/incremental Data Mining: This combines automatic tree induction and manual tree construction. It enables the business user to develop tree patterns in collaboration with the induction algorithm. At every node (branch) in the tree, XpertRule Miner shows the importance of the various attributes at that point. The user is given the opportunity to impart their background business knowledge and influence the choice of attribute splits while respecting the information evidence provided by Miner.
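As an illustration of how the importance of attributes at a tree node might be measured, the following minimal sketch ranks candidate attributes by information gain (the reduction in entropy of the goal after a split). This is a common measure in tree induction generally; the paper does not state which measure XpertRule Miner uses, so the measure, the function names and the mortgage records below are all assumptions made for the example.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of goal values."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(records, attribute, goal):
    """Reduction in entropy of `goal` obtained by splitting on `attribute`."""
    base = entropy([r[goal] for r in records])
    groups = {}
    for r in records:
        groups.setdefault(r[attribute], []).append(r[goal])
    remainder = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return base - remainder

def rank_attributes(records, attributes, goal):
    """Attributes sorted from most to least informative about the goal."""
    return sorted(attributes,
                  key=lambda a: information_gain(records, a, goal),
                  reverse=True)

# Hypothetical records reaching one node of the tree (illustrative values only).
records = [
    {"employment": "salaried", "region": "north", "arrears": "no"},
    {"employment": "salaried", "region": "south", "arrears": "no"},
    {"employment": "self-employed", "region": "north", "arrears": "yes"},
    {"employment": "self-employed", "region": "south", "arrears": "yes"},
]
ranking = rank_attributes(records, ["region", "employment"], "arrears")
```

An automatic induction step would simply split on the first attribute in `ranking`. In an interactive session the user could instead choose a lower-ranked attribute that makes more business sense, with the gain figures showing how much evidence is given up by doing so.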


Discovering Association Rules: This is the discovery of associations between business events. For example, which items are purchased together in a supermarket (basket analysis), which product options are taken up together, which faults occur together, etc. XpertRule Miner supports the discovery of association rules and frequent item sets from transaction data of items or events.
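The idea of frequent item sets and association rules can be sketched in a few lines. The example below counts item pairs across transactions, keeps those that meet a minimum support, and derives rules that meet a minimum confidence. It is a simplified illustration limited to pairs, not a description of the algorithm inside XpertRule Miner; the function names, thresholds and baskets are assumptions.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support):
    """Item pairs whose support (share of transactions containing both) meets the threshold."""
    counts = Counter()
    for basket in transactions:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

def association_rules(transactions, min_support, min_confidence):
    """Rules as (antecedent, consequent, support, confidence) built from frequent pairs."""
    n = len(transactions)
    item_counts = Counter(item for basket in transactions for item in set(basket))
    found = []
    for (a, b), support in frequent_pairs(transactions, min_support).items():
        for lhs, rhs in ((a, b), (b, a)):
            # confidence = transactions with both items / transactions with the antecedent
            confidence = support * n / item_counts[lhs]
            if confidence >= min_confidence:
                found.append((lhs, rhs, support, confidence))
    return found

# Hypothetical supermarket baskets.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["milk"],
]
result = association_rules(baskets, min_support=0.5, min_confidence=0.9)
```

With these baskets the only rule that passes both thresholds is "butter implies bread": the pair appears in half of the transactions, and every basket containing butter also contains bread.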

Discovering Clusters in data: This is the discovery of natural clusters or segmentation in data. An example would be segmenting a mortgage portfolio. XpertRule Miner generates clusters in 'case' (attribute based) data by discovering sets of attribute values that are frequently associated with each other.


Pattern Exploration and Validation
Data visualisation and exploration play an important role throughout the data mining process. During the tree induction process, XpertRule Miner allows user-defined reports and data graphs to be updated dynamically as the user explores the various nodes and leaves (profiles) of the discovered tree. In addition to giving the user a method of validating the accuracy and meaning of tree patterns, the pattern exploration process helps the user obtain a better understanding of the patterns being discovered and their implications. XpertRule Miner supports a number of tree exploration reports: field statistics, frequency distribution, field propensity/value across profiles and "gain or lift" graphs.
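For readers unfamiliar with gain and lift graphs, the sketch below computes the figures behind one, using the usual definition: cases are ranked by predicted propensity, and at each cut-off the share of positive cases captured is compared with the share of cases examined. How XpertRule Miner builds its own graphs is not described here, so the function name, the bucket count and the scored cases are assumptions.

```python
def cumulative_gains(scored_cases, buckets=4):
    """Return one (share_of_cases, share_of_positives, lift) row per cut-off.

    scored_cases: list of (score, is_positive) pairs, where a higher score
    means the case is predicted to be more likely positive.
    """
    ranked = sorted(scored_cases, key=lambda case: case[0], reverse=True)
    total = len(ranked)
    total_positive = sum(1 for _, positive in ranked if positive)
    if total < buckets or total_positive == 0:
        raise ValueError("need at least `buckets` cases and one positive case")
    rows = []
    for b in range(1, buckets + 1):
        cutoff = total * b // buckets
        captured = sum(1 for _, positive in ranked[:cutoff] if positive)
        share_cases = cutoff / total
        share_positives = captured / total_positive
        # Lift above 1 means the ranking beats random selection at this cut-off.
        rows.append((share_cases, share_positives, share_positives / share_cases))
    return rows

# Hypothetical scored cases: (propensity from the tree profile, actual outcome).
cases = [(0.9, True), (0.8, True), (0.7, False), (0.6, True),
         (0.4, False), (0.3, False), (0.2, True), (0.1, False)]
table = cumulative_gains(cases, buckets=4)
```

Here the top quarter of cases captures half of the positives, a lift of 2.0, and the lift falls to 1.0 once all cases are included, as it always must.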


Pattern Deployment
Patterns discovered using data mining can be deployed in a number of ways to address the relevant business requirements. XpertRule Miner supports a number of deployment strategies:


Connectivity, scalability and performance
The data mining tools available today fall into one of two distinct architectures:

XpertRule Miner resolves all the problems associated with both client and workstation based data mining by supporting a multi-tier client-server architecture. This is made possible by engineering the data mining algorithms of Miner to be multi-tier, consisting of Contingency And Frequency (CAF) servers, which summarise the data, and ProfilerX clients, which generate and display patterns interactively. The advantages of this architecture are:
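The split of work between a summarising server and a pattern-generating client can be illustrated with a minimal sketch. The server side reduces raw rows to a contingency table of counts, and the client side works from those counts alone, so the raw data never has to leave the server. The function names and data are hypothetical and are not the CAF or ProfilerX API.

```python
from collections import Counter, defaultdict

def summarise(rows, attribute, goal):
    """Server side: reduce raw rows to {attribute value: {goal value: count}}."""
    table = defaultdict(Counter)
    for row in rows:
        table[row[attribute]][row[goal]] += 1
    return {value: dict(counts) for value, counts in table.items()}

def propensities(table, target):
    """Client side: share of cases with goal == target, computed from counts only."""
    return {value: counts.get(target, 0) / sum(counts.values())
            for value, counts in table.items()}

# Hypothetical customer rows held on the server.
rows = [
    {"tenure": "short", "churned": "yes"},
    {"tenure": "short", "churned": "yes"},
    {"tenure": "short", "churned": "no"},
    {"tenure": "long", "churned": "no"},
]
table = summarise(rows, "tenure", "churned")   # small summary sent to the client
rates = propensities(table, "yes")             # pattern figures derived on the client
```

The summary grows with the number of distinct attribute and goal values rather than with the number of rows, which is why a design of this kind can keep network traffic small even for tables with millions of records.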


Operating requirements

XpertRule® Miner for Client based data mining


XpertRule® Miner for multi-tier data mining

Client

Middle tier

Server


Copyright © 2002 Attar Software Limited
