White Paper
Data Mining with XpertRule® Miner
A White Paper by Attar Software
Organisations are increasingly storing large amounts of data generated by their operating activities. Such historical data has buried within it patterns relating to the effectiveness of the various business processes. Data mining can discover such patterns in data and is now considered a catalyst for enhancing business processes through avoiding failure patterns and exploiting success patterns.
The potential for discovering knowledge buried in data has created the need for better management of corporate historic data. This has led to the concept of data warehousing, whereby operational data is maintained in a database dedicated to providing business users with online data for business analysis. A data warehouse can be a large corporate database, a departmental database (data mart) or a local database on a single client PC. The quality of knowledge that can be discovered from data is not dependent on the scale and architecture of the data warehouse. Quality is dependent on having the right data and the appropriate data mining tools and development methodology.
The business benefits of data mining have created a scramble by software suppliers to position their products as data mining tools. Anything from simple query and reporting products to the most advanced pattern discovery products has been put forward as a "data mining" tool. This has caused confusion among business users as to what data mining actually means. There are three technologies for the discovery of patterns in data:
Discovering patterns in data (also known as Knowledge Discovery in Databases) combines all of the above technologies, since it requires hypothesis, exploration and automatic discovery. It follows that the above technologies are complementary. In addition to supporting automatic pattern generation, XpertRule Miner also supports the ability to query/report and to visualise/explore the data in conjunction with the discovered patterns.
Important considerations when deploying Data Mining
Data mining is emerging as a mature technology which is being incorporated
into mainstream business applications. Data Mining has evolved beyond the
point where the algorithms are the main criteria for assessing the technology.
The important considerations when deploying data mining in an organisation
are:
Graphical Support for a Data Mining Process
The effectiveness of data mining as a business intelligence tool has been
demonstrated with a large number of successful applications. However, in order
to give data mining a wider appeal it has become apparent that a methodology
or process is required to allow non-specialists in data mining to achieve the
same degree of success as seasoned practitioners. Such a systematic and repeatable
process will allow data mining to be successfully deployed by many people
across organisations. There are a number of initiatives and projects to develop
such a process, two of which are partly funded by the European Commission.
Attar Software has been involved directly in one of these (CRITIKAL)
and is a member of the Special Interest Group set up in conjunction with the
second (CRISP-DM). It is reassuring to see a common data mining process (methodology)
starting to emerge. There is broad agreement on the main tasks within such
a process, which are data preparation, data exploration, pattern discovery,
pattern validation and pattern deployment.
XpertRule Miner provides a graphical environment for supporting all the stages of the data mining process. The click, drag-and-drop environment allows non-programmers to carry out complex data preparation, mining and deployment processes.

Data Sources
XpertRule Miner uses data drivers known as CAF servers to read/write to data
sources. The standard ODBC CAF server will support all ODBC-compliant data
sources. The open architecture of the CAF drivers allows the development of
additional CAFs using the API of non-ODBC data sources. CAFs for client-server
architectures are also available, for example the TCP/IP STUB CAF.
Data Preparation & Transformation
It is now accepted by most data mining practitioners that between 50% and 80%
of the total life cycle of a data mining project can be taken up by the data
preparation stage. The objectives of this stage are to cleanse the data and
to transform it into a format suitable for the application of pattern discovery
techniques.
XpertRule Miner allows non-programmers to carry out complex data transformations using an intuitive drag-and-drop graphical interface. It can process data tables with millions of records. The data transformation operations supported are:

Pattern Discovery
In order to address industry-wide data mining needs, XpertRule Miner supports
a basket of knowledge discovery techniques:

Tree Induction: This is goal-driven discovery and is the most widely used technique. It involves the induction of patterns (trees) relating to a business event (goal), such as mortgage arrears, customer attrition, energy consumption, insurance claims, etc.
Interactive/incremental Data Mining: This combines automatic tree induction and manual tree
construction. It enables the business user to develop tree patterns in collaboration
with the induction algorithm. At every node (branch) in the tree, XpertRule
Miner shows the importance of the various attributes at that point. The user
is given the opportunity to impart their background business knowledge and
influence the choice of attribute splits while respecting the information
evidence provided by Miner.

Discovering Association Rules: This is the discovery of associations
between business events. For example, which items are purchased together in
a supermarket (basket analysis), which product options are taken up together,
which faults occur together, etc. XpertRule Miner supports the discovery of
association rules and frequent item sets from transaction data of items or
events.
Discovering Clusters in data: This is the discovery of natural clusters or segmentation in data. An example would be segmenting a mortgage portfolio. XpertRule Miner generates clusters in 'case' (attribute-based) data by discovering sets of attribute values that are frequently associated with each other.
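To make the idea of association rules and frequent item sets more concrete, the following is a minimal sketch in Python. It is not XpertRule Miner's algorithm or API; the function names, thresholds and sample baskets are all hypothetical. It counts how often pairs of items occur together in transaction data, keeps the pairs that meet a minimum support, and derives rules of the form "if A then B" with their confidence.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Count item pairs and keep those whose support meets min_support.

    Support of a pair = fraction of transactions containing both items.
    Returns a dict mapping (item_a, item_b) -> co-occurrence count.
    """
    n = len(transactions)
    counts = Counter()
    for basket in transactions:
        # sorted(set(...)) removes duplicates and gives each pair one canonical order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: c for pair, c in counts.items() if c / n >= min_support}


def rules_from_pairs(transactions, pair_counts, min_confidence):
    """Derive rules lhs -> rhs from frequent pairs.

    Confidence of lhs -> rhs = count(lhs and rhs) / count(lhs).
    Returns a list of (lhs, rhs, support, confidence) tuples.
    """
    n = len(transactions)
    item_counts = Counter(item for basket in transactions for item in set(basket))
    rules = []
    for (a, b), pair_count in pair_counts.items():
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_count / n, confidence))
    return rules


# Hypothetical supermarket baskets (illustrative data only)
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
]

pairs = frequent_pairs(baskets, min_support=0.5)
rules = rules_from_pairs(baskets, pairs, min_confidence=0.6)

for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

The sketch is restricted to pairs to keep it short. Item sets of larger size would need a level-wise search, which is omitted here.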

Pattern Exploration and Validation
Data visualisation and exploration play an important role throughout the
data mining process. During the tree induction process, XpertRule Miner allows
user-defined reports and data graphs to be updated dynamically as the user
is exploring the various nodes and leaves (profiles) of the discovered tree.
In addition to giving the user a method of validating the accuracy and meaning
of tree patterns, the pattern exploration process helps the user obtain a
better understanding of the patterns being discovered and their implications.
XpertRule Miner supports a number of tree exploration reports: field statistics,
frequency distribution, field propensity/value across profiles and "gain
or lift" graphs.
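As an illustration of what a "gain or lift" graph summarises, the sketch below computes cumulative gain and lift across the profiles (leaves) of a tree. It is a generic calculation under stated assumptions, not XpertRule Miner's report format: the function name, the tuple layout and the sample figures are hypothetical. Profiles are ranked by their hit rate for the goal event, and for each cut-off the sketch reports the share of records covered, the share of all hits captured (gain) and the ratio of the two (lift).

```python
def gain_and_lift(profiles):
    """Compute cumulative gain and lift over tree profiles.

    profiles: list of (name, n_records, n_hits), where n_hits is the number
    of records in the profile for which the goal event occurred.
    Assumes every profile has at least one record and that there is at
    least one hit overall.
    Returns a list of (name, cumulative_population, cumulative_gain, lift),
    ordered from the highest hit rate to the lowest.
    """
    total_records = sum(n for _, n, _ in profiles)
    total_hits = sum(h for _, _, h in profiles)

    # Rank profiles by hit rate, best first
    ranked = sorted(profiles, key=lambda p: p[2] / p[1], reverse=True)

    rows = []
    seen_records = 0
    seen_hits = 0
    for name, n, hits in ranked:
        seen_records += n
        seen_hits += hits
        population = seen_records / total_records  # share of records covered so far
        gain = seen_hits / total_hits              # share of all hits captured so far
        rows.append((name, population, gain, gain / population))
    return rows


# Hypothetical profiles: (name, records, records where the goal event occurred)
profiles = [("A", 100, 10), ("B", 200, 80), ("C", 700, 10)]

for name, population, gain, lift in gain_and_lift(profiles):
    print(f"{name}: population={population:.0%}, gain={gain:.0%}, lift={lift:.2f}")
```

A lift well above 1 for the first few profiles indicates that targeting those profiles captures a disproportionate share of the goal events. The lift always falls back to 1 once every profile is included.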

Pattern Deployment
Patterns discovered using data mining can be deployed in a number of ways
to address the relevant business requirements. XpertRule Miner supports a
number of deployment strategies:
Connectivity, scalability and performance
The data mining tools available today fall into one of two distinct architectures:
XpertRule Miner resolves all the problems associated with both client- and workstation-based data mining by supporting a multi-tier client-server architecture. This is made possible by engineering the data mining algorithms of Miner to be multi-tier, consisting of Contingency And Frequency (CAF) servers, which summarise the data, and ProfilerX clients, which generate and display patterns interactively. The advantages of this architecture are:
Operating requirements
XpertRule® Miner for Client based data mining
XpertRule® Miner for multi-tier data mining
Client
Middle tier
Server
Copyright © 2002 Attar Software Limited