White Paper
The Wide Scale Deployment of Active Data Mining Solutions

A White Paper by Attar Software



Data Mining is a process
There is increased interest in a process or methodology for data mining. It is argued that such a formalised process will widen the exploitation of data mining as an enabling technology for solving business problems. It will allow people with varying expertise in data mining and from different business sectors to carry out successful data mining projects with a high degree of consistency.
There are a number of initiatives for the development of a formal, documented data mining process in both Europe and North America. It is reassuring to the data mining community that the processes emerging from all of these initiatives are largely similar. There is widespread agreement on the main steps (stages) involved in such a process, and any differences relate only to the detailed tasks within each stage. The major stages of a data mining process, as used in the rest of this paper, are: goal definition, data selection, data preparation and transformation, data exploration, pattern discovery and pattern deployment.


The wide scale deployment of data mining solutions
A repeatable data mining process will help ensure the success of a data mining project. However, a successful data mining project also needs developers with the following skills:

The above skills may be combined in one person or may require more than one person. However, even in the largest of organisations there is a relatively small number of such specialists/teams with the above skills. To maximise the returns on data mining, the role of these specialists in a data mining project should be to prepare a specific 'Data Mining Business Scenario'. Once such a scenario is prepared it can be deployed on a much wider scale to a large user community - inside or outside the organisation. A Data Mining Business Scenario can also be called a Data Mining Solution.

Preparing a Data Mining Business Scenario involves all the steps of the data mining process: goal definition, data selection, data preparation and transformation, data exploration, pattern discovery and pattern deployment. The business scenario can then be deployed to a wide user base. As an example, consider the business scenario of mortgage arrears in the portfolio of a financial institution:

Goal definition

Identify the profiles of mortgage accounts with a high or low propensity to default on mortgage payments. Define default as 3 or more months in arrears. The patterns discovered will be issued to branch managers to help them with the processing of mortgage applications. The patterns will also be issued to marketing managers to help them in their targeted marketing and in the definition of new products/mortgage packages. Finally, the patterns will be used by the Credit Manager to monitor the changes in the mortgage portfolio over time.
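
The definition of default given above can be written down as a simple labelling rule. The following is a minimal sketch only; the record type and the field names account_id and months_in_arrears are hypothetical and are not taken from any actual mortgage system.

```python
from dataclasses import dataclass

# Threshold taken from the goal definition: 3 or more months in arrears.
DEFAULT_THRESHOLD_MONTHS = 3


@dataclass
class MortgageAccount:
    account_id: str          # hypothetical identifier
    months_in_arrears: int   # hypothetical field name


def is_default(account: MortgageAccount) -> bool:
    """Return True when the account meets the stated definition of default."""
    return account.months_in_arrears >= DEFAULT_THRESHOLD_MONTHS


# Made-up accounts, for illustration only.
accounts = [
    MortgageAccount("A-001", 0),
    MortgageAccount("A-002", 3),
    MortgageAccount("A-003", 5),
]
labels = {a.account_id: is_default(a) for a in accounts}
```

Fixing the outcome in this way before any mining starts means that every pattern discovered later refers to the same definition of default.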

Data selection

Identify the source data as the mortgage applications database and the monthly payments database. Furthermore, focus on historic mortgage applications, for example those made in 1996 and 1997, and all payment records from 1996 until the present date.
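
The selection described above might be sketched as two simple filters. The rows and field names below are made up for illustration, and restricting the payment records to the selected accounts is an assumption added here, not something stated in the scenario.

```python
from datetime import date

# Made-up rows standing in for the two source databases.
applications = [
    {"account_id": "A-001", "application_date": date(1995, 11, 3)},
    {"account_id": "A-002", "application_date": date(1996, 5, 20)},
    {"account_id": "A-003", "application_date": date(1997, 12, 31)},
    {"account_id": "A-004", "application_date": date(1998, 1, 2)},
]
payments = [
    {"account_id": "A-002", "payment_date": date(1996, 6, 1)},
    {"account_id": "A-001", "payment_date": date(1995, 12, 1)},
    {"account_id": "A-003", "payment_date": date(1998, 2, 1)},
    {"account_id": "A-004", "payment_date": date(1998, 3, 1)},
]


def select_applications(rows, first_year=1996, last_year=1997):
    """Keep applications made in the chosen years (inclusive)."""
    return [r for r in rows
            if first_year <= r["application_date"].year <= last_year]


def select_payments(rows, account_ids, start=date(1996, 1, 1)):
    """Keep payments on or after the start date.

    Assumption: only payments for the selected accounts are of interest.
    """
    return [r for r in rows
            if r["account_id"] in account_ids and r["payment_date"] >= start]


selected_apps = select_applications(applications)
selected_ids = {r["account_id"] for r in selected_apps}
selected_payments = select_payments(payments, selected_ids)
```

In practice the same selection would normally be expressed as queries against the source databases rather than as in-memory filters.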

Data Preparation

Data Exploration

Pattern Discovery

Pattern Deployment


The Deployment of Active Data Mining Solutions
The methods of deployment of patterns listed in the previous section can be described as passive deployment. This is because the solutions deployed can only utilise the patterns previously discovered. In Active Mining Deployment, the users are empowered to discover and explore new patterns within the business scenario (solution) delivered to them. For example, in the area of mortgage arrears described above, the business scenario can be prepared as described, but the Credit Manager and the users in the branches and marketing department can be given the ability to:

The active deployment of data mining turns data mining into vertical business applications for wide scale use by business people who would otherwise not have the skills to develop a data mining process.


Technologies for the Deployment of Active Data Mining Solutions
There are a number of software technologies required in order to realise the benefits of the active deployment of data mining. These data mining software components allow the creation of vertical applications with embedded active data mining for use by business users.


XpertRule Miner for the Active Deployment of Data Mining
XpertRule Miner is designed from the ground up to enable the active deployment of data mining. It achieves this through the following features:

The ActiveX tree induction client allows data mining to be embedded within other applications or deployed over the Internet/Intranet. The CAF (Contingency AND Frequency) servers can be deployed on the client, middle tier or server, and are scalable and highly performant. These CAF servers can exploit the high performance available from parallel processing database servers by firing intelligent query streams at the server. Alternatively, the CAF server can cache data from any ODBC-compliant database into a highly tokenised format which is optimised for high performance mining on very large data tables. This caching technique allows an ODBC data source of millions of rows to be mined in minutes on average specification machines (e.g. a 300 MHz Pentium with 64 MB of RAM).
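
To illustrate the general idea behind a tokenised cache and contingency/frequency counting, the following sketch dictionary-encodes two columns into small integers and then counts how often each attribute value occurs with each outcome. It is not XpertRule Miner's implementation or file format; the function names and the sample data are hypothetical.

```python
from collections import Counter


def tokenise(values):
    """Dictionary-encode a column: map each distinct value to a small integer."""
    codes = {}
    tokens = []
    for value in values:
        # A value seen for the first time gets the next free integer code.
        tokens.append(codes.setdefault(value, len(codes)))
    return tokens, codes


def contingency(attribute_tokens, outcome_tokens):
    """Count how often each (attribute token, outcome token) pair occurs."""
    return Counter(zip(attribute_tokens, outcome_tokens))


# Made-up columns, for illustration only.
employment = ["salaried", "self-employed", "salaried", "salaried", "self-employed"]
defaulted = ["no", "yes", "no", "yes", "yes"]

emp_tokens, emp_codes = tokenise(employment)
out_tokens, out_codes = tokenise(defaulted)
table = contingency(emp_tokens, out_tokens)
```

Counts of this kind are what a tree induction algorithm needs when it evaluates a candidate split, and working on compact integer tokens instead of the original strings is one reason a cache of this sort can be scanned quickly.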


Copyright © 2002 Attar Software Limited
