Data Mining applications have refined the art of detecting variations and patterns in voluminous data sets in order to predict outcomes of interest. These characteristics and advantages have made data mining very popular among companies. It can be used effectively to increase profits, reduce unnecessary costs, understand users’ interests and much more.

What is Data Mining

Data Mining is the computer-assisted process of extracting knowledge from large amounts of data.

Data mining derives its name from Data + Mining: just as mining is done in the ground to find valuable ore, data mining is done to find valuable information in a dataset.

Data Mining tools predict customer habits, patterns and future trends, allowing businesses to increase revenue and make proactive decisions.

How Data Mining Works

Fig. 1 – Data Mining Architecture

The User Interface may be any website. A product is searched for in the Database, Data Warehouse, World Wide Web and other repositories (bottom part of Fig. 1). This means that the data searched for may be fetched from across the internet.

The data is then cleansed with the help of a parser to remove noise, errors and unwanted data. The selected data is then integrated and fetched by the Data Warehouse Server. With the help of the knowledge base and pattern evaluation, the result is returned to the interface.

Let’s take ‘Amazon’ as an example to understand this better. If a user sends a request to a User Interface (Amazon) to search for a phone within a defined price range, the system first searches its Knowledge Base (where similar information is stored) for similar requests processed earlier.

If a matching pattern is found, the result is given to the user with the help of the data-mining engine, which in turn asks the data warehouse server to fetch phones within the searched price range.

The system also searches across the internet, then cleans and integrates the data and passes the details back to the data-mining engine. It also stores the information in its knowledge base for future trend analysis. After this process, the desired result is provided to the interface.
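
The flow described above can be sketched roughly in Python. This is only an illustration under stated assumptions: the in-memory `warehouse` list, the `knowledge_base` dictionary and the functions `clean`, `integrate` and `search_phones` are hypothetical stand-ins for the components in Fig. 1, not part of any real system.

```python
# Hypothetical in-memory stand-ins for the components in Fig. 1.
knowledge_base = {}  # remembers the results of earlier requests

warehouse = [
    {"name": "Phone A", "price": 199.0},
    {"name": "Phone B", "price": 349.0},
    {"name": "Phone C", "price": None},    # noisy record: price is missing
    {"name": "phone a ", "price": 199.0},  # duplicate with messy formatting
]


def clean(records):
    """Drop records with a missing price and normalise product names."""
    cleaned = []
    for record in records:
        if record["price"] is None:
            continue
        cleaned.append({"name": record["name"].strip().title(),
                        "price": record["price"]})
    return cleaned


def integrate(records):
    """Merge duplicates so that each product name appears only once."""
    merged = {}
    for record in records:
        merged.setdefault(record["name"], record)
    return list(merged.values())


def search_phones(min_price, max_price):
    """Answer from the knowledge base if possible, else query the warehouse."""
    key = (min_price, max_price)
    if key in knowledge_base:
        return knowledge_base[key]
    candidates = integrate(clean(warehouse))
    result = [r["name"] for r in candidates
              if min_price <= r["price"] <= max_price]
    knowledge_base[key] = result  # stored for future requests
    return result


print(search_phones(100, 250))
```

A second call with the same price range is answered directly from `knowledge_base`, which mirrors the idea of reusing requests processed earlier.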

Characteristics of Data Mining

The characteristics of Data Mining are:

  • Prediction of likely outcomes
  • Focus on large datasets and databases
  • Automatic pattern predictions based on behavior analysis
  • Calculation – a new feature can be calculated from other features using any SQL expression.
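
As an illustration of the calculation characteristic, the sketch below derives a new feature with a SQL expression using Python's built-in `sqlite3` module. The `orders` table, its columns and the derived `total_spent` feature are assumptions made up for this example.

```python
import sqlite3

# Hypothetical orders table held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer TEXT, quantity INTEGER, unit_price REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("alice", 2, 10.0), ("bob", 1, 25.0), ("alice", 3, 5.0)],
)

# The derived feature total_spent is calculated from two existing
# features, quantity and unit_price, with a plain SQL expression.
rows = conn.execute(
    "SELECT customer, SUM(quantity * unit_price) AS total_spent "
    "FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(rows)
```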

Types of Data Mining

Data Mining Analysis can be divided into two basic types. They are:

  • Predictive Data Mining Analysis
  • Descriptive Data Mining Analysis

Fig. 2 – Types of Data Mining

Predictive Data Mining Analysis

As the name suggests, Predictive Data-Mining Analysis works on data that may help to project what may happen later in a business.

Predictive Data-Mining Tasks can be further divided into four types. They are:

  • Classification Analysis
  • Regression Analysis
  • Time Series Analysis
  • Prediction Analysis

Classification Analysis

It is used to fetch important and relevant information about data and metadata. It classifies each data item into the category it belongs to. Email providers are the best example of classification analysis: they use algorithms that classify a mail as legitimate or mark it as spam.
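
A minimal sketch of the spam example follows, using a small naive Bayes classifier written in plain Python. The training messages and the `classify` function are invented for illustration; real email providers use far larger data sets and more sophisticated models.

```python
import math
from collections import Counter

# Hypothetical labelled training messages.
training = [
    ("win a free prize now", "spam"),
    ("free offer click now", "spam"),
    ("meeting agenda for monday", "legitimate"),
    ("please review the project report", "legitimate"),
]

word_counts = {"spam": Counter(), "legitimate": Counter()}
label_counts = Counter()
for text, label in training:
    label_counts[label] += 1
    word_counts[label].update(text.split())

vocabulary = set(word_counts["spam"]) | set(word_counts["legitimate"])


def classify(text):
    """Return the label with the highest naive Bayes log-probability."""
    scores = {}
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / len(training))
        for word in text.split():
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


print(classify("free prize offer"))
print(classify("project meeting on monday"))
```

With this toy data, the first message scores higher as spam and the second as legitimate, because of the words each shares with the training messages.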

Regression Analysis

It tries to state the dependency between variables. It is generally used for forecasting and prediction.
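
One common way to express such a dependency is a least-squares straight line. The sketch below fits one in plain Python; the `fit_line` helper and the advertising figures are hypothetical and serve only to show the idea of forecasting from a fitted relationship.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend versus units sold.
ad_spend = [1.0, 2.0, 3.0, 4.0]
units_sold = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, units_sold)
forecast = slope * 5.0 + intercept  # predicted units sold for a spend of 5.0
print(slope, intercept, forecast)
```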

Time Series Analysis

It analyses a sequence of well-defined data points measured at consistent time intervals.
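
A simple moving average is one basic way to look at such a sequence. The sketch below smooths a hypothetical list of monthly sales figures; the `moving_average` helper and the numbers are assumptions for illustration.

```python
def moving_average(values, window):
    """Average each run of `window` consecutive observations."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical sales figures recorded once per month.
monthly_sales = [10, 12, 14, 13, 15, 17]
smoothed = moving_average(monthly_sales, 3)
print(smoothed)
```

The smoothed values rise steadily, which makes the upward trend easier to see than in the raw figures.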

Prediction Analysis

It is related to time series analysis, but the time is not bounded.

Descriptive Data Mining Analysis

Its purpose is to summarize or turn data into relevant information.

Descriptive Data-Mining Tasks can be further divided into four types. They are:

  • Clustering Analysis
  • Summarization Analysis
  • Association Rules Analysis
  • Sequence Discovery Analysis

Clustering Analysis

It is the process of identifying groups of data items that are similar to one another.

For example – customers with similar buying behavior can be grouped into clusters and matched with similar products to increase the conversion rate.
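
The customer example can be sketched with a tiny one-dimensional k-means routine. k-means is just one of several clustering techniques and is chosen here only for illustration; the `k_means_1d` function, the spend values and the starting centroids are all hypothetical.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group numbers around the nearest centroid, then recompute centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical monthly spend per customer.
monthly_spend = [12, 15, 14, 90, 95, 88]
centroids, clusters = k_means_1d(monthly_spend, centroids=[0.0, 100.0])
print(clusters)
```

Here the low spenders and the high spenders end up in separate clusters, and each group could then be shown different products.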

Summarization Analysis

It involves techniques for finding a compact description of a dataset.
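
A very small example of a compact description is a set of summary statistics. The sketch below uses Python's standard `statistics` module on a hypothetical list of order values.

```python
import statistics

# Hypothetical order values for one day.
order_values = [20.0, 35.0, 15.0, 50.0, 30.0]

# A handful of numbers that describe the whole dataset.
summary = {
    "count": len(order_values),
    "min": min(order_values),
    "max": max(order_values),
    "mean": statistics.mean(order_values),
    "median": statistics.median(order_values),
}
print(summary)
```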

Association Rules Analysis

This method helps in identifying interesting relations between different variables in large databases. The best example is the retail industry.

As a festive season approaches, retail stores stock up on chocolates, whose sales increase before any festival; this is achieved with the help of data mining.
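
Relations of this kind are commonly measured with support and confidence. The sketch below computes both for a hypothetical rule "chocolate → gift wrap"; the transactions and the helper functions `support` and `confidence` are invented for illustration.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"chocolate", "gift wrap"},
    {"chocolate", "gift wrap", "candles"},
    {"chocolate", "milk"},
    {"bread", "milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent):
    """Share of transactions with `antecedent` that also contain `consequent`."""
    return support(antecedent | consequent) / support(antecedent)


print(support({"chocolate", "gift wrap"}))
print(confidence({"chocolate"}, {"gift wrap"}))
```

In this toy data, half of all baskets contain both items, and about two thirds of the baskets with chocolate also contain gift wrap.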

Sequence Discovery Analysis

It is about finding the sequence in which activities occur.

For example – in a store, a user may often buy shaving gel before a razor. It is all about the sequence in which users buy products; based on that, the store owner can arrange the items.
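
The shaving gel example can be sketched by counting how often one item is bought before another. The purchase histories below are hypothetical, and counting ordered pairs is only one simple way to look for sequences.

```python
from collections import Counter

# Hypothetical purchase histories, each in the order the items were bought.
histories = [
    ["shaving gel", "razor", "aftershave"],
    ["shaving gel", "razor"],
    ["razor", "shaving gel"],
]

pair_counts = Counter()
for history in histories:
    seen = set()
    for i, first in enumerate(history):
        for later in history[i + 1:]:
            if (first, later) not in seen:  # count each pair once per customer
                seen.add((first, later))
                pair_counts[(first, later)] += 1

# The most frequent "bought before" pair across all customers.
print(pair_counts.most_common(1))
```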

Data Mining Application Areas

Fig. 3 – Application Areas of Data Mining

Data-Mining is used in various fields such as:

  • Telecommunications and credit card companies.
  • Insurance companies/stock exchanges – apply data-mining techniques to reduce fraud.
  • Medical applications – to predict the effectiveness of surgical procedures, medical tests or medications.
  • Retailers – data mining helps to identify which promotions and coupons to apply and which products to stock.
  • Pharmaceutical firms

Advantages of Data Mining

The advantages of Data-Mining are:

Customer Behavior and Habits

Data Mining is useful in keeping track of customer behavior and habits.

For example – if a customer is on Amazon looking for a particular offer that data mining has already predicted and saved in its database, then the customer’s habits (preference for a particular product) can easily be identified.

Trend Analysis

Identifying the trend or pattern that customers mostly follow on a particular site is one of the most common benefits data mining provides.

Marketing Campaigns

Data-Mining helps to identify customer response to certain products through surveys.

Disadvantages of Data Mining

The disadvantages of Data-Mining are:

Privacy Concern

In data-mining systems, safety and security measures are often weak. All kinds of data are captured, including messages and social media content, and are easily available, so misuse of information is possible.

Incomplete data

A data mining system can only provide results within the limits of the data it has.

Irrelevant Information

Additional, irrelevant information may be gathered along with the useful data.

Author: Nidhi Sethi, MCA

 
