UNIT-I
1.
Explain the evolution of database technology.
_
Data collection and Database creation
_
Database management systems
_
Advanced database systems
_
Data warehousing and Data Mining
_
Web-based Database systems
_
New generation of integrated information systems
2.
Explain the steps of knowledge discovery in databases.
_
Data cleaning
_
Data integration
_
Data selection
_
Data transformation
_
Data mining
_
Pattern evaluation
_
Knowledge presentation
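The seven steps above can be sketched as a chain of small functions. This is a minimal illustration only: the two toy sources, the field names, and the `min_count` threshold are assumptions made for the example, not part of the syllabus.

```python
from collections import Counter

# Two hypothetical sources with the same schema (toy data).
store_a = [{"item": "milk", "qty": 2}, {"item": "bread", "qty": None}]
store_b = [{"item": "milk", "qty": 1}, {"item": "eggs", "qty": 12},
           {"item": "milk", "qty": 3}]

def clean(rows):
    """Data cleaning: drop records that contain missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]

def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [row for src in sources for row in src]

def select(rows, fields):
    """Data selection: keep only the attributes relevant to the task."""
    return [{f: r[f] for f in fields} for r in rows]

def transform(rows):
    """Data transformation: reduce each record to the form the miner expects."""
    return [r["item"] for r in rows]

def mine(items):
    """Data mining: count how often each item occurs."""
    return Counter(items)

def evaluate(counts, min_count):
    """Pattern evaluation: keep only patterns that meet the threshold."""
    return {k: v for k, v in counts.items() if v >= min_count}

def present(patterns):
    """Knowledge presentation: render the patterns as readable lines."""
    return [f"{item}: seen {n} times" for item, n in sorted(patterns.items())]

rows = integrate(clean(store_a), clean(store_b))
patterns = evaluate(mine(transform(select(rows, ["item"]))), min_count=2)
report = present(patterns)
```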
3.
Explain the architecture of a data mining system.
_
Database, data warehouse, or other information repository
_
Database or data warehouse server
_
Knowledge base
_
Data mining engine
_
Pattern evaluation module
_
Graphical user interface
4.
Explain various tasks in data mining.
(Or)
Explain the taxonomy of data mining tasks.
_
Predictive modeling
•
Classification
•
Regression
•
Time series analysis
_
Descriptive modeling
•
Clustering
•
Summarization
•
Association rules
•
Sequence discovery
5.
Explain various techniques in data mining.
_
Statistics (or) Statistical perspectives
•
Point estimation
•
Data summarization
•
Bayesian techniques
•
Hypothesis testing
•
Correlation
_
Regression
_
Machine learning
_
Decision trees
_
Hidden Markov models
_
Artificial neural networks
_
Genetic algorithms
_
Meta learning
UNIT-II
6.
Explain the issues regarding classification and prediction.
_
Preparing the data for classification and prediction
o
Data cleaning
o
Relevance analysis
o
Data transformation
_
Comparing classification methods
o
Predictive accuracy
o
Speed
o
Robustness
o
Scalability
o
Interpretability
7.
Explain classification by decision tree induction.
_
Decision tree induction
_
Attribute selection measures
_
Tree pruning
_
Extracting classification rules from decision trees
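One common attribute selection measure is information gain. The sketch below shows how it could be computed to pick the splitting attribute at a node; the toy records and the attribute names (`outlook`, `windy`, `play`) are illustrative assumptions, and other measures (for example gain ratio or the Gini index) may equally be expected in an answer.

```python
import math
from collections import Counter

def entropy(labels):
    """Expected information (in bits) needed to classify a tuple."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy before the split minus the weighted entropy after it."""
    base = entropy([r[target] for r in rows])
    n = len(rows)
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / n * entropy(subset)
    return base - remainder

def best_attribute(rows, attributes, target):
    """Choose the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))

# Toy training set (hypothetical values).
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "yes"},
]
split_on = best_attribute(rows, ["outlook", "windy"], "play")
```

In this toy set `outlook` separates the classes perfectly while `windy` carries no information, so the tree would split on `outlook` first.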
8.
Write short notes on patterns.
_
Pattern definition
_
Objective measures
_
Subjective measures
_
Can a data mining system generate all of the interesting patterns?
_
Can a data mining system generate only interesting patterns?
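Support and confidence are two widely used objective measures of pattern interestingness for association rules. The following is a small sketch of how they could be computed; the transactions and item names are made up for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent => consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)

# Toy market-basket data (hypothetical).
transactions = [
    {"milk", "bread"},
    {"milk"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
]
s = support(transactions, {"milk", "bread"})
c = confidence(transactions, {"milk"}, {"bread"})
```

A rule would typically be reported only if both values meet user-specified minimum thresholds.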
9.
Explain mining single-dimensional Boolean association rules from transactional databases.
_
The Apriori algorithm: finding frequent itemsets using candidate generation
_
Mining frequent item sets without candidate generation
10.
Explain the Apriori algorithm.
_
Apriori property
_
Join step
_
Prune step
_
Example
_
Algorithm
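The join step, the prune step, and the level-wise loop listed above can be sketched as follows. This is a simplified illustration, not a textbook-exact implementation: the join here unions any two frequent (k-1)-itemsets that yield a k-itemset rather than using the lexicographic prefix join, and `min_support` is an absolute count. The transactions are toy data.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # Scan the database once and keep candidates that meet min_support.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    frequent = count({frozenset([i]) for t in transactions for i in t})
    result = dict(frequent)
    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset of a frequent
        # itemset must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        frequent = count(candidates)
        result.update(frequent)
        k += 1
    return result

transactions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C"},
]
frequent_itemsets = apriori(transactions, min_support=3)
```

With a minimum support count of 3, all single items and all pairs are frequent here, while {A, B, C} appears in only two transactions and is dropped.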
11.
Explain how the efficiency of Apriori is improved.
_
Hash-based technique (hashing item set counts)
_
Transaction reduction (reducing the number of transactions scanned in future iterations)
_
Partitioning (Partitioning the data to find candidate item sets)
_
Sampling (mining on a subset of the given data)
_
Dynamic item set counting (adding candidate item sets at different points during a scan)
12.
Explain mining frequent item sets without candidate generation.
_
Frequent-pattern growth (or) FP-growth
_
Frequent pattern tree (or) FP-tree
_
Algorithm
13.
Explain mining multidimensional Boolean association rules from transaction databases.
_
Multi-dimensional (or) Multilevel association rules
_
Approaches to mining Multilevel association rules
•
Using uniform minimum support for all levels
•
Using reduced minimum support at lower levels
o
Level-by-level independent
o
Level-cross filtering by single item
o
Level-cross filtering by k-item set
_
Checking for redundant Multilevel association rules
14.
Explain constraint-based association mining.
_
Knowledge type constraints
_
Data constraints
_
Dimension/level constraints
_
Interestingness constraints
_
Rule constraints
_
Metarule-guided mining of association rules
_
Mining guided by additional rule constraints
UNIT-III
15.
Explain regression in predictive modeling.
_
Regression definition
_
Linear regression
_
Multiple regression
_
Non-linear regression
_
Other regression models
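As an illustration of linear regression, the sketch below fits a straight line y = a + b·x by the method of least squares and uses it for prediction. The function names and the data points are assumptions chosen for the example.

```python
def fit_line(xs, ys):
    """Least-squares estimates of intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(model, x):
    """Predict the response for a new value of the predictor."""
    intercept, slope = model
    return intercept + slope * x

# Toy data lying exactly on y = 1 + 2x (hypothetical).
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
model = fit_line(xs, ys)
forecast = predict(model, 5)
```

Multiple regression extends the same idea to several predictor variables, and some non-linear models can be handled by transforming the variables so that the relationship becomes linear.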
16.
Explain the statistical perspective in data mining.
_
Point estimation
_
Data summarization
_
Bayesian techniques
_
Hypothesis testing
_
Regression
_
Correlation
17.
Explain Bayesian classification.
_
Bayes' theorem
_
Naïve Bayesian classification
_
Bayesian belief networks
_
Bayesian learning
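Naïve Bayesian classification applies Bayes' theorem under the assumption that attributes are conditionally independent given the class, so the score for class C is P(C) multiplied by P(x_i | C) for each attribute value x_i. A minimal sketch for categorical attributes follows; the training records are invented, and no smoothing is applied, so an unseen attribute value gives a zero score.

```python
from collections import Counter, defaultdict

def train(rows, target):
    """Collect class counts and per-class attribute-value counts."""
    class_counts = Counter(r[target] for r in rows)
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    for r in rows:
        for attr, value in r.items():
            if attr != target:
                value_counts[(attr, r[target])][value] += 1
    return class_counts, value_counts

def classify(model, sample):
    """Return the most probable class and the unnormalized score of each class."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for cls, n in class_counts.items():
        score = n / total  # prior P(C)
        for attr, value in sample.items():
            # Class-conditional probability P(value | C), unsmoothed.
            score *= value_counts[(attr, cls)][value] / n
        scores[cls] = score
    return max(scores, key=scores.get), scores

# Toy training set (hypothetical).
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
]
model = train(rows, "play")
label, scores = classify(model, {"outlook": "sunny", "windy": "no"})
```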
18.
Discuss the requirements of clustering in data mining.
_
Scalability
_
Ability to deal with different types of attributes
_
Discovery of clusters with arbitrary shape
_
Minimal requirements for domain knowledge to determine input parameters
_
Ability to deal with noisy data
_
Insensitivity to the order of input records
_
High dimensionality
_
Interpretability and usability
19.
Explain the types of data in cluster analysis.
_
Interval-scaled variables
_
Binary variables
o
Symmetric binary variables
o
Asymmetric binary variables
_
Nominal variables
_
Ordinal variables
_
Ratio-scaled variables
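For binary variables, the difference between the symmetric and asymmetric cases shows up in how dissimilarity is computed. The sketch below uses the usual 2×2 contingency counts (q: both 1, r: first only, s: second only, t: both 0); the two sample objects are hypothetical, and the asymmetric measure is undefined when q + r + s is zero.

```python
def contingency(x, y):
    """Count the four possible value combinations across two binary objects."""
    q = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    r = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    s = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    t = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    return q, r, s, t

def symmetric_dissimilarity(x, y):
    """Both states are equally important, so 0-0 matches count."""
    q, r, s, t = contingency(x, y)
    return (r + s) / (q + r + s + t)

def asymmetric_dissimilarity(x, y):
    """Only the 1 state is informative, so 0-0 matches are ignored."""
    q, r, s, t = contingency(x, y)
    return (r + s) / (q + r + s)

# Two hypothetical objects described by six binary attributes.
x = [1, 0, 1, 0, 0, 0]
y = [1, 0, 1, 0, 1, 0]
d_sym = symmetric_dissimilarity(x, y)
d_asym = asymmetric_dissimilarity(x, y)
```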
20.
Explain the partitioning method of clustering.
_
K-means clustering
_
K-medoids clustering
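A minimal sketch of k-means is given below: objects are assigned to the nearest centroid, centroids are recomputed as cluster means, and the loop stops when nothing changes. Taking the first k points as the initial centroids is a simplification made for reproducibility (random selection is more usual), and the points are toy data. K-medoids follows the same pattern but represents each cluster by an actual data object instead of the mean; it is not shown here.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(cluster):
    """Component-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

def kmeans(points, k, iterations=100):
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: dist2(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: recompute each centroid (keep the old one if a cluster is empty).
        new_centroids = [mean(c) if c else centroids[j]
                         for j, c in enumerate(clusters)]
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, clusters

# Two well-separated toy groups (hypothetical).
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
```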
21.
Explain Visualization in data mining.
Various forms of visualizing the discovered patterns
_
Rules
_
Table
_
Crosstab
_
Pie chart
_
Bar chart
_
Decision tree
_
Data cube
_
Histogram
_
Quantile plots
_
q-q plots
_
Scatter plots
_
Loess curves
UNIT-IV
22.
Discuss the components of data warehouse.
_
Subject-oriented
_
Integrated
_
Time-Variant
_
Non-volatile
23.
List out the differences between OLTP and OLAP.
_
Users and system orientation
_
Data contents
_
Database design
_
View
_
Access patterns
24.
Discuss the various schematic representations in the multidimensional model.
_
Star schema
_
Snowflake schema
_
Fact constellation schema
25.
Explain the OLAP operations in the multidimensional model.
_
Roll-up
_
Drill-down
_
Slice and dice
_
Pivot or rotate
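The operations above can be illustrated on a tiny cube held in a Python dictionary. This is a sketch under stated assumptions: the dimensions (city, quarter, item), the city-to-country hierarchy, and the sales figures are all invented. Drill-down is the reverse of roll-up and needs the more detailed data, so it is not shown as a separate function.

```python
from collections import defaultdict

# Toy cube: (city, quarter, item) -> sales (hypothetical values).
cube = {
    ("Chicago", "Q1", "phone"): 10,
    ("New York", "Q1", "phone"): 15,
    ("Chicago", "Q2", "phone"): 20,
    ("Toronto", "Q1", "phone"): 5,
    ("Toronto", "Q1", "laptop"): 7,
}
# Concept hierarchy for the location dimension: city -> country.
country_of = {"Chicago": "USA", "New York": "USA", "Toronto": "Canada"}

def roll_up(cube, hierarchy):
    """Roll-up: climb the location hierarchy from city to country."""
    out = defaultdict(int)
    for (city, quarter, item), sales in cube.items():
        out[(hierarchy[city], quarter, item)] += sales
    return dict(out)

def slice_(cube, quarter):
    """Slice: fix the time dimension to a single value."""
    return {(c, i): s for (c, q, i), s in cube.items() if q == quarter}

def dice(cube, cities, items):
    """Dice: select a sub-cube by restricting two dimensions."""
    return {k: s for k, s in cube.items() if k[0] in cities and k[2] in items}

def pivot(table):
    """Pivot: swap the two axes of a 2-D view."""
    return {(b, a): s for (a, b), s in table.items()}

by_country = roll_up(cube, country_of)
q1 = slice_(cube, "Q1")
toronto = dice(cube, {"Toronto"}, {"phone", "laptop"})
q1_rotated = pivot(q1)
```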
26.
Explain the design and construction of a data warehouse.
_
Design of a data warehouse
•
Top-down view
•
Data source view
•
Data warehouse view
•
Business query view
_
Process of data warehouse design
27.
Explain the three-tier data warehouse architecture.
_
Warehouse database server (bottom tier)
_
OLAP server (middle tier)
_
Client (top tier)
28.
Explain indexing.
_
Definition
_
B-Tree indexing
_
Bitmap indexing
_
Join indexing
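To illustrate bitmap indexing, the sketch below keeps one bit vector per distinct value of a column, stored as a Python integer whose bit i is set when row i holds that value. A conjunctive selection then becomes a bitwise AND. The table contents and column names are assumptions for the example.

```python
def build_bitmap_index(rows, column):
    """Map each distinct value of `column` to a bitmap of matching row positions."""
    index = {}
    for i, row in enumerate(rows):
        index[row[column]] = index.get(row[column], 0) | (1 << i)
    return index

def matching_rows(bitmap):
    """Decode a bitmap back into a list of row positions."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

# Toy fact table (hypothetical).
rows = [
    {"city": "Chicago", "item": "phone"},
    {"city": "Toronto", "item": "phone"},
    {"city": "Chicago", "item": "laptop"},
    {"city": "Chicago", "item": "phone"},
]
city_index = build_bitmap_index(rows, "city")
item_index = build_bitmap_index(rows, "item")

# city = 'Chicago' AND item = 'phone' answered with a single bitwise AND.
hits = matching_rows(city_index["Chicago"] & item_index["phone"])
```

Bitmap indexes of this kind suit columns with few distinct values, which is why they are common in data warehouses.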
29.
Write notes on the metadata repository.
_
Definition
_
Structure of the data warehouse
_
Operational metadata
_
Algorithms used for summarization
_
Mapping from operational environment to data warehouse
_
Data related to system performance
_
Business metadata
30.
Write short notes on VLDB.
_
Definition
_
Challenge related to database technologies
_
Issues in VLDB
UNIT-V
31.
Explain data mining applications for biomedical and DNA data analysis.
_
Semantic integration of heterogeneous, distributed genome databases
_
Similarity search and comparison among DNA sequences
_
Association analysis.
_
Path analysis
_
Visualization tools and genetic data analysis.
32.
Explain data mining applications for financial data analysis.
_
Loan payment prediction and customer credit policy analysis.
_
Classification and clustering of customers for targeted marketing.
_
Detection of money laundering and other financial crimes.
33.
Explain data mining applications for retail industry.
_
Multidimensional analysis of sales, customers, products, time and region.
_
Analysis of the effectiveness of sales campaigns.
_
Customer retention: analysis of customer loyalty.
_
Purchase recommendation and cross-reference of items.
34.
Explain data mining applications for Telecommunication industry.
_
Multidimensional analysis of telecommunication data.
_
Fraudulent pattern analysis and the identification of unusual patterns.
_
Multidimensional association and sequential pattern analysis
_
Use of visualization tools in telecommunication data analysis.
35.
Explain DBMiner tool in data mining.
_
System architecture
_
Input and Output
_
Data mining tasks supported by the system
_
Support of task and method selection
_
Support of the KDD process
_
Main applications
_
Current status
36.
Explain how data mining is used in health care analysis.
_
Health care data mining and its aims
_
Health care data mining technique
_
Segmenting patients into groups
_
Identifying patients with recurring health problems
_
Relation between disease and symptoms
_
Curbing the treatment costs
_
Predicting medical diagnosis
_
Medical research
_
Hospital administration
_
Applications of data mining in health care
_
Conclusion
37.
Explain how data mining is used in banking industry.
_
Data collected by data mining in banking
_
Banking data mining tools
_
Mining customer data of bank
_
Mining for prediction and forecasting
_
Mining for fraud detection
_
Mining for cross selling bank services
_
Mining for identifying customer preferences
_
Applications of data mining in banking
_
Conclusion
38.
Explain the types of data mining.
_
Audio data mining
_
Video data mining
_
Image data mining
_
Scientific and statistical data mining