Sunday, July 14, 2013

Data Warehousing and Data Mining, 16 Mark Questions with Hints



UNIT-I

1. Explain the evolution of database technology.
_ Data collection and Database creation
_ Database management systems
_ Advanced database systems
_ Data warehousing and Data Mining
_ Web-based Database systems
_ New generation of Integrated information systems
2. Explain the steps of knowledge discovery in databases.
_ Data cleaning
_ Data integration
_ Data selection
_ Data transformation
_ Data mining
_ Pattern evaluation
_ Knowledge presentation

3. Explain the architecture of a data mining system.
_ Database, data warehouse, or other information repository
_ Database or data warehouse server
_ Knowledge base
_ Data mining engine
_ Pattern evaluation module
_ Graphical user interface

4. Explain the various tasks in data mining.
(Or)
Explain the taxonomy of data mining tasks.
_ Predictive modeling
• Classification
• Regression
• Time series analysis
_ Descriptive modeling
• Clustering
• Summarization
• Association rules
• Sequence discovery

5. Explain the various techniques in data mining.
_ Statistics (or) Statistical perspectives
• Point estimation
• Data summarization
• Bayesian techniques
• Hypothesis testing
• Correlation
_ Regression
_ Machine learning
_ Decision trees
_ Hidden Markov models
_ Artificial neural networks
_ Genetic algorithms
_ Meta learning
 
UNIT-II

6. Explain the issues regarding classification and prediction.
_ Preparing the data for classification and prediction
o Data cleaning
o Relevance analysis
o Data transformation
_ Comparing classification methods
o Predictive accuracy
o Speed
o Robustness
o Scalability
o Interpretability
7. Explain classification by decision tree induction.
_ Decision tree induction
_ Attribute selection measures
_ Tree pruning
_ Extracting classification rules from decision trees
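As an illustration of an attribute selection measure, the following is a minimal sketch of entropy-based information gain, one common choice for decision tree induction. The function names and the dictionary-per-row data layout are assumptions made for this example only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Reduction in entropy obtained by splitting the rows on one attribute.

    rows   : list of dicts, one per training tuple (assumed layout)
    labels : class label of each row, in the same order
    attr   : name of the attribute to evaluate
    """
    n = len(labels)
    # Group the class labels by the value the attribute takes in each row.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    # Weighted entropy that remains after the split.
    remainder = sum(len(p) / n * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


# Toy data: attribute "a" separates the classes perfectly, "b" not at all.
rows = [
    {"a": "x", "b": "u"},
    {"a": "x", "b": "v"},
    {"a": "y", "b": "u"},
    {"a": "y", "b": "v"},
]
labels = ["yes", "yes", "no", "no"]
best = max(["a", "b"], key=lambda attr: information_gain(rows, labels, attr))
```

At each node the induction procedure would pick the attribute with the highest gain (here `best` is `"a"`) and recurse on each partition.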
8. Write short notes on patterns.
_ Pattern definition
_ Objective measures
_ Subjective measures
_ Can a data mining system generate all of the interesting
patterns?
_ Can a data mining system generate only interesting
patterns?
9. Explain mining single-dimensional Boolean association rules from transactional
databases.
_ The Apriori algorithm: Finding frequent itemsets using
candidate generation
_ Mining frequent item sets without candidate generation
10. Explain the Apriori algorithm.
_ Apriori property
_ Join step
_ Prune step
_ Example
_ Algorithm
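The join and prune steps listed above can be sketched as follows. This is a simplified illustration, not a tuned implementation: the join here unions any two frequent (k-1)-itemsets whose union has k items, rather than joining on a shared sorted prefix, and relies on the prune step to discard the extra candidates. The function name, the toy transactions, and the absolute support threshold are assumptions for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    min_support is an absolute count (number of transactions).
    """
    transactions = [frozenset(t) for t in transactions]

    # Scan once to count 1-itemsets.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset of a frequent
        # k-itemset must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Scan the transactions to count the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Toy market-basket data (hypothetical).
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent_itemsets = apriori(baskets, min_support=3)
```

With a minimum support count of 3, the candidate {bread, diapers, beer} is pruned without being counted because its subset {bread, beer} is not frequent.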

11. Explain how the efficiency of Apriori is improved.
_ Hash-based technique (hashing item set counts)
_ Transaction reduction (reducing the number of transactions
scanned in future iterations)
_ Partitioning (Partitioning the data to find candidate item sets)
_ Sampling (mining on a subset of the given data)
_ Dynamic item set counting (adding candidate item sets at
different points during a scan)

12. Explain frequent itemset mining without candidate generation.
_ Frequent-pattern growth (or) FP-growth
_ Frequent pattern tree (or) FP-tree
_ Algorithm

13. Explain mining multi-dimensional Boolean association rules from transaction
databases.
_ Multi-dimensional (or) Multilevel association rules
_ Approaches to mining Multilevel association rules
• Using uniform minimum support for all levels
• Using reduced minimum support at lower levels
o Level-by-level independent
o Level-cross filtering by single item
o Level-cross filtering by k-itemset
_ Checking for redundant Multilevel association rules

14. Explain constraint-based association mining.
_ Knowledge type constraints
_ Data constraints
_ Dimension/level constraints
_ Interestingness constraints
_ Rule constraints
_ Metarule-guided mining of association rules
_ Mining guided by additional rule constraints

UNIT-III

15. Explain regression in predictive modeling.
_ Regression definition
_ Linear regression
_ Multiple regression
_ Non-linear regression
_ Other regression models
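For the linear regression hint, a minimal sketch of fitting a straight line y = w0 + w1·x by the method of least squares is given below. The function name and the sample data are assumptions for illustration; multiple and non-linear regression are not covered by this sketch.

```python
def fit_line(xs, ys):
    """Least-squares estimates (intercept w0, slope w1) for y = w0 + w1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x.
w0, w1 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = w0 + w1 * 5  # prediction for an unseen x value
```

Once the coefficients are estimated, prediction is just evaluating the line at a new x.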
16. Explain the statistical perspective in data mining.
_ Point estimation
_ Data summarization
_ Bayesian techniques
_ Hypothesis testing
_ Regression
_ Correlation
17. Explain Bayesian classification.
_ Bayes' theorem
_ Naïve Bayesian classification
_ Bayesian belief networks
_ Bayesian learning
18. Discuss the requirements of clustering in data mining.
_ Scalability
_ Ability to deal with different types of attributes
_ Discovery of clusters with arbitrary shape
_ Minimal requirements for domain knowledge to determine
input parameters
_ Ability to deal with noisy data
_ Insensitivity to the order of input records
_ High dimensionality
_ Interpretability and usability
19. Explain the types of data in cluster analysis.
_ Interval-scaled variables
_ Binary variables
o Symmetric binary variables
o Asymmetric binary variables
_ Nominal variables
_ Ordinal variables
_ Ratio-scaled variables
20. Explain the partitioning method of clustering.
_ k-means clustering
_ k-medoids clustering
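A minimal sketch of the k-means partitioning method follows, assuming numeric points given as tuples and initial centroids supplied by the caller (real implementations usually choose them at random). The function name and data are hypothetical. k-medoids differs in that each cluster is represented by one of its actual points rather than by the mean; that variant is not shown.

```python
def kmeans(points, centroids, max_iter=100):
    """Partition points into len(centroids) clusters.

    Returns (final centroids, list of clusters), using squared
    Euclidean distance for the assignment step.
    """
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Two well-separated groups of 2-D points (hypothetical data).
data = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
final_centroids, final_clusters = kmeans(data, centroids=[(1, 1), (8, 8)])
```

The loop stops as soon as the centroids stop moving, which is the usual convergence criterion.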
21. Explain Visualization in data mining.
Various forms of visualizing the discovered patterns
_ Rules
_ Table
_ Crosstab
_ Pie chart
_ Bar chart
_ Decision tree
_ Data cube
_ Histogram
_ Quantile plots
_ q-q plots
_ Scatter plots
_ Loess curves

UNIT-IV

22. Discuss the components of a data warehouse.
_ Subject-oriented
_ Integrated
_ Time-Variant
_ Non-volatile
23. List out the differences between OLTP and OLAP.
_ Users and system orientation
_ Data contents
_ Database design
_ View
_ Access patterns
24. Discuss the various schema representations in the multidimensional model.
_ Star schema
_ Snowflake schema
_ Fact constellation schema
25. Explain the OLAP operations in the multidimensional model.
_ Roll-up
_ Drill-down
_ Slice and dice
_ Pivot or rotate
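The operations above can be illustrated on a tiny fact table. This is only a sketch: the rows, dimension names, and helper functions are hypothetical, and a real OLAP server works on precomputed cubes rather than Python lists.

```python
from collections import defaultdict

# Hypothetical sales facts: dimensions are city, country, quarter, item.
facts = [
    {"city": "Chennai", "country": "India", "quarter": "Q1", "item": "phone", "sales": 100},
    {"city": "Chennai", "country": "India", "quarter": "Q2", "item": "phone", "sales": 150},
    {"city": "Delhi", "country": "India", "quarter": "Q1", "item": "laptop", "sales": 200},
    {"city": "Toronto", "country": "Canada", "quarter": "Q1", "item": "phone", "sales": 80},
    {"city": "Toronto", "country": "Canada", "quarter": "Q2", "item": "laptop", "sales": 120},
]


def aggregate(rows, dims, measure="sales"):
    """Sum the measure for each combination of the given dimensions."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in dims)] += r[measure]
    return dict(totals)


def slice_(rows, dim, value):
    """Slice: fix a single dimension to one value."""
    return [r for r in rows if r[dim] == value]


def dice(rows, criteria):
    """Dice: select on two or more dimensions; criteria maps dim -> allowed values."""
    return [r for r in rows if all(r[d] in allowed for d, allowed in criteria.items())]


# Roll-up: climb the location hierarchy from city to country.
by_country = aggregate(facts, ["country"])
# Drill-down: go back to the finer city level.
by_city = aggregate(facts, ["city"])
# Pivot: same aggregation, with the axes presented in the other order.
by_quarter_country = aggregate(facts, ["quarter", "country"])
q1_only = slice_(facts, "quarter", "Q1")
q1_phones = dice(facts, {"quarter": {"Q1"}, "item": {"phone"}})
```

Roll-up and drill-down are the same aggregation run at coarser or finer levels of a dimension hierarchy, which is why a single `aggregate` helper covers both here.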
26. Explain the design and construction of a data warehouse.
_ Design of a data warehouse
• Top-down view
• Data source view
• Data warehouse view
• Business query view
_ Process of data warehouse design

27. Explain the three-tier data warehouse architecture.
_ Warehouse database server (bottom tier)
_ OLAP server (middle tier)
_ Client (top tier)
28. Explain indexing.
_ Definition
_ B-Tree indexing
_ Bitmap indexing
_ Join indexing
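To make the bitmap indexing hint concrete, here is a minimal sketch that keeps one bit vector per distinct attribute value, with bit i set when row i holds that value. Python integers stand in for the bit vectors; the function names and column data are assumptions for the example.

```python
def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is 1 if row i has that value."""
    index = {}
    for row, v in enumerate(values):
        index[v] = index.get(v, 0) | (1 << row)
    return index


def rows_of(bitmap):
    """Row numbers whose bit is set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Two columns of a hypothetical four-row table.
item_index = build_bitmap_index(["phone", "laptop", "phone", "tv"])
city_index = build_bitmap_index(["Delhi", "Delhi", "Chennai", "Delhi"])

# A conjunctive selection becomes a bitwise AND of the two bitmaps.
matches = rows_of(item_index["phone"] & city_index["Delhi"])
```

Because selections reduce to bitwise operations, bitmap indexes suit low-cardinality columns such as the dimensions of a warehouse.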
29. Write notes on the metadata repository.
_ Definition
_ Structure of the data warehouse
_ Operational metadata
_ Algorithms used for summarization
_ Mapping from operational environment to data warehouse
_ Data related to system performance
_ Business metadata
30. Write short notes on VLDB.
_ Definition
_ Challenge related to database technologies
_ Issues in VLDB

UNIT-V

31. Explain data mining applications for biomedical and DNA data analysis.
_ Semantic integration of heterogeneous, distributed genome databases
_ Similarity search and comparison among DNA sequences
_ Association analysis.
_ Path analysis
_ Visualization tools and genetic data analysis.
32. Explain data mining applications for financial data analysis.
_ Loan payment prediction and customer credit policy analysis.
_ Classification and clustering of customers for targeted marketing.
_ Detection of money laundering and other financial crimes.
33. Explain data mining applications for the retail industry.
_ Multidimensional analysis of sales, customers, products, time and region.
_ Analysis of the effectiveness of sales campaigns.
_ Customer retention: analysis of customer loyalty.
_ Purchase recommendation and cross-reference of items.
34. Explain data mining applications for the telecommunication industry.
_ Multidimensional analysis of telecommunication data.
_ Fraudulent pattern analysis and the identification of unusual patterns.
_ Multidimensional association and sequential pattern analysis
_ Use of visualization tools in telecommunication data analysis.

35. Explain the DBMiner tool in data mining.
_ System architecture
_ Input and Output
_ Data mining tasks supported by the system
_ Support of task and method selection
_ Support of the KDD process
_ Main applications
_ Current status
36. Explain how data mining is used in health care analysis.
_ Health care data mining and its aims
_ Health care data mining technique
_ Segmenting patients into groups
_ Identifying patients with recurring health problems
_ Relation between disease and symptoms
_ Curbing the treatment costs
_ Predicting medical diagnosis
_ Medical research
_ Hospital administration
_ Applications of data mining in health care
_ Conclusion

37. Explain how data mining is used in the banking industry.
_ Data collected by data mining in banking
_ Banking data mining tools
_ Mining customer data of bank
_ Mining for prediction and forecasting
_ Mining for fraud detection
_ Mining for cross selling bank services
_ Mining for identifying customer preferences
_ Applications of data mining in banking
_ Conclusion

38. Explain the types of data mining.
_ Audio data mining
_ Video data mining
_ Image data mining
_ Scientific and statistical data mining
