1. Define data mining.
Data mining refers to extracting or “mining” knowledge from large amounts of
data. It is the process of discovering interesting knowledge from large amounts
of data stored in databases, data warehouses, or other information repositories.
2. Give some alternative terms for data mining.
• Knowledge mining
• Knowledge extraction
• Data/pattern analysis.
• Data Archaeology
• Data dredging
3. What is KDD?
KDD stands for Knowledge Discovery in Databases.
4. What are the steps involved in the KDD process?
• Data cleaning
• Data integration
• Data selection
• Data transformation
• Data mining
• Pattern evaluation
• Knowledge presentation
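The KDD steps are usually run in the order cleaning, integration, selection, transformation, mining, pattern evaluation, and knowledge presentation. The sketch below illustrates that order on toy in-memory records. The function names, the sample data, and the "count items per basket" mining step are all assumptions made for illustration, not part of any library.

```python
from collections import Counter

# Hypothetical toy sources; each record is (customer, item, quantity).
store_a = [("ann", "milk", 2), ("bob", "bread", None), ("ann", "bread", 1)]
store_b = [("cy", "milk", 1), ("ann", "milk", 2), ("cy", "bread", 3)]


def clean(records):
    """Data cleaning: drop records that contain a missing value."""
    return [r for r in records if None not in r]


def integrate(*sources):
    """Data integration: combine several sources into one list."""
    return [r for source in sources for r in source]


def select(records):
    """Data selection: keep only the fields relevant to the task."""
    return [(customer, item) for customer, item, _qty in records]


def transform(records):
    """Data transformation: group items into one basket per customer."""
    baskets = {}
    for customer, item in records:
        baskets.setdefault(customer, set()).add(item)
    return baskets


def mine(baskets):
    """Data mining: count how many baskets contain each item."""
    return Counter(item for items in baskets.values() for item in items)


def evaluate(counts, min_count):
    """Pattern evaluation: keep items that reach a minimum count."""
    return {item: n for item, n in counts.items() if n >= min_count}


def present(patterns):
    """Knowledge presentation: format the patterns for a reader."""
    return [f"{item}: in {n} baskets" for item, n in sorted(patterns.items())]


integrated = integrate(clean(store_a), clean(store_b))
baskets = transform(select(integrated))
report = present(evaluate(mine(baskets), min_count=2))
```

Each function stands in for one step, so a real system would replace a function body without changing the overall flow.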
5. What is the use of the knowledge base?
The knowledge base holds domain knowledge that is used to guide the search or
to evaluate the interestingness of the resulting patterns. Such knowledge can
include concept hierarchies, which organize attributes or attribute values into
different levels of abstraction.
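A concept hierarchy can be pictured as a mapping from each value to its parent at the next level of abstraction. The sketch below assumes a hypothetical "location" attribute (city, then province or state, then country); the `PARENT` table and the `generalize` helper are illustrative names only.

```python
# Hypothetical concept hierarchy for a "location" attribute:
# city -> province/state -> country.
PARENT = {
    "Vancouver": "British Columbia",
    "Toronto": "Ontario",
    "Chicago": "Illinois",
    "British Columbia": "Canada",
    "Ontario": "Canada",
    "Illinois": "USA",
}


def generalize(value, levels=1):
    """Climb the hierarchy by `levels` steps, stopping at the top level."""
    for _ in range(levels):
        if value not in PARENT:
            break  # already at the most general level
        value = PARENT[value]
    return value
```

For example, `generalize("Toronto", 2)` replaces a city with its country, which lets a mining step look for patterns at the country level instead of the city level.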
6. Architecture of a typical data mining system.
[Figure: architecture diagram not recovered; surviving label: Knowledge base]
7. Mention some of the data mining techniques.
• Statistics
• Machine learning
• Decision Tree
• Hidden Markov models
• Artificial Intelligence
• Genetic Algorithm
• Meta learning
8. Give a few statistical techniques.
• Point Estimation
• Data Summarization
• Bayesian Techniques
• Testing Hypothesis
• Correlation
• Regression
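Two of the techniques listed, correlation and regression, can be sketched with plain arithmetic. The example below computes the Pearson correlation coefficient and a least-squares line for a small made-up data set; the helper names and the data are assumptions for illustration.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return slope, intercept


# Made-up sample that lies exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
r = pearson(xs, ys)
slope, intercept = fit_line(xs, ys)
```

Because the sample lies exactly on a line, the correlation is 1 and the fitted line recovers the slope 2 and intercept 1.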
9. What is meta learning?
Meta learning is the concept of combining the predictions made by multiple data
mining models and analyzing those predictions to formulate a new, previously
unknown prediction.
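One simple way to combine predictions from several models is majority voting. The sketch below is only an illustration of that idea: the three base models are hypothetical threshold rules, and voting is one of several possible combination schemes.

```python
from collections import Counter


# Three hypothetical base models; each maps a number to a label.
def model_a(x):
    return "high" if x > 10 else "low"


def model_b(x):
    return "high" if x > 20 else "low"


def model_c(x):
    return "high" if x % 2 == 0 else "low"


def combine(models, x):
    """Return the label predicted by the most models (majority vote)."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


models = [model_a, model_b, model_c]
```

With an odd number of models and two labels a tie cannot occur, so `combine` always returns a clear winner here.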
[Figure labels belonging to the architecture diagram of question 6: GUI,
pattern evaluation, database or data warehouse server, database (DB), data
warehouse (DW)]
10. Define genetic algorithm.
• A search algorithm.
• It enables us to locate an optimal (or near-optimal) binary string by
processing an initial random population of binary strings, applying operations
such as artificial mutation, crossover, and selection.
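The operations named above can be sketched as follows. This is a minimal illustration, assuming a toy objective (the number of 1s in the string), tournament selection, single-point crossover, and elitism; these particular choices and all parameter values are assumptions, not a definitive genetic algorithm.

```python
import random


def fitness(bits):
    """Toy objective (an assumption): the number of 1s in the string."""
    return sum(bits)


def select(population, rng):
    """Tournament selection: the fitter of two randomly chosen strings."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def crossover(parent1, parent2, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    point = rng.randrange(1, len(parent1))
    return parent1[:point] + parent2[point:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def evolve(length=16, pop_size=20, generations=40, rate=0.02, seed=0):
    rng = random.Random(seed)
    # Initial random population of binary strings.
    population = [
        [rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        best = max(population, key=fitness)  # elitism: keep the best string
        children = []
        for _ in range(pop_size - 1):
            child = crossover(select(population, rng), select(population, rng), rng)
            children.append(mutate(child, rate, rng))
        population = [best] + children
    return max(population, key=fitness)
```

Keeping the best string in every generation means the best fitness never decreases, although the result is not guaranteed to be the global optimum.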
11. What is the purpose of a data mining technique?
It provides a way to carry out the various data mining tasks.
12. Define predictive model.
A predictive model is used to predict the values of data by making use of known
results from a different set of sample data.
13. Data mining tasks that belong to the predictive model:
• Classification
• Regression
• Time series analysis
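As an illustration of classification, the sketch below predicts a label for a new value from known, already labeled samples by taking the label of the nearest one. The nearest-neighbor rule and the sample data are assumptions chosen for brevity; many other classifiers fit the same definition.

```python
def predict(known, x):
    """Return the label of the known sample whose value is closest to x.

    `known` is a list of (value, label) pairs with known results.
    """
    _value, label = min(known, key=lambda pair: abs(pair[0] - x))
    return label


# Made-up labeled samples (the "known results").
known = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]
```

The model never sees the value it is asked about during construction; it relies only on the labeled samples, which matches the definition of a predictive model.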
14. Define descriptive model.
A descriptive model is used to determine the patterns and relationships in
sample data. Data mining tasks that belong to the descriptive model:
• Clustering
• Summarization
• Association rules
• Sequence discovery
15. Define the term summarization.
Summarization condenses a large chunk of data, such as the contents of a web
page or a document, into a compact description.
Summarization = characterization = generalization
16. List the advanced database systems.
• Extended-relational databases
• Object-oriented databases
• Deductive databases
• Spatial databases
• Temporal databases
• Multimedia databases
• Active databases
• Scientific databases
• Knowledge databases
17. Define cluster analysis.
Cluster analysis groups data objects without consulting a known class label.
The class labels are not present in the training data simply because they are
not known to begin with.
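A small k-means sketch shows grouping without class labels. It assumes one-dimensional points and caller-supplied starting centers; k-means is one common clustering method, used here as an example rather than as the definition of cluster analysis.

```python
def kmeans_1d(points, centers, iterations=10):
    """Group 1-D points around the given starting centers (k = len(centers))."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        # A center with an empty cluster stays where it is.
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Made-up unlabeled data with two visible groups.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
```

No labels are supplied anywhere: the two groups emerge only from the distances between the points.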
18. Classifications of data mining systems.
• Based on the kinds of databases mined:
o According to data model
_ Relational mining system
_ Transactional mining system
_ Object-oriented mining system
_ Object-Relational mining system
_ Data warehouse mining system
o According to types of data
_ Spatial data mining system
_ Time series data mining system
_ Text data mining system
_ Multimedia data mining system
• Based on the kinds of knowledge mined
o According to functionalities
_ Characterization
_ Discrimination
_ Association
_ Classification
_ Clustering
_ Outlier analysis
_ Evolution analysis
o According to levels of abstraction of the knowledge mined
_ Generalized knowledge (High level of abstraction)
_ Primitive-level knowledge (Raw data level)
o According to whether the system mines data regularities or data irregularities
• Based on the kinds of techniques utilized
o According to user interaction
_ Autonomous systems
_ Interactive exploratory system
_ Query-driven systems
o According to methods of data analysis
_ Database-oriented
_ Data warehouse-oriented
_ Machine learning
_ Statistics
_ Visualization
_ Pattern recognition
_ Neural networks
• Based on the applications adopted
o Finance
o Telecommunication
o DNA
o Stock markets
o E-mail and so on
19. Describe the challenges to data mining regarding data mining methodology
and user interaction issues.
• Mining different kinds of knowledge in databases
• Interactive mining of knowledge at multiple levels of abstraction
• Incorporation of background knowledge
• Data mining query languages and ad hoc data mining
• Presentation and visualization of data mining results
• Handling noisy or incomplete data
• Pattern evaluation
20. Describe the challenges to data mining regarding performance issues.
• Efficiency and scalability of data mining algorithms
• Parallel, distributed, and incremental mining algorithms
21. Describe issues relating to the diversity of database types.
• Handling of relational and complex types of data
• Mining information from heterogeneous databases and global information systems
22. What is meant by pattern?
A pattern represents knowledge if it is easily understood by humans; valid on
test data with some degree of certainty; and potentially useful, novel, or
validates a hunch about which the user was curious. Measures of pattern
interestingness, either objective or subjective, can be used to guide the
discovery process.
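Support and confidence are two widely used objective measures of interestingness for association rules. The sketch below computes both over a handful of made-up transactions; the data and function names are assumptions for illustration.

```python
# Made-up market-basket transactions.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread"},
    {"milk", "butter"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) for the rule antecedent => consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


rule_support = support({"milk", "bread"}, transactions)
rule_confidence = confidence({"milk"}, {"bread"}, transactions)
```

A discovery process could discard rules whose support or confidence falls below user-chosen thresholds, which is one way such measures guide the search.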
23. How is a data warehouse different from a database?
A data warehouse is a repository of data from multiple heterogeneous sources,
organized under a unified schema at a single site in order to facilitate
management decision-making. A database consists of a collection of interrelated
data.
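The idea of a unified schema can be sketched by mapping records from two differently shaped sources into one common layout. The source layouts, field names, and the choice of cents as the common unit are all hypothetical.

```python
# Two hypothetical operational sources with different layouts and units.
source_a = [{"cust": "Ann", "amount_cents": 1250}]
source_b = [{"customer_name": "Bob", "amount_dollars": 50.0},
            {"customer_name": "Ann", "amount_dollars": 7.5}]


def from_a(row):
    """Map a source A record to the unified schema (amounts in cents)."""
    return {"customer": row["cust"], "amount_cents": row["amount_cents"],
            "source": "a"}


def from_b(row):
    """Map a source B record to the unified schema, converting dollars to cents."""
    return {"customer": row["customer_name"],
            "amount_cents": round(row["amount_dollars"] * 100),
            "source": "b"}


# The "warehouse": all records under one schema at a single site.
warehouse = [from_a(r) for r in source_a] + [from_b(r) for r in source_b]


def total_by_customer(rows):
    """A decision-support style query over the unified data."""
    totals = {}
    for row in rows:
        totals[row["customer"]] = totals.get(row["customer"], 0) + row["amount_cents"]
    return totals


totals = total_by_customer(warehouse)
```

Once the data share one schema, a single query can answer a question that would otherwise require separate logic for each source.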