# DATA SCIENCE MCQ

1. Variables measured in terms such as height and weight are classified as __.
a) continuous variable
b) measuring variable
c) discrete variable
d) flowchart variable
2. Qualitative data is also known as __.
a) Numerical data
b) Categorical data
c) Discrete data
d) Continuous data
3. Which of the following types of data has a natural order?
a) Ordinal data
b) Nominal data
c) Binary data
d) Continuous data
4. Discrete data is based on count and it can take a __ number of values.
a) Infinite b) simple c) complex d) finite
5. __ is also known as primary data, which is data collected from a source.
a) Ordinal data b) Ordinary data c) Existing data d) Raw data
6. What is secondary data?
a) Ordinal data b) Unimportant data c) Existing data d) Ordinary data
7. Age group – Young, Adult, Senior Citizen is an example of __.
a) Nominal data b) Discrete data c) Continuous data d) Ordinal data
8. An example of discrete data is __.
a) the number of children b) height of children c) weight of children d) behaviour of children
9. XQuery is a functional query language used to retrieve information stored in __ format.
a) HTML b) XML c) UML d) JScript
10. The XPath specification has __ types of nodes.
a) Four b) Five c) Six d) Seven
11. State True or False.
(i) Data Visualization helps users in analyzing a small amount of data in a simpler way.
(ii) Data Visualization makes complex data more accessible, understandable, and usable.
a) true, false b) false, true c) true, true d) false, false
12. Data visualization is also an element of the broader __.
a) deliver presentation architecture
b) data presentation architecture
c) dataset presentation architecture
d) data process architecture
13. Which one of the following is the most basic and commonly used technique for EDA?
a) Line charts b) Scatter plots c) Population pyramids d) Area charts
14. Which of the following is not a part of the data science process?
a) Discovery b) Model Planning c) Communication Building d) Operationalize
15. Which of the following is not an application of data science?
a) Recommendation Systems b) Image & Speech Recognition
c) Online Price Comparison d) Privacy Checker
16. Amazon Web Services falls into which of the following cloud-computing categories?
a) Platform as a Service
b) Software as a Service
c) Infrastructure as a Service
d) Back-end as a Service
17. Which of the following is the most important language for Data Science?
a) Java b) Ruby c) R d) HTML
18. In XQuery, the __ symbol precedes the variable name.
a) @ b) \$ c) # d) *
19. MongoDB supports cross-platform use and is written in the __ language.
a) C++ b) Java c) R d) PHP
20. MongoDB is a __ database.
a) SQL b) NoSQL c) RDBMS d) Firebase
21. Ridge Regression is used when data suffers from __.
a) Collinearity b) Multicollinearity c) Regression d) Classification
22. Joins are used for combining __ product.
a) Vector b) Euler c) Scalar d) Cartesian
23. __ is the process of assigning storage, usually in the form of server disk drive space, in
order to optimize the performance of a storage area network.
a) Storage Provisioning b) Data mining c) Storage assignment d) Data Warehousing
24. Clustering comes under __ learning.
a) Supervised b) Unsupervised c) Reinforcement d) Classification
25. In __, the distance between two clusters is defined as the shortest distance between two
points in each cluster.
26. In __, the distance between two clusters is defined as the longest distance between two
points in each cluster.
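The two definitions in questions 25 and 26 can be compared numerically. The following is a minimal sketch, assuming Euclidean distance; the function names `shortest_pair_distance` and `longest_pair_distance` and the sample points are illustrative assumptions, not part of the original question set.

```python
from itertools import product
from math import dist  # Euclidean distance, available in Python 3.8+


def shortest_pair_distance(cluster_a, cluster_b):
    """Smallest distance over all pairs with one point from each cluster."""
    return min(dist(p, q) for p, q in product(cluster_a, cluster_b))


def longest_pair_distance(cluster_a, cluster_b):
    """Largest distance over all pairs with one point from each cluster."""
    return max(dist(p, q) for p, q in product(cluster_a, cluster_b))


# Hypothetical sample clusters lying on the x-axis.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]

print(shortest_pair_distance(cluster_a, cluster_b))  # 3.0
print(longest_pair_distance(cluster_a, cluster_b))   # 6.0
```

For the same pair of clusters the two definitions give different values, which is why the choice of definition changes which clusters get merged first.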
27. __ is the variability of model prediction for a given data point, or a value that tells us the spread of our
data.
a) Variance b) Bias c) Underfitting d) Bug
28. A rise in prices before a festival is an example of __.
a) Cyclical variation b) Trend variation c) Irregular variation d) Seasonal variation
29. Seasonal variations are __.
a) Long term variation b) Short term variation c) Sudden variation d) Instant variation
30. Time series data consists of __ components.
a) three b) six c) five d) four
31. The best-fitted trend line is the one for which the sum of squares of the residuals (errors) is __.
a) maximum b) minimum c) negative d) 1
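The quantity named in question 31 can be computed for any candidate line. The sketch below is illustrative only: the function name `sum_squared_residuals`, the sample data, and the two candidate lines are assumptions made for the example.

```python
def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared differences between observed y and the line's prediction."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical observations that lie exactly on y = 2x.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]

print(sum_squared_residuals(xs, ys, slope=2, intercept=0))  # 0
print(sum_squared_residuals(xs, ys, slope=1, intercept=1))  # 14
```

Comparing this value across candidate lines is how one trend line is judged against another.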
32. __ data is used to build a model.
a) Training b) Testing c) Validation d) Primary
33. Which of the following is also called exploratory learning?
a) Supervised learning
b) Active learning
c) Unsupervised learning
d) Reinforcement learning
34. Which of the following statements is true about a prediction problem?
a) The output attribute must be categorical.
b) The model is designed to determine future outcomes.
c) The output attribute must be numeric.
d) The model is designed to classify current behaviour.
35. Decision Nodes are represented by __.
a) Disks b) Squares c) Circles d) Triangles
36. LASSO stands for __.
a) Least Absolute Shrinkage and Selection Operator.
b) Low Attribute Shrinkage and Selection Operator.
c) Least Attribute Shrinkage and Selection Operator.
d) Low Absolute Shrinkage and Selection Operator.
37. Another name for an input variable is __.
a) random variable b) independent variable c) estimated variable d) dependent variable
38. Data collected by someone else for some other purpose but being utilized by the investigator for
another purpose is called __.
a) Primary data b) Secondary data c) Raw data d) First hand data
39. There are __ types of methods in data collection.
a) two b) four c) five d) six
40. __ is a graphical representation method used to depict groups of numerical data through their
quartiles.
a) Histogram b) Box plot c) Scatter plot d) Line
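The quartiles mentioned in question 40 can be computed with Python's standard library. This is a minimal sketch on made-up values; `statistics.quantiles` uses its default `"exclusive"` method here, and other tools may use a different quartile convention and return slightly different numbers.

```python
from statistics import quantiles

# Hypothetical sample of numerical data.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into four equal-probability groups,
# returning the three cut points: Q1, the median, and Q3.
q1, median, q3 = quantiles(values, n=4)
iqr = q3 - q1  # interquartile range: the spread of the middle half of the data

print(q1, median, q3, iqr)  # 2.5 5.0 7.5 5.0
```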
41. Agglomerative and Divisive are types of __ algorithm.
a) Hierarchical Clustering b) Binary Classification c) Regression d) Multi-classification
42. AIC is calculated by the equation __.
a) AIC = -2k+2 b) AIC = 2LL+2k c) AIC = -2LL+2k d) AIC = 2k+2
43. __ is caused by a hypothesis function that fits the available data but does not generalize well to
predict new data.
a) Underfitting b) Overfitting c) Low variance d) Low bias
44. __ is a means of managing data that makes it more useful for users engaging in data discovery and
analysis.
a) Data curation b) Data processing c) Data munging d) Data mining
45. A spreadsheet is an example of __.
a) structured data b) unstructured data c) semi-structured data d) half-structured data
46. CouchDB is an example of a __ database.
a) NoSQL b) RDBMS c) SQL d) DBMS
47. SVM stands for __.
a) Standalone Validate Machine.
b) Standalone Vector Machine.
c) Support Validate Machine.
d) Support Vector Machine.
48. __ is the process of dimensionality reduction by which a set of data is reduced to more
manageable groups for processing.
a) Regression b) Feature Extraction c) Aggregating d) Feature Elimination
49. PCA is used for __.
a) dimensionality reduction
b) feature extraction
c) data augmentation
d) variance normalization
50. State True or False.
(i) KNN can be used in both classification and regression.
(ii) KNN can be used in Reinforcement learning.
a) True, False
b) True, True
c) False, False
d) False, True