1. Variables measured in terms such as height and weight are classified as __.
    a) continuous variable
    b) measuring variable
    c) discrete variable
    d) flowchart variable
  2. Qualitative data is also known as __.
    a) Numerical data
    b) Categorical data
    c) Discrete data
    d) Continuous data
  3. Which of the following types of data has a natural order?
    a) Ordinal data
    b) Nominal data
    c) Binary data
    d) Continuous data
  4. Discrete data is based on counts and can take a __ number of values.
    a) Infinite b) simple c) complex d) finite
  5. __ is also known as primary data, which is data collected from a source.
    a) Ordinal data b) Ordinary data c) Existing data d) Raw data
  6. What is secondary data?
    a) Ordinal data b) Unimportant data c) Existing data d) Ordinary data
  7. The age groups Young, Adult, and Senior Citizen are an example of __.
    a) Nominal data b) Discrete data c) Continuous data d) Ordinal data
  8. An example of discrete data is __.
    a) the number of children b) height of children c) weight of children d) behaviour of children
  9. XQuery is a functional query language used to retrieve information stored in __ format.
    a) HTML b) XML c) UML
  10. The XPath specification has __ types of nodes.
    a) Four b) Five c) Six d) Seven
  11. State True or False.
    (i) Data Visualization helps users in analyzing a small amount of data in a simpler way.
    (ii) Data Visualization makes complex data more accessible, understandable,
    and usable.
    a) true, false b) false, true c) true, true d) false, false
  12. Data visualization is also an element of the broader __.
    a) deliver presentation architecture
    b) data presentation architecture
    c) dataset presentation architecture
    d) data process architecture
  13. Which one of the following is the most basic and commonly used technique for data visualization?
    a) Line charts b) Scatter plots c) Population pyramids d) Area charts
  14. Which of the following is not a part of the data science process?
    a) Discovery b) Model Planning c) Communication Building d) Operationalize
  15. Which of the following is not an application of data science?
    a) Recommendation Systems b) Image & Speech Recognition
    c) Online Price Comparison d) Privacy Checker
  16. Amazon Web Services falls into which of the following cloud-computing categories?
    a) Platform as a Service
    b) Software as a Service
    c) Infrastructure as a Service
    d) Back-end as a Service
  17. Which of the following is the most important language for Data Science?
    a) Java b) Ruby c) R d) HTML
  18. In XQuery, the __ symbol precedes a variable name.
    a) @ b) $ c) # d) *
  19. MongoDB is cross-platform and is written in __.
    a) C++ b) Java c) R d) PHP
  20. MongoDB is a __ database.
    a) SQL b) NoSQL c) RDBMS d) Firebase
  21. Ridge Regression is used when data suffers from __.
    a) Collinearity b) Multicollinearity c) Regression
  22. Joins are used for combining __ product.
    a) Vector b) Euler c) Scalar d) Cartesian
  23. __ is the process of assigning storage, usually in the form of server disk drive space, in
    order to optimize the performance of a storage area network.
    a) Storage Provisioning b) Data mining c) Storage assignment d) Data Warehousing
  24. Clustering comes under __ learning.
    a) Supervised b) Unsupervised c) Reinforcement d) Classification
  25. In __, the distance between two clusters is defined as the shortest distance between two
    points, one in each cluster.
    a) Single Linkage b) Complete Linkage c) Average Linkage d) Multiple Linkage
  26. In __, the distance between two clusters is defined as the longest distance between two
    points, one in each cluster.
    a) Single Linkage b) Complete Linkage c) Average Linkage d) Multiple Linkage
  27. __ is the variability of model prediction for a given data point, or a value that tells us the spread of our data.
    a) Variance b) Bias c) Underfitting d) Bug
  28. A rise in prices before a festival is an example of __.
    a) Cyclical variation b) Trend variation c) Irregular variation d) Seasonal variation
  29. Seasonal variations are __.
    a) Long term variation b) Short term variation c) Sudden variation d) Instant variation
  30. Time series data consists of __ components.
    a) three b) six c) five d) four
  31. The best-fitted trend line is the one for which the sum of squares of the residuals (errors) is __.
    a) maximum b) minimum c) negative d) 1
  32. __ data is used to build a model.
    a) Training b) Testing c) Validation d) Primary
  33. Which of the following is also called exploratory learning?
    a) Supervised learning
    b) Active learning
    c) Unsupervised learning
    d) Reinforcement learning
  34. Which of the following statements is true about a prediction problem?
    a) The output attribute must be categorical.
    b) The model is designed to determine future outcomes.
    c) The output attribute must be numeric.
    d) The model is designed to classify current behaviour.
  35. Decision Nodes are represented by __.
    a) Disks b) Squares c) Circles d) Triangles
  36. LASSO stands for __.
    a) Least Absolute Shrinkage and Selection Operator.
    b) Low Attribute Shrinkage and Selection Operator.
    c) Least Attribute Shrinkage and Selection Operator.
    d) Low Absolute Shrinkage and Selection Operator.
  37. Another name for an input variable is __.
    a) random variable b) independent variable c) estimated variable d) dependent variable
  38. Data collected by someone else for some other purpose but utilized by the investigator for
    another purpose is called __.
    a) Primary data b) Secondary data c) Raw data d) First hand data
  39. There are __ types of methods in Data
    a) two b) four c) five d) six
  40. __ is a graphical representation method used to depict groups of numerical data through their quartiles.
    a) Histogram b) Box plot c) Scatter plot d) Line
  41. Agglomerative and Divisive are types of __ algorithm.
    a) Hierarchical Clustering b) Binary Classification c) Regression d) Multi-classification
  42. AIC is given by the equation __.
    a) AIC = -2k+2 b) AIC = 2LL+2k c) AIC = -2LL+2k d) AIC = 2k+2
  43. __ is caused by a hypothesis function that fits the available data but does not generalize well to
    predict new data.
    a) Underfitting b) Overfitting c) Low variance d) Low bias
  44. __ is a means of managing data that makes it more useful for users engaging in data discovery and analysis.
    a) Data curation b) Data processing c) Data Munging d) Data mining
  45. A spreadsheet is an example of __.
    a) structured data b) unstructured data c) semi-structured data d) half-structured data
  46. CouchDB is an example of a __ database.
    a) NoSQL b) RDBMS c) SQL
  47. SVM stands for __.
    a) Standalone Validate Machine.
    b) Standalone Vector Machine.
    c) Support Validate Machine.
    d) Support Vector Machine.
  48. __ is the process of dimensionality reduction by which a set of data is reduced to more
    manageable groups for processing.
    a) Regression b) Feature Extraction c) Aggregating d) Feature Elimination
  49. PCA is used for __.
    a) dimensionality reduction
    b) feature extraction
    c) data augmentation
    d) variance normalization
  50. State True or False.
    (i) KNN can be used in both classification and regression.
    (ii) KNN can be used in Reinforcement learning.
    a) True, False
    b) True, True
    c) False, False
    d) False, True

