Statistical Analysis with Software Application
This course reviews and expands upon core topics in probability and statistics through the study and practice of data analysis.
0.206 | The area of the standard normal curve to the right of z=0.82 is _______. | |
1 | A perfect positive correlation coefficient is equal to | |
1 | What is the value of the standard deviation in a standard normal distribution? | |
1.02 | Which is NOT a value of r ? | |
7 | How many data mining techniques are there? | |
7 | What is value of quartile 3 in 2,4,4,4,5,5,6,8,9 ? | |
7 | In 2,4,4,4,5,5,6,8,9 the range is | |
9 | If the standard deviation of a distribution is 3, the variance is | |
9.38 | In the equation of the regression line represented by Y= 1.24 X + 6.9 if X=2 then Y =? | |
10 | A score of 50 lies 2 standard deviations above a mean of 30.What is the value of the standard deviation? | |
12.25 | If the standard deviation of a distribution is 3.5, the variance is | |
18 | In α =babaa β =a^6b^5bb, what is the length of the concatenation of the two strings? | |
48 | On an examination given to 1000 students, Jef’s score of 80 was higher than the score of 480 students who took the exam. What is the percentile for Jef’s score? | |
50 | What is the value of the mean in a normal probability density function? | |
51 | If there are 101 scores the median is equal to the _____ranked score. | |
84 | A survey of 100 consumers said that the price charged for a kilo of rice could be approximated by a normal distribution with a mean of 35 and a standard deviation of 4.How many are less than 39? | |
95 | What percent of data will lie within 2 standard deviations of the mean? | |
95 | What is the value of the mean if a score of 110 is 3 standard deviations above the mean and the standard deviation is 5? | |
95 | A survey of 100 consumers said that the price charged for a kilo of rice could be approximated by a normal distribution with a mean of 35 and a standard deviation of 4.How many of them lie between 27 and 43? | |
95 | Empirical rule for a normal distribution that is 2 standard deviations above and below the mean is ________% of data. | |
99.7 | Empirical rule for a normal distribution that is 3 standard deviations above and below the mean covers ______% of the data. | |
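Several of the numeric answers above can be reproduced with a few lines of code. What follows is a minimal sketch using only Python's standard `statistics` module. The variable names are illustrative and not part of the course material, and the quartile calculation relies on the module's default "exclusive" method, which is an assumption about how the quiz computed quartile 3.

```python
from statistics import NormalDist, quantiles

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist()

# Area under the curve to the right of z = 0.82.
area_right = 1 - z.cdf(0.82)

# Empirical rule: share of data within 2 and 3 standard deviations.
within_2sd = z.cdf(2) - z.cdf(-2)
within_3sd = z.cdf(3) - z.cdf(-3)

# Rice-price survey: 100 consumers, mean 35, standard deviation 4.
rice = NormalDist(mu=35, sigma=4)
below_39 = 100 * rice.cdf(39)
between_27_43 = 100 * (rice.cdf(43) - rice.cdf(27))

# Quartiles and range of the sample data set.
scores = [2, 4, 4, 4, 5, 5, 6, 8, 9]
q1, q2, q3 = quantiles(scores, n=4)  # assumption: default "exclusive" method
data_range = max(scores) - min(scores)

# Regression line Y = 1.24 X + 6.9 evaluated at X = 2.
y_hat = 1.24 * 2 + 6.9

# Variance is the square of the standard deviation.
variance_a = 3 ** 2
variance_b = 3.5 ** 2

# A score of 50 lies 2 standard deviations above a mean of 30.
sd = (50 - 30) / 2

# Rank of the median among n ordered scores (n odd).
median_rank_101 = (101 + 1) // 2
median_rank_103 = (103 + 1) // 2

print("area right of z=0.82:", round(area_right, 3))
print("within 2 SD (%):", round(100 * within_2sd))
print("within 3 SD (%):", round(100 * within_3sd, 1))
print("consumers below 39:", round(below_39))
print("consumers between 27 and 43:", round(between_27_43))
print("Q3:", q3, "range:", data_range)
print("predicted Y:", round(y_hat, 2))
print("variances:", variance_a, variance_b)
print("standard deviation:", sd)
print("median ranks:", median_rank_101, median_rank_103)
```

Other quartile conventions can give a different quartile 3 for the same data, so the method should be checked against whatever the course software uses.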
Area Under the Curve | AUC means___________. | |
{ (2,4), (2,5), (3,4), (3,5) } | If A = {2,3} and B = {4,5}, which of the following is the Cartesian product of the two sets? | |
{3,5,6,10,12} | If R = { (3,3), (3,6), (5,5), (5,10), (6,12) } is a binary relation, its range is | |
{3,5,6} | If R = { (3,3), (3,6), (5,5), (5,10), (6,12) } is a binary relation, its domain is | |
{A,C,I,S,T} | If A = { x : x is a distinct letter in the word "MATHEMATICS" } and B = { x : x is a distinct letter in the word "STATISTICS" }, then their intersection is | |
2x3 | The product of a 2x5 matrix and a 5x3 matrix is a ______ matrix | |
48th | On an examination given to 1000 students, Jef’s score of 80 was higher than the score of 480 students who took the exam. What is the percentile for Jef’s score? | |
5 exabytes | How many bytes of data are generated every two days in today's world? | |
52nd | If there are 103 scores the median is equal to the _____ranked score. | |
5x 8 | What is the size of the product of a 5x 6 and a 6x 8 matrices? | |
A | Which of the matrices is singular? | |
A + B = B+ A | Which of the following is TRUE? | |
a and b | What conditions must be satisfied in the development of a probability function for a discrete random variable? | |
analysis of algorithms | It is a process of finding the computational complexity of algorithms. | |
Text mining | Another term for text analytics. | |
ANOVA | The following are artifacts used in data analysis EXCEPT: | |
as x increases y also increases and vice versa | Positive correlation means that_______________. | |
Big beta notation | The following are large inputs EXCEPT | |
billion billion | Exabyte means ________bytes | |
bivariate | Data involving two variables. | |
bivariate | Data involving two variables are called _________data. | |
Business Intelligence | It transforms data into actionable intelligence for business purposes. | |
business intelligence | It is used in organization’s strategic and tactical business decision making. | |
Business Intelligence | It offers a way to examine trends from collected data and derive insights from it. | |
casualty | The following are abstract notions EXCEPT | |
Chi-square | Which of the following is a continuous distribution? | |
classification | Which of the following data mining techniques is predictive? | |
cluster analysis | It includes identifying groups of data records. | |
Cluster analysis | _____________ includes identifying groups of data records. | |
collecting | The following processes are used in data analysis EXCEPT: | |
collecting data | Which of the following is NOT a goal in data mining? | |
Collection | The following are data mining techniques EXCEPT: | |
computational complexity theory | Analysis of algorithms is an important part of a broader _____________. | |
confusion matrix | The classification table that XLSTAT can display. | |
Correlation | It refers to the degree of relationship between two variables. | |
Data Science | It refers to well based theories and sound business judgement. | |
data analysis | It has the goal of discovering useful information to support decision making. | |
data analysis | The process of inspecting, cleansing, transforming, and modelling data with the goal of discovering useful information. | |
database | Which is NOT interaction data? | |
Data Mining | It is a method for discovering patterns in large data sets. | |
Data mining | The goal is to transform raw data into understandable business information. | |
Data mining | It is used to discover patterns in large data sets | |
data visualization | ___________ uses artifacts to present data visually. | |
data visualization | It makes complex data more understandable and usable. | |
data visualization | Refers to using tools of statistics to present data visually. | |
Datafication | The creation of data from varied sources and its quantification into information. | |
datalogy | Earlier name for data science. | |
disjoint | The two sets If A={ 2,3} B={4,5} are said to be | |
Donald Knuth | He coined the term “analysis of algorithms”. | |
Eric Schmidt | He pointed out that until 2003, all of mankind had generated just 5 exabytes of data. | |
Expected | The _______value is the weighted average of the value the random variable may assume. | |
Firth | He proposed the use of a penalized likelihood function. | |
frame | It views the world in thinking of prototypical objects. | |
Georg Cantor | He said that “In mathematics the art of proposing a question must be held of higher value than solving it”. | |
George E. P. Box | “All models are wrong but some are useful” | |
Google Flu Trends | It shows a high correlation between the incidence of flu and searches about flu on Google. | |
Google Maps | What is a great example of a data product? | |
graph | Which is NOT a basic representation technology? | |
Grouped frequency distribution | A distribution used to display large data sets. | |
Have the same size | Addition and subtraction of matrices is possible only if the two or more matrices ________. | |
hidden | The constant multiplicative factors by which the efficiencies of different implementations of an algorithm are related are called _______ constants. | |
Higher than the mean | A positive z-score means that the score is | |
Hypergeometric | Which of the following is a discrete distribution? | |
hypergeometric | Which of the following does NOT use continuous distribution? | |
i and ii | Which pair belongs to the same family of models called GLM? i) logistic ii) linear regression iii) multinomial regression iv) probability | |
imperfect | All representations are ________. | |
inference | Any way to get new expressions from old ones. | |
Intelligent Reasoning | It is a variety of formal calculation typically deduction. | |
interaction | The explosion of _______data is the main reason why every 2 days 5 exabytes of data are generated. | |
interaction | A new phenomenon for the explosion of _________data | |
Internet of Things | IoT means | |
INTERNIST | It sees a set of prototypes in particular prototypical diseases to be matched against the case at hand. | |
invertible | Matrix B is | |
it adheres to the function | Which is NOT a component of KR? | |
Java | What programming language is used in RapidMiner? | |
joint | The sets A = { x : x is a distinct letter in the word "MATHEMATICS" } and B = { x : x is a distinct letter in the word "STATISTICS" } are | |
KNIME | It is a powerful tool that shows the network of data. | |
KNIME | Primarily used for data pre-processing. | |
KNIME | It is popular among financial data analysts. | |
Knowledge Representation | It is used to enable an entity to determine consequences by thinking rather than acting. | |
Knowledge Representation | KR means __________________________. | |
likelihood | To estimate the parameters of the model, the ________ function is maximized. | |
logic | It involves a commitment in viewing the world in terms of individual entities and relations. | |
logistic | Which of the following belong to the GLM? | |
logistic regression | A frequently used method, as it enables binary variables, sums of binary variables, or polytomous variables to be modelled. | |
loop | Which of the following is NOT a module in RapidMiner? | |
Mean | The score easily affected by extreme values is the _________. | |
Median | The score NOT easily affected by extreme values. | |
Medium for pragmatically diligent interpretation | The following are distinct roles that KR plays EXCEPT | |
Medium of human expression | It is a language that we say things about the world. | |
Mode | The number that occurs most frequently is called________. | |
multimodal | A distribution with 4 modes is said to be a _________distribution. | |
multinomial logit model | It corresponds to the case where the dependent variable has more than 2 categories. | |
network topology | A network purporting to describe family memberships. | |
normal | A bell-shaped distribution that is symmetric about a vertical line. | |
Normal | The most widely used continuous probability distribution. | |
normal distribution | A bell shaped curve that is symmetric about a vertical line. | |
null | Another term for an empty set. | |
null set | The intersection of the two sets A={ 2,3} B={4,5} is a | |
One | The integral of all the values of a random variable in a probability density function is equal to______. | |
ontological | KR is a set of __________commitments. | |
Orange | It is a perfect software that is written in the Python programming language. | |
Orange | It is a perfect software for machine learning. | |
Pearson r | Which of the following is used as a method for Correlation? | |
Predictive Analytics World | PAW means____________. | |
Probability density | Which function provides the value of a function at any particular value of x but does NOT directly give the probability of the random variable? | |
PROBIT | The most common functions used to link probability to the explanatory variables are the LOGIT model and ________model. | |
profile likelihood | It does NOT require the assumption that the parameters are normally distributed. | |
profile likelihood | The method that does not require the assumption that parameters are normally distributed. | |
Python | What programming language does Orange use? | |
Python | Which of the following is NOT a data mining tool? | |
Q2=median | Which of the following statements is TRUE? | |
R-programming | It is a free software programming language. | |
R-programming | Which is primarily written in C and in Fortran? | |
RapidMiner | _____________ is rated as the number one business analytics software. | |
reasoning | It is a process that goes on internally, while most things it wishes to reason about exist only externally. | |
Regression | Which of the following pertains to predictive data mining technique? | |
regression | Which of the following is a predictive data mining technique? | |
Regression | The equation of the _______line predicts the value of Y given X. | |
ROC | It displays the performance of a model and enables a comparison to be made with other models. | |
roles | Which is NOT a KR technology? | |
rule based | It views the world in terms of attribute-object-value triples. | |
run time analysis | It is a theoretical classification that estimates and anticipates the increase in running time of an algorithm as its input size increases. | |
sensitivity | The proportion of well-classified positive events. | |
sensitivity | The proportion of well-classified positive events is called _________________. | |
sequence | A special type of function where the domain is a set of consecutive integers. | |
Sociology | The following provided inspirations of what constitutes intelligent reasoning EXCEPT | |
space complexity | It relates the length of an algorithm's input to the number of storage locations it uses. | |
speaking | These are the data skills that a good data scientist needs to cultivate EXCEPT | |
Spearman rho | The method of correlation used for ranked score is ________. | |
SPSS | The following are software tools used in data mining EXCEPT | |
square | A matrix that has the same number of rows and columns is called | |
Standard | The normal distribution with a mean of 0 and standard deviation of 1. | |
Statistics Analytics | Which of the following is NOT a method used in data analysis? | |
Studio | It is a module in RapidMiner that considers the workflow. | |
Studio | It is used for prototyping in RapidMiner. | |
surrogate | KR as a _________is a substitute for the thing itself. | |
Text mining | It expands available data enormously since there is so much more text being generated than numbers. | |
Text analytics | It extracts meaningful numerical indices from information and make it available to statistical and | |
Text Analytics | What is the process of deriving useful information from text? | |
Mean, median, and mode are equal | Which of the following is TRUE when a distribution is normal? | |
there is no mode. | If in a distribution all scores are distinct then_____________. | |
time complexity | It relates the length of an algorithm’s input to the number of steps it takes. | |
Turing machine | An example of an abstract computer. | |
unstructured | Which of the following type of text is processed in text analytics? | |
unstructured | What type of text is processed in text analytics? | |
veracity | The following are the 3V's of big data EXCEPT | |
WEKA | It is a collection of machine learning algorithms for data mining tasks. | |
William Gibson | The person who said that “The future is not google-able”. | |
worst case | The function describing the performance of an algorithm is usually an upper bound determined from ______inputs. | |
x increases y decreases | A negative correlation exists when___________. | |
Zynga Incorporated | The developer of FarmVille, a famous game on the internet. | |
λ | The symbol used to indicate strings with no elements. | |
λ | Null strings are indicated by |
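
The set, relation, string, and matrix answers in this list can also be checked mechanically. The following is a small sketch in standard-library Python. The helper `product_shape` and all variable names are hypothetical, introduced here only for illustration, and the empty Python string `""` stands in for the null string λ.

```python
from itertools import product

# Cartesian product and disjointness of A = {2,3} and B = {4,5}.
A = {2, 3}
B = {4, 5}
cartesian = set(product(A, B))
are_disjoint = A.isdisjoint(B)  # True when the intersection is the null set

# Domain and range of the binary relation R.
R = {(3, 3), (3, 6), (5, 5), (5, 10), (6, 12)}
domain = {x for x, _ in R}
range_ = {y for _, y in R}

# Distinct letters shared by "MATHEMATICS" and "STATISTICS".
common_letters = set("MATHEMATICS") & set("STATISTICS")

# String concatenation: alpha = babaa, beta = a^6 b^5 bb.
alpha = "babaa"
beta = "a" * 6 + "b" * 5 + "bb"
concat_length = len(alpha + beta)

# The null string (lambda) has no elements; concatenating it changes nothing.
null_string = ""
unchanged = (alpha + null_string == alpha)


def product_shape(left, right):
    """Return the shape of the matrix product left x right.

    Each argument is a (rows, columns) pair. The product is defined only
    when the column count of `left` equals the row count of `right`.
    """
    (m, n), (p, q) = left, right
    if n != p:
        raise ValueError("inner dimensions must match")
    return (m, q)


print(sorted(cartesian))
print("disjoint:", are_disjoint)
print("domain:", sorted(domain), "range:", sorted(range_))
print("common letters:", sorted(common_letters))
print("concatenation length:", concat_length)
print("shapes:", product_shape((2, 5), (5, 3)), product_shape((5, 6), (6, 8)))
```

Matrix addition and subtraction, by contrast, require both operands to have exactly the same shape, which is why `product_shape` only compares the inner dimensions.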