absolute measure - a measure of some quantity, e.g. weight.
absolute value - the positive value of any score.
additive component - the a in the linear transformation equation X' = a + b X. The constant that is added in a linear transformation.
alpha - the probability of rejecting the null hypothesis when in fact the null hypothesis is true. The probability of deciding that the effects are real when in fact the results were due to chance. Alpha is directly set by the researcher.
analysis of variance - a hypothesis testing procedure that tests for effects by comparing two or more means.
ANOVA - see analysis of variance.
area under a curve - the total area underneath a curve defined using a mathematical equation between two perpendicular lines corresponding to two scores on the x axis.
area under theoretical models of distributions - a method of estimating probabilities.
Bayesian Statistics - a branch of statistics whose foundation is the recomputation of probabilities based on data.
bimodal - a distribution with two different scores occurring the same number of times with the greatest frequency. A distribution with two modes.
bivariate data - data that contains two variables (x, y); also called paired data.
causal variable - changes in values of this variable are directly related to changes in the variable it causes.
causation - the establishment of a direct link between two variables, usually done using the experimental method.
central limit theorem - relates the sampling distribution of the mean to the theoretical model of the distribution of scores. The central limit theorem comes in a variety of flavors, but generally stated says that the sampling distribution of the mean will be a normal distribution with a theoretical mean equal to mu and a theoretical standard deviation, called the standard error, equal to sigma of the model of scores divided by the square root of the sample size. In theory the central limit theorem requires that the sample size approach infinity, but in practice the results converge with relatively small sample sizes (N>10).
Central Limit Theorem - states that the mean of the sampling distribution of the mean equals the mean of the population model and that the standard error of the mean equals the standard deviation of the population model divided by the square root of N as the sample size gets infinitely larger (N -> infinity).
Central Limit Theorem - relates the sampling distribution of the mean to the theoretical model of the distribution of scores. The central limit theorem comes in a variety of flavors, but generally stated says that the sampling distribution of the mean will be a normal distribution with a theoretical mean equal to mu and a theoretical standard deviation, called the standard error, equal to sigma of the model of scores divided by the square root of the sample size. In theory the central limit theorem requires that the sample size approach infinity, but in practice the results converge with relatively small sample sizes (N>30).
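The standard-error relationship in the central limit theorem entries can be checked with a small simulation. This is only a sketch: the uniform model of scores, the sample size, the number of replications, the seed, and all variable names are assumptions made for illustration.

```python
import math
import random
import statistics

random.seed(0)  # arbitrary seed so the sketch is repeatable

N = 30               # sample size (assumed for illustration)
REPLICATIONS = 5000  # number of samples drawn (assumed)

# Assumed model of scores: uniform on [0, 1], so mu = 0.5 and sigma = sqrt(1/12).
mu = 0.5
sigma = math.sqrt(1 / 12)

# Draw many samples of size N and record the mean of each one.
sample_means = [
    statistics.fmean(random.random() for _ in range(N))
    for _ in range(REPLICATIONS)
]

theoretical_se = sigma / math.sqrt(N)         # sigma divided by sqrt(N)
observed_se = statistics.stdev(sample_means)  # spread of the simulated means

print(f"mean of sample means: {statistics.fmean(sample_means):.4f} (mu = {mu})")
print(f"sd of sample means:   {observed_se:.4f} (sigma/sqrt(N) = {theoretical_se:.4f})")
```

The two printed pairs should agree closely, even though the uniform model of scores is not itself normal.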
central tendency - a typical or representative score; mean, median, and mode are measures of central tendency.
chi-square statistic - a measure of the difference between observed and expected values.
chi-squared distribution - a theoretical probability model, described by a single parameter, called degrees of freedom. In this model, scores are positively skewed and range from zero to infinity.
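The glossary does not give a formula for the chi-square statistic. The sketch below assumes the usual Pearson form, the sum of (observed - expected)^2 / expected over categories; the function name and the counts are made up for illustration.

```python
def chi_square(observed, expected):
    """Pearson chi-square: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [18, 22, 20]  # made-up observed counts
expected = [20, 20, 20]  # counts expected under the model being tested
statistic = chi_square(observed, expected)
print(statistic)
```

The statistic is zero when observed and expected values match exactly and grows as they diverge, which is consistent with a distribution that ranges from zero to infinity.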
compound event - a combination of simple events joined with either "and" or "or."
compound probabilities - the probability of a compound event.
computational formula for the standard error of estimate - a formula to compute the standard error of estimate that includes the variance of Y and the correlation coefficient. It is easier to compute than the definitional formula because it does not require a table of squared residuals to be computed.
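The equivalence of the computational and definitional formulas can be sketched numerically. The exact formulas are not reproduced in this glossary, so the forms below are assumptions: the standard error of estimate is taken to use N - 2 in the denominator and s_Y to use N - 1. The data are made up.

```python
import math
import statistics

x = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up predictor scores
y = [2.0, 4.0, 5.0, 4.0, 5.0]  # made-up predicted-variable scores
n = len(x)

mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)
sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

r = sp_xy / math.sqrt(ss_x * ss_y)  # correlation coefficient
s_y = math.sqrt(ss_y / (n - 1))     # standard deviation of Y

# Computational form (assumed): needs only s_Y, r, and N.
se_computational = s_y * math.sqrt((1 - r ** 2) * (n - 1) / (n - 2))

# Definitional form (assumed): needs the squared residuals around the
# least-squares line Y' = a + bX.
b = sp_xy / ss_x
a = mean_y - b * mean_x
ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
se_definitional = math.sqrt(ss_residual / (n - 2))

print(se_computational, se_definitional)
```

Both routes give the same value; the computational one skips the table of squared residuals.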
conditional distribution - a distribution of a variable given a particular value of another variable.
conditional probability - the probability of an event given that another event is true.
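As an illustration of compound and conditional probabilities, the sketch below enumerates the outcomes of two dice. The dice example, the event names, and the helper function are all assumptions chosen for illustration; the rule used is P(A | B) = P(A and B) / P(B).

```python
from fractions import Fraction

# Sample space: all 36 equally likely outcomes of rolling two dice.
outcomes = [(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7)]

def probability(event):
    """Probability of an event (a predicate on outcomes) when all are equally likely."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def sum_is_eight(o):   # simple event A
    return o[0] + o[1] == 8

def first_is_six(o):   # simple event B
    return o[0] == 6

# Compound event "A and B", then the conditional probability of A given B.
p_b = probability(first_is_six)
p_a_and_b = probability(lambda o: sum_is_eight(o) and first_is_six(o))
p_a_given_b = p_a_and_b / p_b

print(probability(sum_is_eight), p_a_given_b)
```

Knowing that the first die shows a six changes the probability that the sum is eight, which is the point of conditioning on another event.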
confidence interval - a pair of scores that describe a theoretical range of values of a score.
constant - a value that does not change with the different values for the counter variable (i).
control condition - in an experiment, a condition identical to the treatment condition except no treatment is given.
correlation - changes in one variable are related to changes in another variable, they "co-relate".
correlation coefficient - a measure of relationship between two variables. Conventionally this measure may take on value from minus one to one.
correlation coefficient - a measure of the degree of linear relationship between two variables.
correlation coefficients - numbers between minus one and one that measure the linear relationship between two variables.
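The glossary does not say which correlation coefficient is meant. The sketch below assumes the Pearson product-moment coefficient and uses made-up data to show the two extremes of the minus-one-to-one range; the function and variable names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: sum of cross-products over the root of the product of sums of squares."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least two scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return sp_xy / math.sqrt(ss_x * ss_y)

hours = [1, 2, 3, 4, 5]
score_up = [2, 4, 6, 8, 10]    # rises in a straight line with hours
score_down = [10, 8, 6, 4, 2]  # falls in a straight line with hours

print(pearson_r(hours, score_up), pearson_r(hours, score_down))
```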
correlation matrix - a table of all possible correlation coefficients between a set of variables.
crossed design - experimental design in which each subject sees each level of the treatment condition.
degrees of freedom - the number of scores that are free to vary.
df - see degrees of freedom.
effect - when a change in one thing is associated with a change in another; the changes may be either quantitative or qualitative.
estimator - a statistic used to estimate a model parameter.
exact significance level - the probability of the results of the study given that the null hypothesis model is true.
exact significance level - the probability of finding an effect equal to or larger than the effect found in the study given that the null hypothesis is true.
expected utility theory - a mathematical theory combining cost and probabilities.
experimental condition - in an experiment, the level of treatment in which some treatment is given.
experimental design - the manner in which an experiment is set up; specifically, the way the treatments are administered to subjects.
experiment-wise error rate - the probability of committing at least one type I error somewhere in the analysis.
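A rough sketch of how the experiment-wise error rate grows with the number of tests. It assumes every test uses the same alpha and that the tests are independent, which is not strictly true for pairwise t-tests on the same data, so the result should be read as an approximation. The function name is hypothetical.

```python
def experimentwise_error_rate(alpha, n_tests):
    """P(at least one Type I error) = 1 - (1 - alpha)^n_tests, assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests

# With 4 groups there are 4 * 3 / 2 = 6 possible pairs of means to compare.
rate = experimentwise_error_rate(alpha=0.05, n_tests=6)
print(f"{rate:.4f}")
```

Under these assumptions, six tests at alpha = .05 give roughly a one-in-four chance of at least one Type I error.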
F-distribution - a theoretical probability distribution characterized by two parameters, df 1 and df 2.
F-distribution - a theoretical probability distribution characterized by two parameters, df1 and df2, both of which affect the shape of the distribution; the distribution is nonsymmetrical, skewed in the positive direction.
form board test - one of the earliest psychological tests, in which the score of the person being tested is the time it takes to place a number of pegs in a board of cut-out forms.
fractions - algebraic phrases involving two numbers connected by the operator "/".
F-ratio - the Mean Squares Between divided by the Mean Squares Within; a measure of how different the means are relative to the variability within each sample.
hypothesis tests - procedures for making rational decisions about the reality of effects.
intercept - another name for the additive component in a linear transformation. When a line is drawn on a plane, the line will cross the y-axis at the intercept.
intercept - the a value that defines where the line crosses the Y-axis in a regression model.
interval estimate - see confidence interval.
invariant - does not change.
inverse relationship - a relationship between two variables where in general, as one variable becomes larger, the other becomes smaller.
IQ scale - test scores have a mean of 100 and a standard deviation of either 15 or 16, depending upon the test selected.
least-squares criterion - a value that minimizes the sum of squared differences between the scores and the predicted values.
linear transformation - a transformation of the form X' = a + bX.
linear transformations - transformations in which each score is multiplied by a constant and then a different constant is added to the resulting product.
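A minimal sketch of the linear transformation X' = a + bX, using made-up scores and hypothetical names. It also shows a standard consequence of the transformation: the mean becomes a + b times the old mean, and the standard deviation is multiplied by the absolute value of b.

```python
import statistics

def linear_transform(scores, a, b):
    """Apply X' = a + b * X to every score."""
    return [a + b * x for x in scores]

raw = [2.0, 4.0, 6.0, 8.0]  # made-up raw scores
transformed = linear_transform(raw, a=10.0, b=3.0)

print(transformed)
print(statistics.fmean(raw), statistics.fmean(transformed))
print(statistics.stdev(raw), statistics.stdev(transformed))
```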
mean - the sum of the scores divided by the number of scores.
mean - the sum of the scores divided by the number of scores; the most-often used measure of central tendency.
Mean Squares Between - the variance of the means times the number of scores within each group, an estimate of the theoretical variance of scores.
Mean Squares Within - the mean of the variances, an estimate of the theoretical variance of scores.
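The Mean Squares Between, Mean Squares Within, and F-ratio entries can be tied together in a short sketch. It assumes equal group sizes, as the Mean Squares Between definition implies, and uses made-up scores and hypothetical variable names.

```python
import statistics

# Three made-up groups with the same number of scores in each.
groups = [
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [1.0, 2.0, 3.0],
]
n_per_group = len(groups[0])

group_means = [statistics.fmean(g) for g in groups]
group_variances = [statistics.variance(g) for g in groups]

# Mean Squares Within: the mean of the group variances.
ms_within = statistics.fmean(group_variances)

# Mean Squares Between: the variance of the means times the group size.
ms_between = statistics.variance(group_means) * n_per_group

# F-ratio: Mean Squares Between divided by Mean Squares Within.
f_ratio = ms_between / ms_within
print(ms_between, ms_within, f_ratio)
```

A large F-ratio, as here, indicates that the means differ by much more than the variability within each group would suggest.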
median - the score value which cuts the distribution in half.
median - the score value that cuts the distribution in half, such that half the scores fall above the median and half fall below it; a measure of central tendency.
mode - the most frequently occurring score value.
mode - the most frequently occurring score value; on a frequency distribution it is the score value that corresponds to the highest point; a measure of central tendency.
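The three measures of central tendency can be computed with Python's standard `statistics` module. The scores below are made up for illustration.

```python
import statistics

scores = [70, 70, 72, 74, 76, 80, 87]  # made-up scores, already sorted

mean = statistics.mean(scores)      # sum of the scores divided by the number of scores
median = statistics.median(scores)  # score value that cuts the distribution in half
mode = statistics.mode(scores)      # most frequently occurring score value

print(mean, median, mode)
```

In this sample the mode is smaller than the median, which is smaller than the mean, the ordering the glossary associates with a positively skewed distribution.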
model - A model is a representation containing the essential structure of some object or event in the real world.
MSB - see Mean Squares Between.
MSW - see Mean Squares Within.
mu - one of two parameters of normal curves. Mu defines the center of the distribution.
multiple R - the correlation coefficient between the observed and predicted Y values.
multiple t-tests - hypothesis testing procedure when there are more than two groups that compares all possible pairs of means using a t-test.
multiplicative component - the b in the linear transformation equation X' = a + b X. The constant that is multiplied times the score in a linear transformation.
negative correlation coefficient - If one variable increases, the other variable decreases; and if one decreases, the other increases.
negatively skewed distribution - an asymmetrical distribution that points in the negative direction, with the mean being smaller than the median, which is smaller than the mode.
nested design - experimental design in which each subject receives one, and only one, treatment condition.
nested t-test - a hypothesis testing procedure for nested designs with two levels.
non-optimal regression model - a regression model that does not meet the least squares criterion.
null hypothesis - the hypothesis that there were no effects.
null hypothesis - there are no effects. Chance or random variation is responsible for any differences discovered.
one-tailed t-test - a directional t-test where alpha is placed in a single tail of the distribution under the null hypothesis.
optimal regression model - a regression model that meets the least squares criterion.
outlier - a score that falls outside the range of the rest of the scores on the scatter plot.
parameters - variables that change the shape of the probability model.
path analysis - a branch of correlational analysis that attempts to establish causation from correlational evidence.
percentile rank - the percentage of scores that fall below a given score.
percentile rank based on the normal curve - the percentage of scores that fall below a given score in a hypothetical distribution of scores based on the normal curve.
percentile rank based on the sample - the percentage of scores that fall below a given score within a sample of scores.
percentile ranks - the percentage of scores that fall below a given score.
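The two kinds of percentile rank can be sketched as follows. The sample version uses the strict "fall below" wording of the definition (some texts also count half of the scores tied with the given score, which is not done here). The normal-curve version uses `statistics.NormalDist` from the Python standard library. The function names, scores, and parameter values are assumptions for illustration.

```python
from statistics import NormalDist

def percentile_rank_sample(scores, score):
    """Percentage of scores in the sample that fall below the given score."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

def percentile_rank_normal(score, mu, sigma):
    """Percentage of area under a normal curve with parameters mu and sigma below the score."""
    return 100 * NormalDist(mu, sigma).cdf(score)

sample = [85, 90, 100, 105, 110, 115, 120, 130]  # made-up scores
pr_sample = percentile_rank_sample(sample, 110)
pr_normal = percentile_rank_normal(115, mu=100, sigma=15)

print(pr_sample, round(pr_normal, 2))
```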
point estimate - a single value that represents the best predicted value of Y.
population distribution - a theoretical probability model.
positive correlation coefficient - If one variable increases (or decreases), the other variable also increases (or decreases).
positively skewed distribution - an asymmetrical distribution that points in the positive direction, with the mode smaller than the median, which is smaller than the mean.
predicted variable - the variable being predicted, the dependent variable.
predictor variable - the variable used to predict, the independent variable.
probability - a theory of uncertainty.
probability models - a mathematical equation used to model a relative frequency distribution.
probability theory - a mathematical model of uncertainty; defines probabilities of simple events in algebraic terms and then presents rules for combining the probabilities of simple events into probabilities of complex events given that certain conditions are present (assumptions are met).
range - the largest score minus the smallest score.
range - a measure of variability; the largest score minus the smallest score.
raw score - the score that is given.
regression - a movement backwards toward the mean.
regression analysis - application of linear regression procedures, including parameter and error estimation techniques.
regression coefficients - the values of the regression weights.
regression line - the representation of the regression model on a scatter plot.
regression model - used to predict one variable from one or more other variables.
relational database - a number of flat tables linked together with index variables. Complex queries and tables can be constructed with relational databases.
relative measure - a measure of a variable relative to some other measure. The ratio of weight to height would be a relative measure of weight.
residuals - differences between the observed and predicted values.
sample distribution - the distribution resulting from the collection of actual data.
sample statistics - mathematical equations used to measure properties of samples. Sample statistics are used as estimators of parameters in the probability models.
sampling distribution - a theoretical distribution of a sample statistic.
scatter plot - a visual representation of the relationship between the X and Y variables.
sig. - the probability of the results of the study given that the null hypothesis model is true.
sigma - one of two parameters of normal curves. Sigma defines the spread or dispersion of the distribution.
significance level - see alpha.
simple linear regression - a prediction model of the form Y' = a + bX.
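A minimal sketch of fitting Y' = a + bX by the least-squares criterion. The deviation-score formulas for a and b are the standard ones but are not given in this glossary, so treat them as an assumption; the data, the comparison line, and the function names are made up.

```python
def fit_line(x, y):
    """Return (a, b) for Y' = a + bX chosen by the least-squares criterion."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    b = sp_xy / ss_x         # slope
    a = mean_y - b * mean_x  # intercept
    return a, b

def sum_squared_deviations(x, y, a, b):
    """Sum of squared differences between observed Y and predicted Y' = a + bX."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

x = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up predictor scores
y = [2.0, 4.0, 5.0, 4.0, 5.0]  # made-up predicted-variable scores

a, b = fit_line(x, y)
ss_optimal = sum_squared_deviations(x, y, a, b)
ss_other = sum_squared_deviations(x, y, 2.0, 0.7)  # an arbitrary non-optimal line

print(a, b, ss_optimal, ss_other)
```

Any other choice of a and b, such as the arbitrary line above, gives a larger sum of squared deviations than the fitted one.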
skewed distribution - a distribution that is asymmetrical, and in which the mean, median, and mode do not all fall at the same point.
slope - another name for the multiplicative component in a linear transformation. When a line is drawn on a plane, the steepness of the line will be determined by the slope.
slope - the value of b in the regression equation Y' = a + bX.
squared correlation coefficient - the proportion of variance in Y that can be accounted for by X.
standard deviation - a measure of variability; the positive square root of the variance.
standard error - the theoretical standard deviation of a sampling distribution.
standard error of estimate - a measure of error in prediction.
standard normal curve - a member of the family of normal curves with mu = 0.0 and sigma = 1.0.
standard score transformation - a linear transformation such that the transformed mean and standard deviation are 0 and 1 respectively.
standard scores - a linear transformation such that the transformed mean and standard deviation are 0 and 1 respectively; also called z-scores.
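A sketch of the standard score transformation and of a T score built on top of it. Using the sample standard deviation (N - 1) is an assumption; the glossary does not say which form is intended. The raw scores and function names are made up.

```python
import statistics

def standard_scores(scores):
    """z = (X - mean) / sd, giving transformed scores with mean 0 and sd 1."""
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (assumed)
    return [(x - mean) / sd for x in scores]

def t_scores(scores):
    """T = 50 + 10z, a further linear transformation to mean 50 and sd 10."""
    return [50 + 10 * z for z in standard_scores(scores)]

raw = [12.0, 15.0, 18.0, 21.0, 24.0]  # made-up raw scores
z = standard_scores(raw)
t = t_scores(raw)

print([round(v, 2) for v in z])
print([round(v, 1) for v in t])
```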
stanine transformation - scores are linearly transformed to a distribution with a mean of 5 and a standard deviation of 2 and the decimals are dropped, so that the numbers are integers between one and nine.
subjective probabilities - probabilities obtained by procedures designed to extract "degree of belief" from individuals.
subscripted variables - a method by which large numbers of variables can easily be represented; its form is Xi, where the X is the variable name and the subscript (i) is a counter variable that can take on values from 1 to N.
sum of squared deviations - the sum of the squared differences between the observed and predicted values of Y.
summation notation - a scheme that provides a means of representing both a large number of variables and the summation of an algebraic expression.
summation sign - used to represent summation in an expression.
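Summation notation maps directly onto a loop over the counter variable. In the sketch below the values are made up, and Python's indices run from 0 to N - 1 rather than from 1 to N as in the notation.

```python
X = [1, 2, 3]  # X1, X2, X3
Y = [4, 5, 6]  # Y1, Y2, Y3
N = len(X)

# Sum of X: add Xi for every value of the counter variable i.
sum_x = sum(X[i] for i in range(N))

# Sum of XY: multiply each pair first, then add the products.
sum_xy = sum(X[i] * Y[i] for i in range(N))

# (Sum of X)(Sum of Y): add each variable first, then multiply the two sums.
product_of_sums = sum(X) * sum(Y)

# Summing a constant c over N values gives N * c.
c = 10
sum_const = sum(c for _ in range(N))

print(sum_x, sum_xy, product_of_sums, sum_const)
```

The sum of products and the product of sums are different quantities, which is why the placement of parentheses around the summation sign matters.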
symmetrical distribution - a distribution in which the mean, median, and mode all fall at the same point. If drawn, cut out, and folded the two sides would be identical.
t distribution - a theoretical probability distribution.
t distribution - a theoretical distribution that is symmetrical, bell-shaped, has tails approaching the x-axis but never touching, and total area under the curve equal to one. The t distribution has three parameters, degrees of freedom, mu, and sigma. The fewer the degrees of freedom, the flatter the t distribution is relative to the normal distribution.
T score - score that has been transformed into a scale with a mean of 50 and a standard deviation of 10.
transformations - rules for rewriting sentences in the language of algebra without changing their meaning, or truth value.
transformations - a procedure that converts a number into another number.
transformed scores - raw scores that have been converted into another number. Generally transformed scores can be more easily interpreted than raw scores.
treatment - quantitatively or qualitatively different levels of experience.
treatment condition - any of the levels of treatment in an experiment.
t-test - a hypothesis test employing the t distribution.
two-tailed t-test - a t-test in which alpha is divided in half and placed in both tails of the distribution under the null hypothesis.
Type I error - the null hypothesis is rejected when in fact it is true. The hypothesis testing procedure decides that the effects are real when in fact the results were due to chance.
Type II error - the null hypothesis is retained when in fact the alternative hypothesis is true. The hypothesis testing procedure decides that the no effects model could explain the results when in fact the effects were real.
utility - the gain or loss experienced by a player depending upon the outcome of the game.
variability - the spread or dispersion of scores; three measures of variability are the range, the variance, and the standard deviation.
variance - a measure of variability.
variance - a measure of score dispersion.
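The three measures of variability named in this glossary can be computed from a list of scores as sketched below. Dividing by N - 1 (the sample variance) is an assumption, since the entries do not state the formula; the scores are made up.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores
n = len(scores)
mean = sum(scores) / n

# Range: the largest score minus the smallest score.
score_range = max(scores) - min(scores)

# Variance: sum of squared deviations from the mean, divided by N - 1 (assumed).
sum_sq = sum((x - mean) ** 2 for x in scores)
variance = sum_sq / (n - 1)

# Standard deviation: the positive square root of the variance.
std_dev = math.sqrt(variance)

print(score_range, round(variance, 3), round(std_dev, 3))
```

Because the deviations are squared, the variance is expressed in squared units of the original scores, while the standard deviation returns to the original units.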
vectors - lines from the origin to a point on a graph, sometimes represented as points on a graph.
1.1 The function of statistics that organizes and summarizes sets of data is known as the __________ function
1.2 A human limitation that statistics help overcome is
1.3 Two ways of describing data in statistics are
1.4 Summary numbers used to describe a set of numbers are called
1.5 If each individual in the population is equally likely to be included in the sample, the sample is called a
1.6 The function of statistics that uses a sample to describe a hypothetical population model is called the ______ function.
1.7 When using the inferential function of statistics, it is
1.8 Two advantages of sampling include
1.10 Inferential statistics involves a trade-off between
2.1 Statistical models are an example of
2.2 A model is a(n)________ containing the essential structure of some object or event in the real world.
2.3 A model is a representation containing the ______ structure of some object or event in the real world.
2.4 Which of the following statements best illustrates the scientific approach to knowledge
2.5 A major characteristic of models includes
2.6 Which of the following is the greatest advantage of symbolic models over physical models?
2.7 A parameter is a variable in a
2.8 When the parameters in a mathematical model are each assigned a number
2.9 The syntax of a formal language
2.10 The syntax of a natural language
2.11 Using transformations with a formal language
2.12 Much of what is learned in a course in Algebra can be characterized as
2.13 This stage in the model-building process identifies the relevant features of the real world.
2.14 In this stage in the model-building process sentences in the language are transformed into other statements in the language.
2.15 Using the model-building process,
2.16 Simple models that have a great deal of explanatory power are called
2.17 A major limitation of mathematical models includes the inability to
2.18 When using a physical model to design a boat hull
2.19 When using a mathematical model to design a boat hull
3.1 The set of rules that determines which strings belong to the language and which do not, is called the _____ of the language.
3.2 Pi and e are mathematical symbols that stand for
3.3 Symbols that can stand for any number are called
3.4 Mathematical verbs are called
3.5 Delimiters in algebra correspond to ____ in a natural language.
3.6 Rules of precedence are employed in constructing algebraic sentences
3.7 The algebraic sentence ((X + Y) - 3) + Z may be rewritten as
3.8 When adding and subtracting fractions _____ is usually easier.
3.9 Exponential notation is an example of algebraic
3.10 When faced with a negative exponent,
3.12 Simplifying an algebraic expression
3.13 Evaluating an algebraic sentence from the innermost parenthesis out means
4.2 All systems of measurement
4.3 To the extent that relationships that exist between the attributes of objects in the real world are preserved in the numbers that are assigned to these objects,
4.4 The property of measurement systems that results when objects containing more of an attribute are given a bigger number is called
4.5 The interval property of a measurement system is critical to ____ numbers with meaning.
4.6 The property of rational zero is necessary
4.7 Numbers on the backs of football players would constitute
4.8 Classifying candidates as either Democratic, Republican, or Independent is an example of measurement of a(n) _____ scale.
4.9 In general, means and standard deviations cannot be unambiguously interpreted with a nominal-categorical scale except when
4.10 Which of the following measurement scales provides the least amount of information?
4.11 Which of the following measurement scales provides the most amount of information?
4.12 The interval property
4.13 The distance around your forehead measured with a tape measure as a measure of your intelligence would be
4.14 When using a ruler to measure distance, the critical question is
4.15 With respect to measurement systems, the critical question is
5.1 The sum of the frequency column in a frequency table
5.2 The real limits of an 8.5 shoe size are
5.3 The width of each bar in an absolute frequency histogram corresponds to
5.4 The height of each bar in an absolute frequency histogram corresponds to
5.5 The difference between a bar graph and a histogram in SPSS is that in a bar graph
5.6 In an absolute frequency polygon, when the frequency of a given score is zero
5.7 The relative frequency of a given score value is
5.8 Generally, _____ frequency polygons are more useful than ____ frequency polygons.
5.9 The absolute cumulative frequency of a given score value is
5.10 The dot should be placed above the upper real limit of the interval in a(n) ___ frequency polygon.
5.11 The sum of the column of relative cumulative frequency will always
5.12 A cumulative frequency polygon presents information about
5.13 A cumulative frequency polygon will always be
5.14 The highest point on a relative cumulative frequency polygon will always be
5.15 Given the following relative cumulative frequency polygon of golf scores for 18 holes of golf, which of the following statements is most likely true.
5.16 Given the following relative cumulative frequency polygon of golf scores for 18 holes of golf, which of the following statements best describes whether the graph is correctly drawn.
5.17 Given the following relative cumulative frequency polygon of golf scores for 18 holes of golf, which of the following statements best describes whether the graph is correctly drawn.
5.18 Given the following relative cumulative frequency polygon of golf scores for 18 holes of golf, which of the following statements best describes whether the graph is correctly drawn.
6.1 When drawing overlapping relative frequency distributions, the sum of each column of relative frequencies must equal
6.2 The following graph presents two overlapping relative frequency polygons illustrating the relationship between life span and whether or not an individual takes at least a one-week vacation every year. Which of the following is most true, based on the graph.
6.3 The following graph presents two overlapping relative frequency polygons illustrating the relationship between life span and whether or not an individual takes at least a one-week vacation every year. Which of the following is most true, based on the graph.
6.4 The following graph shows two overlapping relative cumulative frequency polygons, showing the relationship between two golfers, John and David, and their golf scores. In golf, lower scores are better than higher scores. Which of the following appears to be most true, based on the graphs.
6.5 The following graph shows two overlapping relative cumulative frequency polygons, showing the relationship between two golfers, John and David, and their golf scores. In golf, lower scores are better than higher scores. Which of the following appears to be most true, based on the graphs.
6.6 The following graph shows two overlapping relative cumulative frequency polygons, showing the relationship between two golfers, John and David, and their golf scores. Which of the following appears to be most true, based on the graphs.
6.7 The following graph shows two overlapping relative cumulative frequency polygons, showing the relationship between two golfers, John and David, and their golf scores. In golf, lower scores are better than higher scores. Which of the following appears to be most true, based on the graphs.
6.8 When a variable that has over seven levels is used in a contingency table
6.9 In a contingency table the sum of the row marginal frequencies is ___ the sum of the column marginal frequencies.
6.10 Proportions in the cells of a contingency table should be calculated relative to
6.11 In the following contingency table the total N was
6.12 In the following contingency table, if a man showed a sexual preference toward males, he
6.13 In the following contingency table most men had a sexual preference for
6.14 In the following contingency table, the proportion of men who preferred females as sexual partners and who were also HIV positive is
7.1 When a frequency polygon is called saw-toothed, it means
7.2 In selecting an interval size for a grouped frequency polygon, there is a tradeoff between
7.3 The width of each bar in a histogram corresponds to the
7.4 If the high score was 94, the low score was 28, and the number of desired intervals was 10, the interval size should be
7.5 With a selected interval size of 7 and a low score of 24, the first interval should be:
7.6 With a selected interval size of 9 and a low score of 24, the first interval should be:
7.7 An odd interval size is usually selected in order to ensure
7.8 The apparent lower limit is conventionally defined as
7.9 The real limits of an interval are
7.10 The difference between the apparent upper limit and the apparent lower limit will be _____ the difference between the upper real limit and the lower real limit.
7.11 A larger interval size will
7.12 Increasing the interval size increases the amount of information in the grouped frequency polygon.
7.13 Of the following, which is a negative aspect of grouping data into class intervals?
7.14 When selecting an interval size for a grouped frequency polygon the recommendation of the author of this text is
8.1 Probability models are used in statistics
8.2 Suppose a researcher collected information about the shoe sizes of everyone in a class of thirty students and drew a relative frequency polygon. Suppose the researcher repeated the data collection in a different class of thirty students, with the same distribution of males and females. The second relative frequency polygon
8.4 A uniform distribution might be a reasonable model of
8.5 A negative exponential distribution might be a reasonable model of
8.6 A normal distribution might be a reasonable model of
8.7 The area under a probability model between two points is called
8.8 Properties of probability models include
9.1 Parameters in the Normal Curve model are symbolized with
9.2 The normal curve is best viewed as
9.3 All members of the family of normal curves
9.4 The "X" in the algebraic expression for the normal curve
9.5 The number of members in the family of normal curves is
9.6 All members of the family of normal curves will appear similar
9.7 All members of the family of normal curves
9.9 The branch of mathematics concerned with finding area under curves is
9.10 Areas under curves between points
9.11 When drawing a normal curve, the tick on the X-axis corresponding to middle of the distribution is labeled with
9.12 The value of sigma on the above normal curve would be
9.13 The value of mu on the above normal curve would be
9.14 The portion of the normal curve that is usually drawn
9.15 On the Wechsler Intelligence Scale for Children (mu = 100 and sigma = 15), what percent of the children will score between 70 and 100?
9.16 When drawing a normal curve, the shape of the curve changes as a function of
9.17 The standard normal curve
9.18 In a standard normal distribution, mu is set to
10.1 A subscripted variable
10.3 The "i=1" in the bottom of the summation notation
10.4 The notation (ΣX)(ΣY) indicates that we have
10.5 When the summation sign is used without additional notation
10.6 In an algebraic expression with a summation sign, if the parentheses are located after the summation sign
10.7 When the expression being summed contains a "+" or "-" at the highest level
10.8 A constant with respect to the summation sign
10.9 Which of the following statements is false with respect to algebraic expressions that contain the summation sign
11.1 A statistic is an algebraic expression combining scores
11.2 "They must be important because they named the whole course after them." refers to:
11.3 What function(s) do statistics serve
11.4 If a teacher was asked to give a single number that best represented a set of student scores, the teacher should select
11.6 Which of the following measures of central tendency is sensitive to extreme scores
11.7 The mode is a quick and dirty measure of central tendency because
11.8 In calculating the median, if there are an even number of scores the median
11.10 The measure of central tendency that splits the frequency distribution in half is the
11.12 When a distribution is characterized by an extreme score
11.13 For the mean to represent a meaningful measure of central tendency, it is necessary to assume that the data conform to at least a(n)
11.14 The extent to which a distribution appears asymmetrical reflects its relative
11.15 In a distribution of scores for which Mean = 75.6, Median = 74, Mode = 70, it was found that a mistake had been made on one score. Instead of 80, the score should have been 100. Consequently, which one of the above measures of central tendency would certainly be incorrect?
11.16 In a distribution of scores for which Mean = 75.6, Median = 74, Mode = 70, it was found that a mistake had been made on one score. Instead of 70, the score should have been 90. Consequently, which one of the above measures of central tendency would possibly be incorrect?
11.17 If most students in your class had read this chapter so carefully that they knew the answers to almost all questions on the test, the scores would probably be
11.18 In attempting to calculate the monthly earnings from her software distribution business, Ms. Dotcom found the monthly mean to be $3000 and the median to be $3500. The distribution she is dealing with is probably
11.19 A measure of variability describes
11.20 Following are four sets of measures. Which shows the least variability?
11.21 If height measured in inches was the unit of measurement, the variance would be measured in
11.22 The square root of the variance is equal to the
11.23 In most instances the _____ is the most preferred measure of variability.
11.24 Ms. DeAngelo's chemistry class had a standard deviation of 2.4 on a standardized test, while Ms. Roper's chemistry class had a standard deviation of 1.2 on the same test (see above). What can be said about these two classes?
11.25 The range is used as a measure of
11.26 Examples of statistics are:
11.27 When using a calculator to find statistics of a sample of data
11.28 Given the following breakdown table which of the following groups had the smallest mean?
11.29 Given the following breakdown table which of the following groups had the largest mean?
11.30 Given the following breakdown table which of the following groups had the largest standard deviation?
11.31 Given the following breakdown table which group contained the student who scored the lowest?
11.32 Given the following breakdown table which of the following groups would you be most confident in estimating the score on the geography test?
11.33 Given the following breakdown table which of the following groups had the smallest N?
11.34 Given the following breakdown table how many total students answered the survey?
11.35 Given the following breakdown table in all cases, if a student visited more states
12.1 To ensure that a raw score of 57 means the same thing on two different tests, a ______ would be employed.
12.2 One purpose of score transformations is to
12.3 The two major categories of score transformations are
12.4 The computational procedure for finding the percentile rank based on the sample might be best described as
12.5 In the set of data {11 12 12 14 15 15 15 18 19 19 19 21} the frequency within of a score of 19 would be:
12.6 In the set of data {11 12 12 14 15 15 15 18 19 19 19 21} the frequency below of a score of 19 would be:
12.7 In computing the percentile rank based on the sample, the highest score in the sample
12.8 The percentile rank based on the sample of the highest score
12.9 The computational procedure for finding the percentile rank based on the normal curve might be best described as
12.10 The computational procedure for finding the percentile rank based on the normal curve
12.11 The percentile rank based on the normal curve
12.12 The percentile rank based on a normal curve will accurately describe the position of a score within a hypothetical population given
12.13 The percentile rank based on the sample compared to the percentile rank based on the normal curve
12.14 How do percentile ranks have a different meaning depending upon whether they occur in the middle or in the tails of a normal distribution?
12.15 A percentile rank transformation given that the underlying distribution is a normal curve will
12.16 Some pretest and posttest scores are given below for four students. Assuming that all scores were normally distributed, which student made the greatest improvement?
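The "frequency within" and "frequency below" counts in items 12.5 and 12.6 can be tallied directly from the data set given there. The sketch below also computes a percentile rank based on the sample; the formula used (scores below plus half of the tied scores, divided by N) is an assumed convention and may differ from the one the text defines. All function names are hypothetical.

```python
# Data set from items 12.5 and 12.6.
scores = [11, 12, 12, 14, 15, 15, 15, 18, 19, 19, 19, 21]

def frequency_within(data, score):
    """Number of observations equal to the score."""
    return sum(1 for s in data if s == score)

def frequency_below(data, score):
    """Number of observations strictly less than the score."""
    return sum(1 for s in data if s < score)

def percentile_rank(data, score):
    # Assumed convention: all scores below plus half of the ties, over N.
    below = frequency_below(data, score)
    within = frequency_within(data, score)
    return 100 * (below + 0.5 * within) / len(data)

f_within = frequency_within(scores, 19)
f_below = frequency_below(scores, 19)
pr = percentile_rank(scores, 19)
print(f_within, f_below, round(pr, 1))
```

Under this assumed convention the highest score in a sample never reaches a percentile rank of 100, because half of its own frequency is excluded.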
13.1 If a set of raw scores is positively skewed, the set of z scores derived from them will be
13.2 In a linear transformation of the form X' = 45.7 - 3*X, the value of the slope is
13.3 In a linear transformation of the form X' = 45.7 - 3*X, the value of the additive component is
13.4 In a linear transformation of the form X' = 45.7 - 3*X, when the raw score is 10, the transformed score is
13.5 The additive component of a linear transformation
13.6 Changes in the additive component in a linear transformation will be reflected in changes to the ______ relative to the original distribution.
13.7 Changes in the multiplicative component in a linear transformation will be reflected in changes to the ______ relative to the original distribution.
13.8 In a multiplicative transformation, the value of the intercept is
13.9 In a multiplicative transformation that transforms inches to feet, the value of the multiplicative component would be
13.10 A transformation that transforms inches to feet would be a(n)
13.11 Changes in the multiplicative component in a linear transformation will be reflected in changes to the ______ relative to the original distribution.
13.12 In a linear transformation of the form X' = 45.7 - 3*X, if the mean of the raw scores was 20, then the mean of the transformed scores would be
13.13 The formula for finding the multiplicative component given that the means and standard deviations of both the original and transformed distributions are known can best be described as
13.14 A given distribution with a given mean and standard deviation may be transformed using a linear transformation into another distribution
13.15 A "capital T" score transformation is a linear transformation with a transformed mean of
13.16 What transformation has the advantage of always being between 1 and 100
13.17 A standard score transformation is a linear transformation with a transformed mean of
13.18 Standard score or z-scores are a special case of a(n)
13.19 A standard score or z-score may be interpreted as
13.20 The computational formula for finding a z-score may be described by
13.21 One of the values of converting raw scores to standard scores is that standard scores make it possible to
13.22 Linear transformations are preferred by statisticians because
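The linear transformation X' = a + b X used throughout these items can be sketched as follows. The raw scores are hypothetical (chosen to have a mean of 20, as in item 13.12); the sketch shows that the additive component shifts the mean while the multiplicative component rescales the standard deviation by its absolute value.

```python
from statistics import mean, pstdev

def linear_transform(x, a, b):
    """X' = a + b*X, with additive component a and multiplicative component b."""
    return a + b * x

a, b = 45.7, -3.0

# Item 13.4: a single raw score of 10.
transformed_score = linear_transform(10, a, b)

# Hypothetical raw scores with a mean of 20.
raw = [10, 20, 30]
transformed = [linear_transform(x, a, b) for x in raw]

# Mean of X' is a + b*mean(X); standard deviation of X' is |b|*sd(X).
print(round(transformed_score, 1))
print(round(mean(transformed), 1), round(a + b * mean(raw), 1))
print(round(pstdev(transformed), 3), round(abs(b) * pstdev(raw), 3))
```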
14.1 The purpose of regression models in statistics is
14.2 Before a regression model can be constructed
14.3 The goal in the regression procedure is to create a model where the predicted and observed values of the variable to be predicted are
14.6 The following equation is generally not used as a criterion for the fit of a model to data because
14.7 In predicting Y from X, the method of least squares locates the line in a position such that the sum of squares of the errors of estimate is
14.8 In order to construct an accurate regression model it is
14.9 The following algebraic expression can best be described in words as
14.10 In creating a regression model to predict Y from X
14.11 The goal of regression is to select the parameters of the model
14.12 When does the statistician stop the random search for optimal estimates for regression model parameters?
14.13 The result of minimizing the sum of squared residuals with respect to the regression parameters is
14.14 Many statistical calculators
14.15 Given the following regression model, what would be the predicted value of Y given X was equal to 12.5?
14.16 A scatter plot _____.
14.17 Which of the following scatter plots would have the best-fitting regression line?
14.18 Which of the following scatter plots would have a regression line with a slope near zero?
14.19 How many of the following regression lines would definitely have a negative slope?
14.20 A regression line appears on a scatter plot as
14.21 In predicting Y from X, the regression line is laid down so that the squared discrepancies between points and the line are minimized
14.22 The standard error of estimate is a measure of
14.23 The definitional and computational formulas for the standard error of estimate
14.24 The definitional formula for the standard error of estimate can be described in words as
14.25 The degrees of freedom for the definitional formula for the standard error of estimate is N-2 because
14.26 The computational formula for the standard error of estimate is easier because
14.27 The standard error of estimate equals zero (Sy.x=0) when:
14.28 As the degree of relationship between two variables increases, the standard error of estimate
14.29 A conditional distribution is
14.30 In a conditional distribution, the value of mu is often estimated by
14.31 The standard error of estimate is often used as an estimate of
14.32 To calculate a 95 percent confidence interval in regression analysis, use the normal curve area program and set the value of sigma to
14.33 When we use the standard error of estimate and establish a confidence interval we assume that
14.34 If the value of the standard error of estimate underestimated the true value of sigma for a given conditional distribution
14.35 Ninety-five percent, rather than some other percentage, confidence intervals are usually found because of
14.36 The value of the slope of the regression line in the following table is
14.37 The value of the correlation coefficient in the following table is
14.38 The value of the standard error of estimate in the following table is
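The least-squares criterion and the standard error of estimate with N-2 degrees of freedom, which the items in this group ask about, can be sketched on a small data set. The X and Y values below are hypothetical and are not the table the items refer to; the helper names are also assumptions.

```python
from math import sqrt

# Hypothetical bivariate data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept minimize the sum of squared residuals.
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x

# Residual = observed Y minus predicted Y.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
ss_residual = sum(e ** 2 for e in residuals)

# Standard error of estimate: two parameters were estimated, so df = N - 2.
std_error_estimate = sqrt(ss_residual / (n - 2))

print(round(slope, 2), round(intercept, 2), round(std_error_estimate, 4))
```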
15.1 Measures of correlation are conventionally defined to take values ranging from
15.2 The sign (plus or minus) of a correlation coefficient indicates
15.3 In a negative relationship _____.
15.4 The Pearson r measures the
15.5 The correlation between college entrance exam grades and scholastic achievement was found to be -1.08. On the basis of this you would tell the university that
15.6 If most of the low scores on one test scored high on another test, then the two tests would be
15.7 Which of the following correlation coefficients show the greatest degree of relationship?
15.8 The closer the points on a scatter diagram fall to the regression line, the _____ between the scores.
15.9 If a strict flat tax was imposed and everyone paid about the same percent of their income in taxes, the value of r between income and taxes would be nearest to
15.10 If the standing in X is of no help in predicting standing in Y, then r is
15.11 In correlational analysis, when the points scatter widely about the regression line, this means that the correlation is:
15.12 If the correlation between two sets of scores is 0 and one had to predict the value of Y for any given value of X, the best prediction of Y would be ____.
15.13 Given that the relationship between GPA and IQ is stronger in high school than in college, the dot cluster for high school (compared to that for college) should
15.14 Which of the following scatter plots illustrates a fairly high positive correlation coefficient?
15.15 Which of the following scatter plots illustrates a moderate negative correlation coefficient?
15.16 Which of the following scatter plots illustrates a nearly zero correlation coefficient?
15.17 Which of the following scatter plots illustrates a correlation coefficient near -1.0?
15.18 How many correlation coefficients illustrated in the graphs below would be clearly negative?
15.19 The correlation coefficient is the
15.20 If ten points are added to each score on a test, the correlation coefficient of any variable with the transformed test score
15.21 In a given group, the correlation between height measured in feet and weight measured in pounds is .758. Which of the following would alter the value of r?
15.22 In the variance interpretation of the correlation coefficient, the variance of the predicted variable is partitioned into
15.23 Suppose that college grade-point average and the verbal portion of an IQ test had a correlation of .40. What proportion of the variance does verbal IQ predict in college GPA?
15.24 The greater the amount of predictable variability in Y, the smaller the
15.25 Which of the following is NOT a correct definition of Pearson r?
15.26 In general, the standard error of estimate will increase
15.27 The correlation coefficient can be computed
15.28 How many different correlation coefficients would need to be computed if the correlation matrix had 9 variables?
15.29 If the correlation between a form-board test and success on the job was .324, the correlation between success on the job and the form-board test would be
15.30 From the correlation matrix presented below, in general males indicated higher scores on the health inventory in college than females.
15.31 From the correlation matrix presented below, in general smokers indicated higher scores on the health inventory in college than females.
15.32 From the correlation matrix presented below, in general higher income is related to greater life satisfaction seven years after college.
15.33 From the correlation matrix presented below, in general all smokers indicated lower scores on the health inventory in college than females.
15.34 From the correlation matrix presented below, in general the relationship between life satisfaction and being married was higher in college than seven years later.
15.35 From the correlation matrix presented below, in general smokers indicated higher life satisfaction both in college and seven years later.
15.36 From the correlation matrix presented below, in general females indicated higher incomes in college.
15.37 For which one of the following relationships could a value of r not be meaningfully interpreted?
15.38 Including a religion variable coded as (1=Protestant, 2=Catholic, 3=Jewish, 4=Other) in a correlation matrix
15.39 Including a sex variable coded as (1=Male, 2=Female) in a correlation matrix
15.40 Changing the coding of the sex variable in a correlation matrix from (1=Female, 2=Male) to (0=Male, 1=Female) would
15.41 The correlation coefficient ____ when an outlier is present in the data.
15.42 The correlation between scores on a neuroticism test and scores on an anxiety test is high and positive; therefore
15.43 In a study of auto drivers, it was found that a lower frequency of accidents was associated with more years of experience and with greater age of the driver, and that amount of experience and age were positively correlated. In explaining this finding, it might be that
15.44 A high positive correlation between the ages of teachers and the average grades of their students means that
15.45 Correlation and regression differ in that
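Two pieces of arithmetic from this group, the variance interpretation of r (item 15.23) and the number of distinct coefficients in a correlation matrix (item 15.28), can be sketched as below. The function names are hypothetical; the count assumes the matrix is symmetric and the diagonal is ignored.

```python
from math import comb

def proportion_of_variance(r):
    """Proportion of variance in one variable predictable from the other: r squared."""
    return r ** 2

def distinct_correlations(k):
    """Distinct off-diagonal coefficients among k variables: k(k-1)/2."""
    return comb(k, 2)

variance_explained = proportion_of_variance(0.40)
pairs_for_nine = distinct_correlations(9)
print(round(variance_explained, 2), pairs_for_nine)
```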
16.1 In the opinion of the author of the text
16.2 When doing an hypothesis test
16.3 The purpose of hypothesis testing is
16.4 If a researcher was comparing two groups with respect to test performance, chance factors would not include
16.5 The purpose of hypothesis testing is
16.6 The purpose of hypothesis testing is to:
16.7 In real life, most decisions are made
16.8 Which of the following is an example of an effect?
16.9 Which of the following is not a statistic that could measure the size of an effect?
16.10 In classical hypothesis testing, the researcher almost always
16.11 In hypothesis testing the sampling distribution
16.12 In classical hypothesis testing, if the results of a single experiment are likely given the model under the null hypothesis, then the researcher must decide
16.13 In classical hypothesis testing, if the results of a single experiment are unlikely given the model under the null hypothesis, then the researcher must decide
16.14 In classical hypothesis testing, the null hypothesis
16.15 The hypothesis testing procedure requires that the experimenters:
16.16 The Sampling Distribution is used as a model of what would happen if
16.18 The true probability of heads for any given coin
16.19 The probability of a certain event is
16.20 A conditional event is characterized by the word
16.21 Two events are said to be independent if
16.22 If two events are independent, the probability of the joint event can be found by ____ the separate probabilities.
16.24 The probability that results from an application of Bayes' Rule is called the ______ probability.
16.26 In most cases the probability of heads when flipping a coin is set to .50 because
16.27 Probability estimates based on the relative frequency of an event will be more stable
16.28 Classical hypothesis testing generally uses the _____ method of estimating probabilities.
16.29 People often have a difficult time correctly estimating probabilities, especially
16.30 A male who is very shy and withdrawn, invariably helpful, but with little interest in people, or in the world of reality. A meek and tidy soul, he has a need for order and structure, and a passion for detail. This person most likely works as a
16.31 Probability models are most useful if the person using them has the opportunity to participate
16.32 Probability and utility theory
17.1 The sample distribution
17.2 The sample distribution
17.3 Parameters in a probability model can be estimated by
17.4 Statistics calculated on a sample of observations are:
17.5 A sampling distribution
17.6 A theoretical distribution of sample medians would be called
17.7 When the expected value of a statistic equals a population parameter, the statistic is called
17.8 The standard error of the mean is ____ the standard error of the median.
17.9 In general, the smaller the standard error of a statistic
17.10 As sample size (N) increases the sample standard deviation (sx)
17.11 Which of the following does not belong:
17.12 The mean of a smaller size sample (N) could be closer to the true parameter (mu) of a population model than that of the mean of a larger sample:
17.13 The standard error of the mean changes as a result of changes in the:
17.14 The Central Limit Theorem states that the sampling distribution of the mean approaches a normal distribution
17.15 Which of the following statements is most correct with regard to sample size and the sampling distribution of the mean?
17.16 The standard error of the mean is the
17.17 The Central Limit Theorem
17.18 The Central Limit Theorem is a theoretical justification for
18.1 When testing an hypothesis about a single mean, the statistician compares the observed sample mean with
18.2 Before continuing with the Head Start program,
18.3 If a sample mean of 103 was obtained with a sample size of 25, mu=100, sigma=15, alpha=.05, and a two-tailed test, the decision about the hypothesis test would be to
18.4 If a sample mean of 103 was obtained with a sample size of 100, mu=100, sigma=15, alpha=.05, and a two-tailed test, the decision about the hypothesis test would be to
18.5 In the Head Start study described in the text, the value of alpha is
18.6 The probability of the results of the study given the null hypothesis model is true is called the
18.7 The default value for alpha is generally set to
18.8 The researcher who completes a study with a larger sample size
18.9 The question of practical significance
18.10 Practical significance is a decision about
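Items 18.3 and 18.4 differ only in sample size, which changes the standard error of the mean (sigma divided by the square root of N). The sketch below works both cases, assuming the sampling distribution of the mean is modeled as normal with a known sigma; the function name and the "reject"/"retain" labels are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_single_mean(sample_mean, mu, sigma, n, alpha=0.05):
    """Two-tailed test of a single mean with known sigma."""
    standard_error = sigma / sqrt(n)           # standard error of the mean
    z = (sample_mean - mu) / standard_error
    # Two-tailed exact significance level under the null model.
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    decision = "reject" if p_two_tailed < alpha else "retain"
    return z, p_two_tailed, decision

result_n25 = z_test_single_mean(103, mu=100, sigma=15, n=25)
result_n100 = z_test_single_mean(103, mu=100, sigma=15, n=100)

for z, p, decision in (result_n25, result_n100):
    print(round(z, 2), round(p, 4), decision)
```

The same three-point difference is not significant with N = 25 but is with N = 100, which is why statistical significance and practical significance are treated as separate questions.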
19.1 Before performing a statistical analysis of the data, the experimental design must be determined because
19.2 Each subject will have more than one score on the dependent measure in
19.3 A study that required subjects to drive both when sober and after consuming three ounces of alcohol would be employing a
19.4 Crossed designs have the advantage of
19.5 Carry-over effects are a problem in
19.6 An advantage of crossed designs is
19.7 Each subject will have a single score on the dependent measure in
19.8 A study that required subjects to drive either when sober or after consuming three ounces of alcohol would be employing a
19.9 A study of the effect of including statistical calculators or not on the theoretical understanding of statistics in an introductory statistics course during a whole semester would necessarily require
19.10 Including pre-existing conditions such as sex of the subject as a factor in an experimental design necessarily requires
19.11 The distinguishing feature of a crossed design is
19.12 The sign of the mean difference score when employing a crossed t-test will
19.13 The larger the mean difference score relative to the standard deviation of the difference score in a crossed design
19.14 The larger the standard deviation of the difference scores in a crossed design
19.15 The standard error of the mean difference scores is estimated in a crossed t-test by
19.16 The degrees of freedom for a crossed t-test is
19.17 Given the following output tables from SPSS, the life satisfaction variables relative to the income variables
19.18 Given the following output tables from SPSS, both pairs of tested variables
19.19 Given the following output tables from SPSS, a correct interpretation is
19.20 A crossed design is analyzed using SPSS as a
19.21 The result of a test of the mean difference score is said to be statistically significant when:
19.22 The appropriate hypothesis test when there are two groups and different subjects are used in each group is
19.23 In a nested t-test, the size of the effects can be seen in
19.24 In the case of a nested t-test, the sampling distribution used as a model of what the world would look like given that there were no effects is a distribution of
19.25 The estimate of the standard error of the difference between means is a function of
19.26 When there are equal numbers of observations in the two groups in a nested t-test
19.27 The degrees of freedom in a nested t-test is
19.28 A nested design is analyzed using SPSS as a
19.29 Which SPSS command would be most appropriate to show relationships between variables if one variable was clearly nominal-categorical with fewer than 5 levels and the other could be considered an interval scale?
19.30 In the SPSS output for an independent samples t-test, if the significance level for Levene's Test for Equality of Variances is very small
19.31 In the following tables containing SPSS output from an independent samples t-test, which of the following is the most correct interpretation
19.32 In the following tables containing SPSS output from an independent samples t-test, which of the following is the most correct interpretation
19.33 In the following tables containing SPSS output from an independent samples t-test the largest effect was observed for the variable
19.34 In the following tables containing SPSS output from an independent samples t-test, the results of the health inventory could be interpreted as
19.36 As the number of degrees of freedom increases the distribution of t
19.37 If we employ the normal curve for testing hypotheses when N is small and sigma is unknown, we
19.38 In the t distribution with mu=0 and sigma=1, the 95% confidence interval will
19.39 Which of the distributions listed is most appropriate for testing hypotheses when the population parameter sigma is unknown?
19.40 The number of degrees of freedom is always:
19.41 The following graph presents three t-distributions. Which is most likely the normal distribution?
19.42 The following graph presents three t-distributions. Which has the smallest degrees of freedom?
19.43 The following graph presents three t-distributions. Which has the largest value for mu?
19.44 For most practical purposes, the t-distribution and the normal distribution can be interchanged when
19.45 The smaller the sample size (N) the
19.46 The exact significance level of a two-tailed t-test when mu=0, sigma=5, df=6, and the value=6.34 is
19.47 If a one-tailed t-test is selected
19.48 When a two-tailed t-test is selected
19.49 When a one-tailed t-test is selected
19.50 The direction (positive or negative) selected in a one-tailed t-test is
19.51 Although a researcher anticipates a difference between the means of two sets of scores, he/she does not know in which direction the difference might occur. He/she should:
19.52 The selection of a one or two-tailed t-test
19.53 A good rule of thumb when selecting a one or two-tailed t-test is
19.54 A one-tailed t-test
19.55 If alpha=.05, df=12, mu=0, sigma=3.47, and the score=-6.5, significance would be found in
19.56 If alpha=.01, df=12, mu=0, sigma=3.47, and the score=-6.5, significance would be found in
19.57 If alpha=.05, df=12, mu=0, sigma=3.47, and the score=-8.5, significance would be found in
20.1 When doing hypothesis testing in the real world
20.2 The scientific method requires
20.3 The probability of rejecting the null hypothesis whether it is true or not
20.4 The probability of correctly retaining the null hypothesis is a function of
20.5 The probability of a type II error is ____ related to the probability of a type I error.
20.6 Decreasing the probability of a Type I error (alpha) _____ the probability of a Type II error (beta).
20.7 Keeping the size of effects and alpha constant, decreasing the error variance will _____ the size of beta.
20.8 The reason most researchers do not set the value of alpha extremely low (less than .001) is
20.9 The probability of correctly rejecting the null hypothesis is a function of
20.10 The size of beta decreases as the size of
20.11 When the cost of not rejecting the null when it is false is high with respect to rejecting the null when it is true, the hypothesis tester will
20.12 The results of a test of the difference between two sample means are said to be statistically significant when:
20.13 When an experimenter selects a particular level of risk (alpha)
20.14 The level of significance of an hypothesis test is determined by
20.15 If the null hypothesis is rejected as being false,
20.16 When an experiment is performed in real life
20.17 In reaching a decision about the null hypothesis, a type I error occurs when:
20.18 Which of the following does not belong:
20.19 If the cost of a Type II error is high relative to the cost of a Type I error, the value of alpha should be set
21.1 One disadvantage of performing multiple t-tests rather than ANOVA is
21.2 The experiment-wise error rate
21.3 The probability of committing at least one type I error in an analysis
21.4 The effects in an ANOVA are manifested in
21.5 When an ANOVA returns significant results
21.6 When a statistician examines an ANOVA table, the column examined first and of greatest interest is the one labeled
21.7 If the number found in the Sig. column is less than the critical value of alpha set by the experimenter,
21.8 If an ANOVA results in statistical significance
21.9 If an ANOVA is found to be not statistically significant
21.10 The following tables show that income seven years after college
21.11 The following tables show that life satisfaction seven years after college
21.12 The following tables show that life satisfaction while in college
21.13 When the null hypothesis is true, both the mean squares within and the mean squares between are estimates of
21.14 The mean squares within
21.15 The mean squares between
21.16 The greater the difference between the sample means
21.17 The following equation
21.18 The F-ratio will increase
21.19 The F-ratio can be thought of as a measure of
21.20 How large the F-ratio must be to decide that the effects are real is answered by comparing the observed F-ratio with
21.21 When the F-ratio (Fobs) is less than 1.00
21.22 With a negative F-ratio (Fobs)
21.23 The theoretical distribution of the F-ratio will vary as a function of:
21.24 When there are real effects, in ANOVA, they are assumed to be _____ for each group.
21.25 When there are real effects in ANOVA
21.26 When there are no real effects in ANOVA
21.27 The ANOVA procedure and nested t-test procedure will
21.28 Which of the following significance tests was significant
21.29 The relationship between Socio Economic Status of Parents and Gender
21.30 The difference in income between males and females was significantly greater seven years after college than in college.
21.31 Based on the analysis given below, what is a correct interpretation.
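The concern behind items 21.1 through 21.3, that running many t-tests inflates the chance of at least one Type I error, can be sketched numerically. The formula below assumes the individual tests are independent, which pairwise t-tests sharing the same groups are not exactly, so the figures are an approximation rather than the text's own definition. Function names are hypothetical.

```python
from math import comb

def experimentwise_error_rate(alpha, n_tests):
    """P(at least one Type I error) across n_tests, assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests

def pairwise_tests(n_groups):
    """Number of two-group comparisons among n_groups means."""
    return comb(n_groups, 2)

rate_three_groups = experimentwise_error_rate(0.05, pairwise_tests(3))
rate_five_groups = experimentwise_error_rate(0.05, pairwise_tests(5))
print(round(rate_three_groups, 4), round(rate_five_groups, 4))
```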
22.1 Chi square is most appropriate with ________ data.
22.2 Effects in a contingency table occur when
22.3 When a Chi-squared analysis of a contingency table is found to be statistically significant
22.4 The command in SPSS that computes a contingency table is
22.5 The expected cell frequency found when computing a chi-squared statistic is
22.6 The larger the difference between the observed and expected cell frequencies in an analysis of a contingency table, relative to the expected cell frequency
22.7 The larger the chi-squared statistic for a given contingency table
22.8 If a statistician returned a chi-squared value of -11.326 for a contingency table
22.9 The theoretical chi-squared distribution is characterized by the parameter(s)
22.10 In the theoretical chi-squared distribution the greater the degrees of freedom
22.11 In the following SPSS output
22.12 In the following SPSS output
22.13 In the following SPSS output
22.14 In the following SPSS output the cells in which contingency table would most likely be interpreted
22.15 In the following SPSS output which analysis would a statistician be least secure in interpreting the chi-squared value
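The expected cell frequencies and the chi-squared statistic that items 22.5 through 22.8 ask about can be sketched for a small contingency table. The observed counts below are hypothetical and are not the SPSS output the items refer to; expected frequencies are computed as row total times column total divided by N.

```python
# Hypothetical 2 x 2 table of observed frequencies, for illustration only.
observed = [[10, 20],
            [30, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency for each cell under the no-effects model.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-squared: sum of squared (observed - expected) relative to expected.
chi_squared = sum((o - e) ** 2 / e
                  for o_row, e_row in zip(observed, expected)
                  for o, e in zip(o_row, e_row))

degrees_of_freedom = (len(row_totals) - 1) * (len(col_totals) - 1)
print(expected, round(chi_squared, 4), degrees_of_freedom)
```

Because every term is a squared difference divided by a positive expected frequency, the statistic cannot be negative.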
23.1 To test whether a linear relationship exists between two variables, one could employ an hypothesis test of the
23.2 The theoretical mean of the sampling distribution of correlation coefficients when the null hypothesis is true is
23.3 The critical values of the sampling distribution of correlation coefficients when the null hypothesis is true is
23.4 The degrees of freedom to test whether a correlation coefficient is different from zero is
23.5 From the correlation matrix presented below, how many different correlation coefficients are statistically significant with alpha = .05?
23.6 From the correlation matrix presented below, the correlation coefficient between life satisfaction in college and income in college is statistically significant with alpha = .05.