Because the standard deviation depends on every observation in the data, it is quite susceptible to outliers. An outlier at either end, low or high, lies far from the mean, so its large squared deviation inflates the standard deviation.
A measure of variability based on the middle part of the data is the interquartile range. It is resistant to outliers because it depends only on the middle 50% of the data, that is, the distance between the first and third quartiles.
The range is the difference between the maximum and minimum observations, and an outlier is by definition an extreme value. As a result, a single outlier becomes the maximum or the minimum and can change the value of the range completely.
The range is therefore more susceptible to outliers than the standard deviation and interquartile range.
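The comparison above can be checked numerically. The following is a minimal sketch, assuming a small made-up data set (`base`) and Python's standard `statistics` module; the helper name `spread_measures` and the choice of the "inclusive" quartile method are illustrative assumptions, not part of the original answer. It prints each measure for the base data, then again with one high and one low outlier added.

```python
import statistics

def spread_measures(data):
    """Return the range, sample standard deviation and IQR of data."""
    # Quartiles via the 'inclusive' method (an assumption; other methods
    # give slightly different quartiles on small samples).
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "range": max(data) - min(data),
        "stdev": statistics.stdev(data),
        "iqr": q3 - q1,
    }

# Hypothetical data, chosen only for illustration.
base = [10, 12, 13, 15, 16, 18, 20]
with_high = base + [60]    # one upper outlier added
with_low = base + [-30]    # one lower outlier added

base_m = spread_measures(base)
high_m = spread_measures(with_high)
low_m = spread_measures(with_low)

for label, m in [("base", base_m), ("high outlier", high_m), ("low outlier", low_m)]:
    print(label, {name: round(value, 2) for name, value in m.items()})
```

On this data the interquartile range moves far less than the range when either outlier is added.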
Chapter 4
The ____ is equal to the square root of the ______. Standard deviation; variance
The mean is calculated by: Summing all the scores in a data set and then dividing by the total number of scores.
The mean of the population is represented by the symbol ____, and the mean of the sample is represented by M
The variance calculated on 22 scores is equal to 28.93. What is the standard deviation? 5.38 units
Katrina observes and records the number of people who purchase a meal at the school cafeteria during each operating hour. The cafeteria is open from 6:00 A.M. to 9:00 P.M. and students typically eat breakfast, lunch,
The ____ is affected by outliers because it takes the actual value of each data point into consideration. mean
Figure: Years of Education
The Lee family is looking to buy a house in one of two suburban areas just outside a major city. One of their top priorities is air quality. One suburb advertises the use of hybrid cars and solar panels, while the other focuses on its convenient bus routes and availability of SUV dealerships. The Lees intend to sample air quality across the two areas to help them make a decision. Is the mean or median the better measure to use for deciding which area has better air quality? (Hint: These populations are skewed.)
Most students struggled on Professor Horrible's dastardly difficult statistics exam, but several students had
The median is preferred over the mean for _____ distributions. skewed
The dean of a local college needs to drop one course from the art program. She decides to pick the course with the lowest average enrollment rate from the previous four semesters. The enrollments of three courses she is considering are Photography—30, 20, 12, and 22; Film Editing—11, 29, 27, and 29; and Abstract Art—18, 22,
The median is the measure of central tendency that conveys the mathematical center of the data. False
An outlier is a data point that is distant from the other observations. For instance, in the data set #{1,2,2,3,26}#, 26 is an outlier. There is a formula to determine the range of values that are not outliers, but a number falling outside that range is not necessarily an outlier, as there may be other factors to consider.
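The answer does not say which formula it has in mind. One common convention, assumed here, is the 1.5 × IQR rule, which flags values below Q1 − 1.5·IQR or above Q3 + 1.5·IQR. A rough sketch on the same set follows; the function name `iqr_fences`, the multiplier `k`, and the "inclusive" quartile method are illustrative assumptions.

```python
import statistics

def iqr_fences(values, k=1.5):
    """Return (lower, upper) fences using the 1.5 * IQR convention.

    This rule is an assumption; the original answer does not name its formula.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

data = [1, 2, 2, 3, 26]
low, high = iqr_fences(data)

# Values outside the fences are only candidates for being outliers.
flagged = [x for x in data if x < low or x > high]
print(low, high, flagged)
```

Values flagged this way are only candidates; whether they are genuine outliers still depends on context.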
The #color(red)(median)# is the middle number of a set of numerically ordered numbers. If the number of values in the set is odd, then the #color(red)(median)# is the central number, with an equal number of values on its left and on its right. If the set has an even number of values, then the #color(red)(median)# is the average of the two central numbers. For example, the set #{1,2,3,4,5,6,7,8}# has an even number of values, so we must find the mean of the two central numbers, 4 and 5, which gives #(4+5)/2=4.5#, the #color(red)(median)#.
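The odd/even rule can be written out as a short function. This is only a sketch (the name `median_of` is made up for illustration), and it is compared against Python's built-in `statistics.median` as a sanity check.

```python
import statistics

def median_of(values):
    """Median by the odd/even rule described above."""
    ordered = sorted(values)          # the median needs ordered data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the central number
    # even count: average of the two central numbers
    return (ordered[mid - 1] + ordered[mid]) / 2

even_set = [1, 2, 3, 4, 5, 6, 7, 8]
odd_set = [1, 2, 2, 3, 26]

print(median_of(even_set), statistics.median(even_set))
print(median_of(odd_set), statistics.median(odd_set))
```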
The #color(green)("range")#, #r#, is the distance from the highest value to the lowest value, and is calculated as #r=h-l#, where #h# is the highest value, and #l# is the lowest value. So if we have a set of #{52,54,56,58,60}#, we get #r=60-52=8#, so the #color(green)("range")# is 8.
Given what we now know, it is correct to say that an outlier will affect the #color(green)("range")# the most. This is because the #color(red)(median)# is determined by the centre of the ordered data, while the #color(green)("range")# is determined by its two ends. Since an outlier is always an extreme value, it becomes one of those ends and changes the #color(green)("range")# directly, while the #color(red)(median)# barely moves.
For example, take the set #{1,2,3,4,100}#, with 100 as the outlier. The #color(green)("range")# of this set is #r=100-1=99#, while the #color(red)(median)# is 3. If we take the outlier 100 out, so the set is now #{1,2,3,4}#, the #color(green)("range")# becomes #4-1=3#, while the #color(red)(median)# becomes #(2+3)/2=2.5#. Evidently, it was the #color(green)("range")# which was affected the most.
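The same comparison can be reproduced in a few lines of Python. This is just a sketch using the standard `statistics` module; the helper name `value_range` is made up for illustration.

```python
import statistics

def value_range(values):
    """Range r = h - l: highest value minus lowest value."""
    return max(values) - min(values)

with_outlier = [1, 2, 3, 4, 100]
without_outlier = [1, 2, 3, 4]

# How much each measure changes when the outlier is removed.
range_change = value_range(with_outlier) - value_range(without_outlier)
median_change = statistics.median(with_outlier) - statistics.median(without_outlier)

print("range:", value_range(with_outlier), "->", value_range(without_outlier))
print("median:", statistics.median(with_outlier), "->", statistics.median(without_outlier))
print("change in range:", range_change, "change in median:", median_change)
```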
//mathspace.co/learn/world-of-maths/univariate-data/effects-of-outliers-12017/things-out-of-the-norm-601/
I hope I helped!