In the previous exercise you calculated the mean weight of the Cambridge crew and then the mean weight of the Oxford crew. You surely noticed that writing the code to calculate Oxford's mean was pretty easy since it used the same procedure as the code to calculate Cambridge's mean.
The process for calculating the mean is to add up all the values and then divide the total by the number of values.
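As a rough sketch of that process, the steps can be written out one at a time. The variable names total, count, and average are only illustrative, and print is written with parentheses so the snippet runs in both Python 2 and Python 3.

```python
# The three numbers used later in this exercise as a simple test.
numbers = [2.0, 2.0, 5.0]

total = sum(numbers)      # step 1: add up all the values
count = len(numbers)      # step 2: count how many values there are
average = total / count   # step 3: divide the total by the count

print(average)
```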
When working with statistics, it's common to find yourself repeating a calculation many times. Statisticians don't like to repeat themselves, so, to save yourself time, you can put the calculation into a function. When you define a function, it's like writing down a recipe for serving ice cream.
Functions are also very useful for statistics because many of the advanced analysis tools are built on much simpler tools, such as mean, sum, and the number of data points in a dataset.
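To illustrate how a larger tool can be built on a simpler one, here is a minimal sketch of a variance calculation that calls mean() twice. The variance() helper is a hypothetical example and not part of this exercise, and the mean() shown here is just a stand-in for the one you will write below.

```python
def mean(data):
    # Average: the total divided by the number of data points.
    return sum(data) / float(len(data))

def variance(data):
    # Hypothetical helper: the average squared distance from the mean
    # (this is the population variance).
    center = mean(data)
    squared_distances = [(x - center) ** 2 for x in data]
    return mean(squared_distances)

result = variance([2.0, 2.0, 5.0])
print(result)
```

Because variance() relies on mean(), any improvement to mean() is picked up by variance() automatically.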
Before you can use your own function, you must define the function. Defining a function is similar to how dictionaries define words.
You will now define one function, calling it mean(), and then use mean() to calculate the average weight of each crew. Type this code:
# Let's create a mean() function
def mean(myData):
    """
    Input: a list of numerical data. We'll call it 'myData'
    Output: a number representing the average of the numbers in myData
    Remember that sum() and len() are pre-made functions that
    Python gives us. You will learn about other pre-made functions later.
    """
    # float() keeps the division from rounding down when every
    # number in the list is a whole number.
    return sum(myData) / float(len(myData))
# Let's try our function with a simple test.
print "I know the average of 2.0, 2.0, and 5.0 to be 3.0."
print "My function says the average is:"
print mean( [2.0, 2.0, 5.0] )
print
# Let's calculate the rowing means. This time, we will put the lists of
# numbers into variables so we don't get confused as to which
# list of numbers belongs to which team.
cambridgeWeights = [188.5, 183, 194.5, 185, 214, 203.5, 186, 178.5, 109]
oxfordWeights = [186, 184.5, 204, 184.5, 195.5, 202.5, 174, 183, 109.5]
print "The average weight for Cambridge is:"
print mean(cambridgeWeights), "pounds"
print "and the average weight for Oxford is:"
print mean(oxfordWeights), "pounds"
print
print "This is a difference of:"
print mean(cambridgeWeights) - mean(oxfordWeights), "pounds"
Save the program as my-mean-function.py.
After clicking Run, you should get:
I know the average of 2.0, 2.0, and 5.0 to be 3.0.
My function says the average is:
3.0

The average weight for Cambridge is:
182.444444444 pounds
and the average weight for Oxford is:
180.388888889 pounds

This is a difference of:
2.05555555556 pounds
By making your own function, you just saved yourself a lot of work, since you didn't have to calculate the averages by hand.
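The same function can be reused on any other list without rewriting the calculation. As a small sketch, the snippet below joins the two crew lists from this exercise and finds the mean weight of all eighteen people. The name allWeights is made up for this example, and print is written with parentheses so the snippet runs in both Python 2 and Python 3.

```python
def mean(myData):
    # Same calculation as in the exercise: total divided by count.
    return sum(myData) / float(len(myData))

cambridgeWeights = [188.5, 183, 194.5, 185, 214, 203.5, 186, 178.5, 109]
oxfordWeights = [186, 184.5, 204, 184.5, 195.5, 202.5, 174, 183, 109.5]

# Adding two lists with + joins them into one longer list.
allWeights = cambridgeWeights + oxfordWeights

combinedMean = mean(allWeights)
print(combinedMean)  # roughly 181.4 pounds
```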
Statisticians are lazy, so they make the computer do as much work as possible. This means there is more time to get ice cream. Hooray for functions.