Moo Cows
One day I was sitting in a statistics class, being pressured to figure out a project to analyze. Then cows came to mind. We discarded this idea and thought about doing something involving local pudding plants. After realizing that we don’t have many pudding plants, we came back to the cow idea. Our question became, “Is the number of black spots on baby calves normally distributed?”
We chose to do a population census to show the true number of spots a calf has. On May 7, 2012, at the Perazzo Brothers’ Dairy, we counted the black spots (those at least three centimeters in diameter) on each of the forty-one calves, one calf per pen.
Our rough data looked like this:
Pen Number | Calf ID Number | Number of Spots
1 | 2091 | 4
2 | 2090 | 2
3 | 2089 | 1
4 | 2088 | 1
5 | 2087 | 1
6 | 2086 | 48
7 | 2084 | 6
8 | 2082 | 30
9 | 2081 | 1
10 | 2080 | 1
11 | 2079 | 7
12 | 2078 | 7
13 | 2077 | 8
14 | 2076 | 9
15 | 2074 | 3
16 | 2073 | 4
17 | 2072 | 3
18 | 2071 | 2
19 | 2070 | 1
20 | 2069 | 2
21 | 2068 | 15
22 | 2065 | 1
23 | 2064 | 1
24 | 2063 | 43
25 | 2062 | 6
26 | 2061 | 4
27 | 2060 | 1
28 | 18025 | 5
29 | 2057 | 1
30 | 2056 | 1
31 | 2055 | 6
32 | 2054 | 28
33 | 2053 | 6
34 | 2052 | 3
35 | 2050 | 1
36 | 2049 | 13
37 | Untagged 1 | 8
38 | Untagged 2 | 2
39 | Untagged 3 | 10
40 | Untagged 4 | 9
41 | Untagged 5 | 14
Our 5-Number Summary looks as follows:
Minimum – 1
First Quartile – 1
Median – 4
Third Quartile – 8.5
Maximum – 48
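For anyone who wants to double-check the summary, here is a minimal Python sketch. It assumes the quartiles are taken as the medians of the lower and upper halves of the sorted data (with the overall median left out), which is our guess at the classroom method; the function name `five_number_summary` is just for illustration.

```python
from statistics import median

# Spot counts copied from the rough data table, in pen order (pens 1-41).
spots = [4, 2, 1, 1, 1, 48, 6, 30, 1, 1, 7, 7, 8, 9, 3, 4, 3, 2, 1, 2, 15,
         1, 1, 43, 6, 4, 1, 5, 1, 1, 6, 28, 6, 3, 1, 13, 8, 2, 10, 9, 14]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    leaving the overall median out when the count is odd.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])


summary = five_number_summary(spots)
print(summary)
```

Other quartile conventions exist and can give slightly different answers on small data sets, so a calculator or spreadsheet may not agree to the decimal.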
Now, on a side note, we must add a word about our bias. Don’t laugh, because if you had been collecting data with us you would understand: there was bias from us as data collectors. There were some calves that we loved more than the others and spent more time with. For example, we had nicknamed Calf #2077 ‘The Devil’ and wanted to spend the least amount of time possible with her, so we probably weren’t as thorough with her as we were with Calf Untagged 5, also known as Casper, who we loved spending time with. This bias could theoretically be fixed with some sort of blinding, but we are confident that whoever went and took data would have the same conflict we did and naturally be drawn to some particular calves.
Then, to organize our data, we put it into a histogram using ranges five spots wide for the bars (1-5, 6-10, and so on; see below).
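The binning itself can be sketched in a few lines of Python. This is only a text stand-in for the graph, not the original figure; the helper name `bin_counts` is made up for illustration, and the bins are assumed to start at 1 because no calf had zero spots.

```python
from collections import Counter

# Spot counts copied from the rough data table, in pen order (pens 1-41).
spots = [4, 2, 1, 1, 1, 48, 6, 30, 1, 1, 7, 7, 8, 9, 3, 4, 3, 2, 1, 2, 15,
         1, 1, 43, 6, 4, 1, 5, 1, 1, 6, 28, 6, 3, 1, 13, 8, 2, 10, 9, 14]


def bin_counts(values, width):
    """Count values per bin: 1..width, width+1..2*width, and so on."""
    counts = Counter((v - 1) // width for v in values)
    top_bin = max(counts)
    # Include empty bins so gaps in the distribution stay visible.
    return {
        (b * width + 1, (b + 1) * width): counts.get(b, 0)
        for b in range(top_bin + 1)
    }


histogram = bin_counts(spots, width=5)
for (low, high), count in histogram.items():
    print(f"{low:>2}-{high:<2} | {'#' * count}")
```

Calling `bin_counts(spots, width=3)` instead gives the narrower 1-3, 4-6 style of ranges.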
According to the graph above, we conclude that the distribution of the number of spots on baby calves is definitely not normal. The graph shows that it is unimodal and skewed to the right; it does not follow a normal curve.
Noticing that there were four particular cows that were skewing the data, we took them out of the graph and had steak and milkshakes for dinner. It was delicious. No, but seriously, we noted them as outliers, made a new graph, and analyzed the new distribution.
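We picked the outliers by eye, but one common convention is the 1.5 × IQR rule: anything more than 1.5 interquartile ranges beyond a quartile gets flagged. The sketch below applies that rule as an assumption (it is not necessarily how the four cows were chosen). On the data in our table it happens to flag exactly four calves, the ones with 28, 30, 43, and 48 spots.

```python
from statistics import quantiles

# Spot counts copied from the rough data table, in pen order (pens 1-41).
spots = [4, 2, 1, 1, 1, 48, 6, 30, 1, 1, 7, 7, 8, 9, 3, 4, 3, 2, 1, 2, 15,
         1, 1, 43, 6, 4, 1, 5, 1, 1, 6, 28, 6, 3, 1, 13, 8, 2, 10, 9, 14]

# quantiles() with n=4 returns [Q1, median, Q3]; its default "exclusive"
# method reproduces the quartiles in the 5-Number Summary (1 and 8.5).
q1, _, q3 = quantiles(spots, n=4)
iqr = q3 - q1

# Assumed rule: fences sit 1.5 IQRs below Q1 and above Q3.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

outliers = sorted(s for s in spots if s < low_fence or s > high_fence)
kept = [s for s in spots if low_fence <= s <= high_fence]

print("fences:", low_fence, high_fence)
print("outliers:", outliers)
print("calves kept:", len(kept))
```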
As one can see, it is still unimodal and skewed to the right (as we figured it probably would be… the shape shouldn’t change or anything. Although it would have been nice if it had become “normal”… then at least we would have somewhere exciting to go with this project). We also changed the ranges that we used from 1-5, 6-10, etc. to 1-3, 4-6, and so on. This made the distribution look slightly different, but it is still the same basic idea. The bars on the graph also got wider for some reason. Guess they enjoyed the steaks and milkshakes, too.
Then, noticing
that we had made a fatal error, we set out to fix that right away.
“What fatal
error?” you might ask. We had forgotten to check the conditions for a normal
model. Yeah, we’re pretty ridiculous.
To use a Normal model, the observations have to be independent, come from a random sample, and be numerous enough. Since none of the calves were siblings, we didn’t need to worry about the calves being independent. And because we were using all of the calves on a huge dairy, we didn’t figure that a big enough number was a problem. But the thing we neglected was that we needed a random sample. Nothing about our all-in-one census was random or a sample.
So, in order to meet this condition, we decided to use a random number generator to pick fifteen random pen numbers and then graph the spot counts for those pens from the rough data list. We did this ten times, getting ten different sets of random numbers to compare to the Normal model.
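The sampling step can be sketched like this in Python. It assumes each set of fifteen pens was drawn without repeats (we never wrote down whether our generator allowed duplicates), and the seed value is arbitrary, chosen only so the sketch gives the same draws every run. It will not reproduce our actual ten sets.

```python
import random

# Spot counts copied from the rough data table, in pen order (pens 1-41).
spots = [4, 2, 1, 1, 1, 48, 6, 30, 1, 1, 7, 7, 8, 9, 3, 4, 3, 2, 1, 2, 15,
         1, 1, 43, 6, 4, 1, 5, 1, 1, 6, 28, 6, 3, 1, 13, 8, 2, 10, 9, 14]

PEN_COUNT = len(spots)  # 41 pens
SAMPLE_SIZE = 15        # pens per random sample
NUM_SAMPLES = 10        # number of random samples

rng = random.Random(2012)  # arbitrary seed, only for repeatability

# Assumption: pens are drawn without replacement within each sample.
pen_samples = [
    sorted(rng.sample(range(1, PEN_COUNT + 1), SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

# Look up the spot count for each sampled pen (pen 1 is index 0).
spot_samples = [[spots[pen - 1] for pen in pens] for pens in pen_samples]

for number, (pens, counts) in enumerate(zip(pen_samples, spot_samples), 1):
    print(f"sample {number}: pens {pens} -> spots {counts}")
```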
The distributions of our ten sets of randomly selected pen numbers were unimodal (except for graph #5, which was bimodal) and skewed to the right, just like the distribution of our graph for all of the data. These graphs can be found on the last page. We didn’t want to go and stick ten graphs right in the middle of all of this writing. It might be kind of distracting.
Even though it didn’t qualify (probably because our “n” was not large enough), we built a normal curve from our data anyway. Our mean was 4.5946 and our standard deviation was 3.9753 (these values correspond to the 37 calves left after taking out the four outliers). Based on these values, our normal curve ended up looking like this:
By looking at this, one can see that, based on the Empirical rule, the model puts about 68% of the calves between .6193 and 8.5699 spots (one standard deviation on either side of the mean). Since earlier we had taken an interest in how the 1-2 spot range always had the most calves in it in our ten random samples, we used the normal curve to find the probability that a calf from our sample had 1 or 2 spots. The normal curve from this test is on the next page, because there is supposedly not enough room for it on this page.
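The mean, standard deviation, and one-standard-deviation interval can be re-derived from the table with the sketch below. Two assumptions are baked in: the four outliers are the calves with 28, 30, 43, and 48 spots (the cutoff of 20 is just a convenient, hypothetical way to drop them), and the standard deviation is the sample version that divides by n - 1. With those choices the numbers match the ones quoted above.

```python
from statistics import mean, stdev

# Spot counts copied from the rough data table, in pen order (pens 1-41).
spots = [4, 2, 1, 1, 1, 48, 6, 30, 1, 1, 7, 7, 8, 9, 3, 4, 3, 2, 1, 2, 15,
         1, 1, 43, 6, 4, 1, 5, 1, 1, 6, 28, 6, 3, 1, 13, 8, 2, 10, 9, 14]

# Assumption: the four outliers are the counts of 20 or more (28, 30, 43, 48).
kept = [s for s in spots if s < 20]

mu = mean(kept)      # about 4.5946
sigma = stdev(kept)  # sample standard deviation (n - 1), about 3.9753

# Empirical rule: a normal model puts about 68% within one sd of the mean.
low, high = mu - sigma, mu + sigma

# Share of the actual calves that fall inside that interval, for comparison.
within = sum(low <= s <= high for s in kept) / len(kept)

print(f"n = {len(kept)}, mean = {mu:.4f}, sd = {sigma:.4f}")
print(f"one-sd interval: {low:.4f} to {high:.4f}")
print(f"share of calves inside it: {within:.3f}")
```

The observed share inside the interval comes out well above 68%, which is one more sign that the normal curve is a poor fit for this data.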
After plugging the numbers into our handy-dandy online calculator (try finding a normal probability when you’ve already turned your calculator in to the teacher, yeah it’s sort of difficult) we discovered that, under the normal curve, the probability of a calf from our sample having one to two spots was .074. This is a small probability, so I guess the calves at the Perazzo Brothers’ Dairy are just boss or something, because all of our graphs had the 1-2 range with the most calves (except for graph number five and its stupid bimodal-ness).
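For readers without an online calculator handy, the same probability can be sketched with Python's standard library. The mean and standard deviation are the ones quoted earlier; the comparison with the observed share assumes the four outliers are the calves with 28, 30, 43, and 48 spots (the cutoff of 20 is only a hypothetical shortcut for dropping them).

```python
from statistics import NormalDist

# Spot counts copied from the rough data table, in pen order (pens 1-41).
spots = [4, 2, 1, 1, 1, 48, 6, 30, 1, 1, 7, 7, 8, 9, 3, 4, 3, 2, 1, 2, 15,
         1, 1, 43, 6, 4, 1, 5, 1, 1, 6, 28, 6, 3, 1, 13, 8, 2, 10, 9, 14]

# Normal model with the mean and standard deviation from the write-up.
model = NormalDist(mu=4.5946, sigma=3.9753)

# Area under the normal curve between 1 and 2 spots.
p_model = model.cdf(2) - model.cdf(1)

# Assumption: outliers are the counts of 20 or more (28, 30, 43, 48).
kept = [s for s in spots if s < 20]
p_observed = sum(1 <= s <= 2 for s in kept) / len(kept)

print(f"normal model:  {p_model:.3f}")
print(f"observed data: {p_observed:.3f}")
```

The model gives roughly .074, while the share of calves in the table with one or two spots is far larger. That gap is the mismatch between the normal curve and the real, right-skewed data showing up as a number.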
In conclusion, the distribution of spots on calves is not normal, at least not for our sample. All of the samples we took were severely skewed to the right.
It might’ve been normal if we had a larger sample size. But, since we didn’t, our conclusion is limited to the calves at the Perazzo Brothers’ Dairy.
If we wanted to get past the limitations of our sampling, we could sample more dairies in Fallon or other places to get a larger sample size. If someone were to replicate this project, we suggest doing exactly that. In this way, the study could be applied to a larger region than just the Perazzo Brothers’ Dairy.
All in all, this project was fun to do, but the results were disappointing.