Notes page

 Communication of Results

Process of Statistical investigation (above)
What is Statistics (with the big S)?

Statistics is a process for formulating questions, collecting data, and telling the story related to the questions. The collected data is organized into a comprehensive form in order to be analyzed and to show similarities and differences within groups to answer original questions.

By analysing the data we can draw conclusions and make further predictions about the subject matter of the question.

-Tell the data story, for example using graphs to help find the mean, median, and mode.


 * The collected data is categorized into a comprehensive form to be analyzed. This allows similarities and differences to be seen within the categories, which helps explain the data results and make predictions.

NCTM recommendations:
There are essentially two types of variables or data types: categorical and numerical.

The equal sign should mean equal. If approximate, then we need to show that the answer is approximate.

When writing an approximate solution, we need to display it to the hundredths place.

Percent is a fractional part of 100, where 100 is the whole. So percents are chunks of a specific whole.
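A minimal sketch of the notes above on the equal sign, approximate answers, and percent. The function names and the sample fractions are hypothetical, chosen only for illustration. Exact fractions are used so that "=" appears only when the two-decimal value really is exact.

```python
from fractions import Fraction


def percent_of_whole(part, whole):
    """Return the part as an exact percent of the whole (the whole counts as 100)."""
    return Fraction(part, whole) * 100


def show_percent(part, whole):
    """Format the percent to the hundredths, using '=' only when nothing was rounded away."""
    exact = percent_of_whole(part, whole)
    rounded = round(exact, 2)  # rounding a Fraction to 2 digits gives another Fraction
    sign = "=" if rounded == exact else "≈"
    return f"{part}/{whole} {sign} {float(rounded):.2f}%"


print(show_percent(1, 4))  # exact, so the equal sign is honest
print(show_percent(1, 3))  # rounded, so it is marked as approximate
```

One fourth is exactly 25 percent, so it keeps the equal sign; one third has to be rounded, so it is written with the approximate sign.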

The goal is not just to give our buddies an answer but to help them focus on a solution. Tough.

Don't be misled by the presentation of statistics. Analyze.

__**Percent vs Percentage**__

"We use the word **percent** as part of a numerical expression (e.g. Only two percent of the students failed). We use the word **percentage** to suggest a portion (e.g. The percentage of students who fail has decreased)."

[] I tried clicking on this and it doesn't work. Could you double check the address? Is there a www missing?


__**A few words about Range**__

Range is a measure of variability:
 * it can be high to low
 * it can be max. to min.
 * it's how the data is spread
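As a small illustration, the range can be computed as the maximum minus the minimum. The data values below are made up, and `data_range` is a hypothetical helper name.

```python
def data_range(values):
    """Return the spread of the data: the highest value minus the lowest value."""
    low, high = min(values), max(values)
    return high - low


fries = [42, 45, 47, 47, 50, 53, 58]  # made-up counts, for illustration only
print(data_range(fries))  # 58 - 42 = 16
```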

So far, Numerical Data can be displayed through:
 * Dot Plot
 * Stem and Leaf (more for organizing the data than being a true display)
 * Histogram

So far, Categorical Data can be displayed through:
 * Don't forget Real graph
 * Picture graph
 * Bar graph
 * Stacked Bar graph
 * Back-to-back bar graph
 * Multiple bar graph (for comparing several samples)
 * Circle graph

__**Features of a good Histogram:**__
__Histogram:__ a graphical display of the information found in a grouped frequency table of the data.

__//**Used:**//__ to represent frequency distributions

**//__Histogram Must Have__//**:
 * equal widths or lengths of intervals (must be all the same)
 * frequency represented on the vertical axis
 * range of response variable labeled on the horizontal axis

**//__Histogram Shows__//**:
 * data clusters
 * intervals for upper and lower extremes
 * gaps in the data
 * shape of frequency distribution (symmetrical or skewed or no true "shape")
 * variability (the spread of the data)
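A rough sketch of how a grouped frequency table with equal-width intervals turns into a histogram. The scores are made up, the function names are hypothetical, and the bars are drawn sideways as text (so frequency runs across rather than up) only to keep the example short.

```python
def grouped_frequency(values, start, width, bins):
    """Count how many values fall in each of `bins` equal-width intervals [lo, lo + width)."""
    counts = [0] * bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < bins:
            counts[index] += 1
    return [(start + i * width, start + (i + 1) * width, counts[i]) for i in range(bins)]


def text_histogram(table):
    """Draw one row of '#' marks per interval; the row length is the frequency."""
    lines = [f"[{lo}, {hi}) | {'#' * count}".rstrip() for lo, hi, count in table]
    return "\n".join(lines)


scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 78, 79, 98]  # made-up data
table = grouped_frequency(scores, start=50, width=10, bins=5)
print(text_histogram(table))
```

With this data the display shows a cluster in the 70s, a gap in the 80s, and only the interval that holds the upper extreme, not the extreme itself.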

__Frequency Tables and Histograms__ pg 55
 * the advantage of stem-and-leaf and dot plots is that each individual data element is represented in the display with little loss of information
 * information loss takes place only when data are renamed using a truncating strategy or a rounding strategy
 * BUT with both frequency tables and histograms, information will be lost because intervals are being used to condense the data

A bell curve shows a normal distribution of the data.

The data can also be skewed left or skewed right.

Stem-and-Leaf example: 15|8 represents 158
 * stem represents the "tens" place
 * leaf represents the "ones" place

Example: 38|00 represents 38.00
 * stem represents the "tens" [Careful here; that would mean we have 38 tens! Is that true???]
 * leaf represents the "decimal" [What place value is "decimal"?? What does the "00" really represent? What is it "counting" for us in the number?]

When representing a number on a stem-and-leaf plot that goes beyond one place value on the leaf side, make sure a comma is present (the comma shows that the number goes to the tenths place value on the leaf side). You DO NOT need a comma when dealing with the ones place on the leaf side [when only dealing with single digits, regardless of place value].

Provide a key to show how the data was renamed.
 * this is only necessary when dealing with tenths place value and beyond
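A minimal sketch of building a stem-and-leaf plot with a key, following the first example above where a stem of 15 and a leaf of 8 represent 158. It assumes non-negative whole numbers and single-digit leaves; the heights and the function name are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Split each whole number into a stem (all but the last digit) and a leaf (the ones digit)."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)

    lines = []
    # List every stem from lowest to highest so that empty stems show up as holes.
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}".rstrip())

    first = min(values)
    lines.append(f"Key: {first // 10} | {first % 10} represents {first}")
    return "\n".join(lines)


heights = [158, 149, 152, 158, 163, 171, 147, 165]  # made-up data
print(stem_and_leaf(heights))
```

Every data value can still be read back from the display, which is the "little loss of information" advantage over a histogram.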

Disadvantages of a stem-and-leaf plot:
 1) Not visually appealing [careful; this is a personal opinion vs a statistical reason]
 2) Does not easily indicate measures of center for large data sets [we don't know what measures of center are!!! (yet)]

The disadvantages of a histogram are:
 1. The raw data gets lost: you can't see the individual units of data, only the interval each one falls in.
 2. You can't figure out the exact max and min (goes along with #1). You can only see the interval that the extremes fall in.
 3. For example, if we as teachers were showing data to parents about their children, the parents would not be able to see their individual child. (In a dot plot they can point to a dot and know that is their child.)

__**What type of display do you use for showing associations between . . .**__

Num/Num:
 * Type of display: Scatterplot
 * Example of two num variables: Height and Arm Span

Num/Cat:
 * Type of display: Dot Plot [is this the only display that can be used?]
 * Example of num/cat variables: number of french fries; McDonald's and Burger King (pg. 48) [restaurant (is the second variable; you've named specific outcomes!)]

Advantages of Dot Plot graphs / Disadvantages of Dot Plot graphs
 * shows numerical data in a very simplistic way
 * ranges start from the lowest to highest data value
 * interpretation of data can be read easily [personal opinion; be careful]
 * Shows bumps, clumps and holes
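To see how a dot plot shows bumps, clumps, and holes, here is a small text-only sketch. It assumes small whole-number data (single-digit values, so the labels line up); the data and the function name are hypothetical.

```python
from collections import Counter


def dot_plot(values):
    """Stack one '*' per data value above a number line running from the min to the max."""
    counts = Counter(values)
    low, high = min(values), max(values)
    lines = []
    # Build the rows from the tallest stack down to the number line.
    for level in range(max(counts.values()), 0, -1):
        row = "".join(" * " if counts[v] >= level else "   " for v in range(low, high + 1))
        lines.append(row.rstrip())
    lines.append("".join(f"{v:^3}" for v in range(low, high + 1)))
    return "\n".join(lines)


data = [3, 4, 4, 5, 5, 5, 6, 9]  # made-up data
print(dot_plot(data))
```

Each star is one data value, so an individual can still be picked out, and the empty columns above 7 and 8 are the holes.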

__Comparing vs. Association__ (pg. 66, 72)
 * You can compare **multiple groups using the same variable** or compare **two variables from the same sample**. You are looking for **differences**.
 * Comparisons are described by their differences/changes in values, whether it is the same variable or 2 variables.
 * Associations can be investigated between **2 or more variables**. [Double check this! Usually only 2.] You are then looking for a **relationship** of corresponding values between one variable and another.
 * Associations can be described by strength and direction.
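One common way to put a number on the strength and direction of an association between two numerical variables is the correlation coefficient. These notes do not cover it, so treat the following only as an outside illustration; the height and arm span values are made up, and `correlation` is a hypothetical helper.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation: the sign gives direction, closeness to 1 or -1 gives strength."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


heights = [150, 155, 160, 165, 170]    # made-up data (cm)
arm_spans = [148, 156, 159, 167, 171]  # made-up data (cm)

r = correlation(heights, arm_spans)
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(f"r = {r:.2f}, direction: {direction}")
```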

__**Page 94-95 Summary - LC**__

Students need to be more actively engaged in the process of constructing graphical displays, rather than just reading information from them. [This is almost a quote! Be careful of plagiarism!! Either insert quotes and provide a reference or rephrase.] One way for us to help students is by using software that contains prompts to guide them in constructing a display by hand. Most importantly, as teachers we need to develop a strong statistical understanding so that we can create alternative ways of introducing new displays and can answer questions in ways that help students develop the same understanding.

__**Analyze:**__ Look at clumps, bumps, holes, shape, min and max, range, and associations.

__**Interpret:**__ Ask the question WHY? Interpreting goes beyond what is given in the data and gives it meaning, or makes sense of the data.

These aren't complete definitions, but I wanted to jot down some general ideas we have talked about in class with mean, median and mode. (Feel free to add to them!)


__**Mode:**__ The data value that occurs most often.

__**Median:**__ The data value that sits in the center.

__**Mean:**__
 * "evened them out"
 * "equally distributing"
 * "fairly sharing"
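A short sketch of the three ideas using Python's standard `statistics` module. The cookie counts are made up; the point is that the mean is the amount each person would have if the total were shared fairly.

```python
from statistics import mean, median, multimode

cookies = [2, 3, 3, 5, 7]  # made-up data: cookies held by five students

fair_share = mean(cookies)         # total of 20 shared equally among 5 students
middle = median(cookies)           # the value in the center of the ordered data
most_common = multimode(cookies)   # the value(s) that occur most often

# "Evening out": everyone gets the mean, and the total does not change.
evened_out = [fair_share] * len(cookies)

print(fair_share, middle, most_common)
print(sum(evened_out) == sum(cookies))
```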

__**Standard Deviation:**__
 * Distances: cannot be negative (absolute values)
 * Deviation: can be negative or positive, because eventually it will be squared [just squared here, not square rooted], which will make all values positive
 * Variance: the mean of the squared deviations for a data set
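The steps can be sketched as follows. The standard deviation is the square root of the variance. This sketch divides by the number of data values, matching "the mean of the squared deviations"; the data set and function names are made up for illustration.

```python
from math import sqrt


def deviations(values):
    """Each value minus the mean; these can be negative or positive."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def variance(values):
    """The mean of the squared deviations (squaring makes every term non-negative)."""
    devs = deviations(values)
    return sum(d ** 2 for d in devs) / len(devs)


def standard_deviation(values):
    """The square root of the variance, back in the original units of the data."""
    return sqrt(variance(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with a mean of 5
print(deviations(data))          # positive and negative values that sum to 0
print(variance(data))
print(standard_deviation(data))
```

Because the deviations always sum to zero, they have to be squared (or turned into distances) before averaging, which is why the variance works with squared deviations.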


 * collecting data is experimental

__**Trial:**__ one run of an experiment [the trial will depend upon the question being asked!]

Example:
 1. 2 (this is one trial) [not clear here; is getting a 2 one trial? But you don't start trial number 2 until you have read 3 numbers; I'm confused. Which is it?]
 2. 2
 3. 1

Trial Number 2:
 1. 1
 2. 3
 3. 5