Pennsylvania State University Excel Homework 6 Tutorial

The Pennsylvania State Universityexcel Homework 6 Tutorial Written

The Pennsylvania State Universityexcel Homework 6 Tutorial Written by Peter Chamberlain Directions for Making a Scatterplot 1. Select the range by highlighting the two columns you wish to analyze. The independent variable must be located to the left of the dependent variable. If necessary, you may need to move a column in order to make the column on the left the independent variable. To make things easier, you can put the columns next to each other and click on the Column Letter for the independent variable and drag to the Column Letter for the dependent variable. This will highlight both columns. 2. Click Insert and then on the image for the Scatterplot in the chart section (circled below). Click the upper left scatterplot when given the option. 3. Excel will default to titling the chart to the name of your independent variable. Click in the title so that you can edit the title and call the chart Scatterplot. 4. Clicking in the chart will cause the plus sign, paint brush, and filter on the right-hand side to appear. Click on the plus sign and a menu for Chart Elements such as grid lines, titles, axis titles, etc., will appear. Name your X axis after your independent variable and your Y axis after your dependent variable. 5. If your Scatterplot has a Legend or Gridlines, you can remove those by clicking Legend or Gridlines as appropriate from the plus sign and unchecking any checked boxes for those. 0.........000 0.100 0.200 0.300 0.400 0.500 Scatterplot SLG © The Pennsylvania State University Installing the Data Analysis ToolPak (in case it isn’t already installed in your Excel) – see Excel Tutorial 1 Directions for Getting Correlation 1. Open the data set you will need to use for finding correlation. Make sure that your columns are next to each other and that the independent variable column is immediately to the left of the dependent variable column. 2. Highlight both columns by clicking on the Column Letter for the independent variable and dragging to highlight the Column Letter for the dependent variable. 3. Click on Data. 4. Click on Data Analysis. 5. Click on Correlation and click OK. 6. Confirm that the Input Range is the same as the columns you have selected. 7. Check the box for Labels in First Row. 8. Click OK. Directions for Running a Regression using Data Analysis 1. Open the data set you will need to use for running regression. 2. Click on Data. 3. Click on Data Analysis. 4. Click on Regression and click OK. The following dialogue box will appear. © The Pennsylvania State University 5. Choose your input and output ranges. The range can only go as far as you have data and the number of rows in your input range must equal the number of rows in your output range. You will select each range individually. You can start at the top of the column and scroll down or you can start at the top of the column and use CTRL, Shift, and the Down arrow together to get to the bottom of the column. 6. Check the box for labels if your columns have Labels (they probably will). You can ignore the Confidence Level section. 7. Click OK. If you chose the option to have the output on a New Worksheet Ply, it will show your output on that new ply (or tab if you want to call it that). That option is generally preferred for a cleaner look and also because you may have massively large sets of data that you can work with. 8. Do not check the boxes for Residuals and Normal Probability. Checking these boxes will give you a significant amount of additional output that you will not need. Excel Function: STDEV.S Reminder: Functions start with an equals sign and contain arguments. Functions can be typed into any empty cell. STDEV.S is the function to find the standard deviation of a sample. P-Values in Scientific Notation Very small p-values are often written in scientific notation in Excel output. For example, the p-value . would be written as 4.3E-11. Data GP AB R H 2B 3B HR RBI TB BB SO SB BA OBP SLG OPS OWAR St. Louis Cardinals Adron Chambers .....1 Allen Craig .... Edwin Jackson†.....2 Yadier Molina .... Lance Berkman .... Albert Pujols .... David Freese .....7 Jon Jay .....5 Matt Holliday .....7 Skip Schumaker .....6 Nick Punto .....1 Ryan Theriot .....6 Daniel Descalso .....3 Tony Cruz .....1 Rafael Furcal†.....3 Colby Rasmus†.....6 Gerald Laird .....1 Mark Hamilton .....1 Tyler Greene .....6 Andrew Brown .....2 Pete Kozma .... Kyle Lohse .... Corey Patterson†.....5 Chris Carpenter .... Kyle McClellan .....1 Jake Westbrook .....1 Jaime Garcia .....3 Matt Carpenter .....1 Shane Robinson .....2 Lance Lynn .... Octavio Dotel†.... Ryan Franklin .... Mitchell Boggs .... Eduardo Sanchez .... Jason Motte .... Fernando Salas .... Maikel Cleto .... Arthur Rhodes†.... Miguel Batista†.... Trever Miller†.... Brian Tallet†.... P.J. Walters†.... Marc Rzepczynski†.... Raul Valdes†.... Pittsburgh Pirates Derrek Lee†.... Jason Jaramillo .....2 Pedro Ciriaco .....2 Ryan Doumit .....7 Alex Presley .....3 Neil Walker .....5 Josh Harrison .....4 Chris Snyder .....8 Eric Fryer .... Jose Tabata .....3 Matt Diaz†.....5 Andrew McCutchen .....7 Xavier Paul†.....1 Matt Pagnozzi†.....1 Ronny Cedeno .....2 Garrett Jones .....8 John Bowker†.... Ryan Ludwick†.....2 Lyle Overbay†.....4 Michael McKenry .....3 Brandon Wood†.... Chase d'Arnaud .....2 Ross Ohlendorf .....2 Steve Pearce .....7 Pedro Alvarez .... Paul Maholm .... Dusty Brown .....4 Jeff Karstens .....1 James McDonald .....3 Kevin Correia .....2 Josh Rodriguez .....1 Brad Lincoln .... Charlie Morton .....3 Daniel McCutchen .....1 Jeff Locke .....1 Wyatt Toregas .....1 Brian Burres .... Jason Grilli .... Joe Beimel .... Tony Watson .... Chris Resop .... Garrett Olson .... Joel Hanrahan .... Jose Ascanio .... Evan Meek .... Daniel Moskos .... Tim Wood .... Chris Leroux .... Mike Crotta .... Jared Hughes .... Cincinnati Reds Dontrelle Willis .....8 Yonder Alonso .....9 Zack Cozart .....3 Joey Votto .....7 Brandon Phillips .....9 Homer Bailey .....3 Ramon Hernandez .....5 Chris Valaika .....1 Ryan Hanigan .....4 Miguel Cairo .....8 Juan Francisco .....2 Jay Bruce .....9 Chris Heisey .....5 Edgar Renteria .....4 Drew Stubbs .....9 Dave Sappelt .....2 Scott Rolen .....4 Todd Frazier .....5 Fred Lewis .....2 Paul Janish .....4 Jonny Gomes†.....3 Mike Leake .....2 Devin Mesoraco .....1 Edinson Volquez .....1 Jeremy Hermida†.....2 Sam LeCure .... Bronson Arroyo .....3 Travis Wood .....2 Johnny Cueto .....3 Matt Maloney .....1 Chad Reineke .... Carlos Fisher .... Jordan Smith .... Daryl Thompson .... Francisco Cordero .... Bill Bray .... Nick Masset .... Jared Burton .... Logan Ondrusek .... Aroldis Chapman .... Milwaukee Brewers Mike Rivera .... Logan Schafer .....1 Ryan Braun .....3 Nyjer Morgan .....3 Prince Fielder .....2 Corey Hart .....2 Jerry Hairston Jr.†.....6 Mark Kotsay .....2 Taylor Green .... Rickie Weeks .....3 Jonathan Lucroy .....2 George Kottaras .....6 Yuniesky Betancourt .....1 Marco Estrada .....2 Josh Wilson†.....2 Carlos Gomez .....4 Casey McGehee .....5 Yovani Gallardo .....6 Felipe Lopez†.....3 Craig Counsell .....6 Randy Wolf .... Chris Narveson .... Brandon Boggs .... Shaun Marcum .....4 Zack Greinke .....1 Wil Nieves .....6 Mat Gamel .....5 Erick Almonte .....4 Jeremy Reed .....2 Brett Carroll .....1 Sergio Mitre†.... Kamaron Loe .... Martin Maldonado .... Eric Farris .... Brandon Kintzler .... LaTroy Hawkins .... Takashi Saito .... Sean Green .... Frankie De La Cruz .... Mitch Stetter .... Mark DiFelice .... Tim Dillard .... Danny Herrera†.... John Axford .... Zach Braddock .... Mike Fiers .... Mike McClendon .... Chicago Cubs Carlos Zambrano .....8 Reed Johnson .....3 Starlin Castro .... Aramis Ramirez .....4 Bryan LaHair .....3 Darwin Barney .....2 Marlon Byrd .....6 Kosuke Fukudome†.... Jeff Baker .....2 Blake DeWitt .....5 Tony Campana .....3 DJ LeMahieu .....3 Steve Clevenger .....1 Alfonso Soriano .....7 Geovany Soto .....6 Carlos Pena .....3 Luis Montanez .....3 Koyie Hill .....6 Casey Coleman .....1 Welington Castillo .....1 Tyler Colvin .....7 Randy Wells .....1 James Russell .... Brad Snyder .....2 Matt Garza .....4 Ryan Dempster .....5 Rodrigo Lopez .....3 Doug Davis .....2 Jeff Samardzija .....1 Ramon Ortiz .....1 Jeff Stevens .... Justin Berg .... Marcos Mateo .... Andrew Cashner .... Kerry Wood .... John Grabow .... Sean Marshall .... Carlos Marmol .... John Gaub .... Rafael Dolis .... Scott Maine .... Chris Carpenter .... Excel Using the Homework 6 data set in Canvas and the Excel Homework 6 Tutorial or any other sources, answer all of the questions below. All pages of Homework 6 and Excel Homework 6 must be stapled together. Terms: Batting Average (BA) – number of hits/number of at-bats On Base Percentage (OBP) – number of times a player gets on base/number of plate appearances Problem: We would like to see if there is a relationship between the Batting Average (BA) of players in the National League Central with the On Base Percentage (OBP) of the players. If we conclude there is a relationship, then we can use Batting Average to predict On Base Percentage. Follow all the steps in the tutorial to make a scatterplot of Batting Average (BA) and On Base Percentage (OBP). BA will be your independent variable and OBP will be your dependent variable. You do not need to include the Scatterplot with your homework. 1. Write a short description of what kind of relationship, if any, you see in your scatterplot. (.1) Using Data Analysis in Excel, find the correlation coefficient for BA and OBP by creating a correlation matrix. (If Data Analysis is not loaded in your Excel, follow the instructions in Excel Tutorial 1 to install Data Analysis Toolpak.) This correlation matrix may not be able to be done on a Mac without using the webapp. You do not need to include the correlation matrix with your homework. 2. What is the value of the correlation coefficient found in your correlation matrix? (.. Using the value of the correlation coefficient found in Question 2, write a statement about the strength and direction of the data set. (.1) (Questions 1-3 can be completed after Lecture 27.) Use Data Analysis to run a regression of BA (independent variable) and OBP (dependent variable). (If Data Analysis is not loaded in your Excel, follow the instructions in Excel Tutorial 1 to install Data Analysis Toolpak.) This regression may not be able to be run on a Mac without using the webapp. You will be using this output for the remainder of the questions. You do not need to turn the output in with your homework. 4. Using the appropriate functions, find the sample standard deviation of both BA and OBP. You must handwrite or type the entire function equations (including the equals signs, the function names, and the arguments) and the answers. No credit will be given without the entire equations and the answers. (.1 for BA sample standard deviation, .1 for OBP sample standard deviation) 5. Using the value of r found in Question 2, hand calculate b1. You must show all work. (.. Write an interpretation of the slope beginning with the phrase “On average…â€. You must use the phrase on average and follow the pattern given in class in order to get credit. (.. How does the slope you calculated by hand in Question 5 compare to the slope found in the regression output? (.. Using the regression output, write the regression equation. .. What On Base Percentage would you predict if the Batting Average was .206? As always, you must show all work. (.. Is Batting Average a significant predictor of On Base Percentage? Why or why not? Alpha for this problem is .05. (.1 for answer, .1 for why) 11. What is the value of R-Square? (.. Write a statement to interpret the R-square value. (.1) SCM YOU MUST SHOW WORK TO GET CREDIT. 1. Fill in the following chart for the correlation coefficient. No credit will be given unless the entire chart is filled in. Name Symbol What it Tells Us Abs. or Rel.? (give units if absolute) Boundaries What Extremes Signify Population Correlation Coefficient Sample Correlation Coefficient r Strength + direction in sample None Relative measure -1 +1 Perfect rel. 2. Fill in the following chart for the coefficient of determination. No credit will be given unless the entire chart is filled in. Name Symbol What it Tells Us Abs. or Rel.? (give units if absolute) Boundaries What Extremes Signify Population Coefficient of Determination Sample Coefficient of Determination 3. Fill in the following chart for the standard error of the estimate. No credit will be given unless the entire chart is filled in. Name Symbol What it Measures Abs. or Rel.? (give units if absolute) Boundaries What Extremes Signify Standard Error of the Estimate 4. Fill in the following chart for the regression coefficients. No credit will be given unless the entire chart is filled in. Name Symbol What it Tells Us Abs. or Rel.? (give units if absolute) Boundaries Population Intercept Sample Intercept Population Slope Sample Slope 5. The standard deviation of advertising = $18.25. The standard deviation of sales = $2.57. The correlation is .79. If we are predicting advertising from sales, compute the regression coefficient. Using the phrase given in class (On average, …) and following the pattern given in class, interpret this slope. You must use the phrase on average and follow the pattern given in class to get any credit. You must use the phrase on average and follow the pattern given in class to get any credit . 6. List what we are concluding if we accept the null in a regression problem (p. 179 in the course packet). Then list what we are concluding if we reject the null in a regression problem. You must list all the statements presented in the entire slide (four for the null and five for the alternative) in class in order to get credit . 7. Write a scenario for a simple regression problem. You must include what the two variables are, stating which one is x and which one is y. Remember, y must be quantitative. Do not use the scenarios that are included in the course packet. If you base your idea off of another source, you must state the source.