Final Individual Homework Assignment
Instructions: The following 10 activities relate to concepts, tools, and applications in Exploratory Data Analysis (EDA), Data Visualization and Data Storytelling. This assignment is worth 60% of your grade.
It is due on Feb 24, 2022 11:59 PM.
Provide answers to all 10 activities in a single PDF document. Include your name, the date (no cover page necessary) and any references used. Do not include the questions in your document; the answers alone are sufficient. One of the questions also requires a Tableau file for submission. Do not zip these files together, as the PDF component will need to go through a plagiarism check.
Page limit: 10 pages MAX.
Activity 1: Storyboard: Technary Case Study and Data Analysis
You are a statistician for a large manufacturing company, Technary. The company produces a product, product X. Many other competitors also produce product X. You are presenting to a group of statisticians and mathematicians at your company. The goal of your project is to present ways your company can gain a competitive edge in this industry as it pertains to the production of product X. Assume the audience has a HIGH level of knowledge in statistics and reading/interpreting advanced plots.
For the following case, create six slides: 1 title slide, 3 conflict slides, 1 big idea slide and 1 resolution slide (landing page) with three subsections. Do not put all 6 slides on one page if the text and/or visualizations are not legible. Here are the requirements:
Slide 1) Use an effective and attention-grabbing title for the title slide.
Slide 2) Situation 1: Use tab 1 (Technary Part A) in the Excel workbook. The ideal weight of product X is 450 units across the industry. This means consumers of product X prefer this product to be exactly 450 units no matter which company they buy it from. However, the companies sell varying weights of this product because of inefficiencies in the machines that produce it. In Tab 1, you have sample weights of 1000 products produced at Technary and at 4 different competitors in the span of one hour (assume you stood at the production line at each of these 5 companies, measuring and recording the weight of 1000 products). You want to compare the distribution of weights across these 5 companies to assess the quality of the machines at each company. More specifically, you want to comment on the precision and accuracy of your machines relative to those of the other companies. Use the correct statistical plot. Use headings/subheadings and explanatory text in your slide. Accuracy refers to how close a measurement is to the true or accepted value. Precision refers to how close measurements of the same item are to each other. After you finish this slide, use your analysis and findings to build your big idea and support your resolution.
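As a starting point for Situation 1, the definitions of accuracy and precision can be turned into two numbers per company. The Python sketch below is illustrative only. The company labels and weights are made-up placeholders, not the values in Tab 1, and the function name `summarize_weights` is hypothetical. It assumes accuracy is measured as the gap between the sample mean and the 450-unit target, and precision as the sample standard deviation; other metrics would also be defensible.

```python
from statistics import mean, stdev

TARGET_WEIGHT = 450  # ideal weight of product X, taken from the assignment text


def summarize_weights(weights, target=TARGET_WEIGHT):
    """Return (bias, spread) for one company's sample of weights.

    bias   = sample mean minus target (closer to 0 suggests higher accuracy)
    spread = sample standard deviation (smaller suggests higher precision)
    """
    return mean(weights) - target, stdev(weights)


# Placeholder samples, NOT the assignment data (Tab 1 has 1000 weights per company).
samples = {
    "Company A (hypothetical)": [449, 450, 451, 450, 450],
    "Company B (hypothetical)": [440, 460, 445, 455, 450],
    "Company C (hypothetical)": [459, 460, 461, 460, 460],
}

for name, weights in samples.items():
    bias, spread = summarize_weights(weights)
    print(f"{name}: bias = {bias:+.2f}, spread = {spread:.2f}")
```

In these placeholder samples, Company A is on target with little spread, Company B is on target on average but with a wide spread, and Company C has little spread but sits about 10 units above the target. A summary like this can support the plot on your slide, but it does not replace it.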
Slide 3) Situation 2: People are needed on the assembly line in certain areas of the production of product X. The speed at which these employees need to work depends on the speed of the belt. When there is high demand for Product X, the speed of the belt is increased. If a semi-finished product waits too long on the assembly line (that is, if it is not addressed by an employee), it can spoil and must be thrown out. Normally, the company is okay with some waste. In tab 2 (Technary Part B), you have the number of waste products per week over one year. Waste is considered normal if the waste in a week is within 2 standard deviations of the mean. Waste outside this range, whether higher or lower, means the process is out of control. Comment on whether or not the level of waste is ever out of control. Use the appropriate statistical plot. Follow the same instructions as you did for slide 2. Use your analysis and findings to build your big idea and support your resolution.
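The 2-standard-deviation rule in Situation 2 can be checked numerically before you draw anything. The following Python sketch is one possible way to do it, under stated assumptions: the weekly values are made-up placeholders rather than the data in tab 2, the function name `out_of_control_weeks` is hypothetical, and the sample standard deviation is used (you may decide the population version is more appropriate).

```python
from statistics import mean, stdev


def out_of_control_weeks(weekly_waste, n_sigma=2):
    """Return (lower, upper, flagged) for a list of weekly waste counts.

    lower/upper are the limits mean -/+ n_sigma * sd.
    flagged is a list of (week_number, value) pairs outside those limits,
    with weeks numbered from 1.
    """
    center = mean(weekly_waste)
    sd = stdev(weekly_waste)  # sample standard deviation (an assumption)
    lower = center - n_sigma * sd
    upper = center + n_sigma * sd
    flagged = [
        (week, value)
        for week, value in enumerate(weekly_waste, start=1)
        if value < lower or value > upper
    ]
    return lower, upper, flagged


# Placeholder values, NOT the assignment data (tab 2 covers a full year of weeks).
waste = [10, 12, 11, 9, 10, 11, 30, 10, 12, 9]

lower, upper, flagged = out_of_control_weeks(waste)
print(f"limits: {lower:.2f} to {upper:.2f}")
print("out-of-control weeks:", flagged)
```

With these placeholder values, only week 7 falls outside the limits. The same limits can be drawn as horizontal reference lines on whichever plot you choose for the slide.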
Slide 4) Situation 3: A special chemical called Elerium-128 is used at every step of the production process of Product X. All companies must use Elerium-128 in their production of Product X. You believe that Elerium-128 is not being used efficiently in all the processes and that, compared to other companies, your usage of Elerium-128 is off. In tab 3 (Technary Part C), you can see the percentage of total Elerium-128 used at various stages of the production process at Technary and at a few competitors. Show the correct plot and follow the instructions in slides 2 and 3.
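To judge whether Technary's Elerium-128 usage is "off" in Situation 3, it can help to compare each stage's share against the competitors' average share. The Python sketch below shows one way to do this. Everything in it is an assumption for illustration: the stage names, the competitor labels and the percentages are placeholders and not the values in tab 3, and the function name `share_gaps` is hypothetical.

```python
from statistics import mean


def share_gaps(own_shares, competitor_shares):
    """Return {stage: own share minus mean competitor share}, in percentage points.

    own_shares:        {stage: percent of total Elerium-128 used at that stage}
    competitor_shares: {competitor: {stage: percent}} with the same stages
    """
    gaps = {}
    for stage, own in own_shares.items():
        peer_mean = mean(shares[stage] for shares in competitor_shares.values())
        gaps[stage] = own - peer_mean
    return gaps


# Placeholder percentages, NOT the assignment data.
technary = {"Stage 1": 50, "Stage 2": 30, "Stage 3": 20}
competitors = {
    "Competitor 1 (hypothetical)": {"Stage 1": 30, "Stage 2": 40, "Stage 3": 30},
    "Competitor 2 (hypothetical)": {"Stage 1": 40, "Stage 2": 30, "Stage 3": 30},
}

for stage, gap in share_gaps(technary, competitors).items():
    print(f"{stage}: {gap:+.1f} percentage points vs. competitor average")
```

A positive gap means Technary uses a larger share of its Elerium-128 at that stage than the competitors do on average. In the placeholder data, Stage 1 stands out at 15 points above the competitor average.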
Slide 5) The big idea goes here. It must be negatively framed.
Slide 6) Create a landing page with resolutions that address the conflicts introduced in slides 2, 3 and 4. Tie it back to the business question and the big idea. In the slides that would come after this resolution landing slide, you would ideally explore these ideas further in a Minto Pyramid style format, but those further resolution slides are not required for this activity.
The purpose of this exercise is to show how data analysis and data visualization can be used in the earlier parts of your data presentation (setting, characters, conflict).
Activity 2: Build the appropriate graph: Life Expectancy vs Income
For this activity, you will submit one slide using the data in the tab Activity 2 Income v Life Expec.
You are tasked to create the appropriate visualization to compare the relationship between Income (GDP per capita) and Life expectancy (years) of various countries across two time points (1800 and 2015). You will need to show both of these time points. Furthermore, you will also need to encode a third variable, population size, and how that has changed for each country across the two time periods.
Create a slide with your visualization.
- Use an appropriate headline to highlight the main relationship.
- Use an appropriate subheading to comment on the following:
  - Highlight the country that had the largest increase in life expectancy from 1800 to 2015.
  - Highlight the country with the largest population in 2015.
  - Highlight the country or countries that showed a decrease in income from 1800 to 2015.
Use explanatory text if appropriate.
This is an advanced graph: a combination of two graphs. Think about how many variables you are asked to represent. Think about the two graphs I discussed in class: one to highlight the association of three quantitative variables, and a second to highlight differences between two time periods (a before-and-after). You do not have to use all the countries in the dataset. Focus on a few interesting ones and use contrast to highlight the ones you like. But make sure to include the answers to the three highlight items above.
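Before building the visualization, you may want to identify the three countries (or groups of countries) the subheading asks you to highlight. The Python sketch below shows one way to find them. It is a sketch under assumptions: the field names in `CountryRecord` are a guess at how you might organize the tab's columns, the function name `find_highlights` is hypothetical, and the three records are made-up placeholders, not rows from the dataset.

```python
from dataclasses import dataclass


@dataclass
class CountryRecord:
    """One country at both time points (field names are assumptions)."""
    name: str
    income_1800: float  # GDP per capita
    income_2015: float
    life_1800: float    # life expectancy in years
    life_2015: float
    pop_1800: float
    pop_2015: float


def find_highlights(records):
    """Return the countries the subheading should call out."""
    biggest_gain = max(records, key=lambda r: r.life_2015 - r.life_1800)
    most_populous = max(records, key=lambda r: r.pop_2015)
    income_drops = [r.name for r in records if r.income_2015 < r.income_1800]
    return {
        "largest_life_expectancy_gain": biggest_gain.name,
        "largest_population_2015": most_populous.name,
        "income_decreased": income_drops,
    }


# Placeholder records, NOT rows from the assignment data.
records = [
    CountryRecord("Country A (hypothetical)", 1000, 40000, 35, 80, 5, 60),
    CountryRecord("Country B (hypothetical)", 800, 600, 30, 55, 2, 20),
    CountryRecord("Country C (hypothetical)", 900, 12000, 32, 75, 300, 1300),
]

for label, value in find_highlights(records).items():
    print(f"{label}: {value}")
```

Once you know which countries to highlight, you can de-emphasize the rest (for example, in grey) and use contrast for the highlighted ones.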
Activity 3: Declutter + Gestalt Principle (Connection): Power outage Hurricane Irma/Wilma
Redraw the messy Hurricane Irma/Wilma graph below to improve it. Use the data in the tab Activity 3: Hurricane Irma Wilma. Think about choosing the best graphical form and maximizing the data–ink ratio. You must use the Gestalt principle of connection effectively in your revised visualization. Use the information provided about the audience and the context of the discussion to make choices that focus the audience on the key comparison. For this activity, submit only one slide with the visualization and any other detail you want to include.
Power outages following Hurricanes Wilma and Irma
|Context|An introductory slide as part of a longer presentation|
|Audience|Community leaders seeking an update on how well Florida recovered from Hurricane Wilma as compared to Irma|
|Communicator|A trusted representative from the US Energy Information Administration, which measures issues related to US energy infrastructure|
|Goal|Indicate that Hurricane Irma left more customers without power than Hurricane Wilma, but recovery from Hurricane Irma has been faster|