@caspercasanova
Created September 28, 2024 04:44

Topic: Flint Water Crisis

(Confidence Interval from One-sample Proportion)

  1. List at least 5 ways that you used your home’s water this week.

    • Laundry
    • Dishwasher
    • Coffee
    • Sink
    • Sprinklers
  2. An important EPA regulation is the Lead and Copper Rule: No more than 10% of households can have prominent lead levels (defined as >15 parts per billion) in their water. Lead contamination can cause many health issues. Imagine your water failed this rule, and the issue wouldn’t be fixed for several years. Discuss how would your life change in some ways and the health risks that will impact you/your family?

    • I would probably be forced to relocate. Until I found a new living situation, I'd be buying bottled water and different filtering systems, along with lead test kits. If my own pipes were contaminated, I'd need to replace them. If I owned the house, selling it would be difficult because buyers do not want a home with contaminated water. On the health side, lead exposure is a serious risk for my family, especially for children.
  3. Sampling Flint Homes:

| Address of Yes Homes | Address of No Homes |
| --- | --- |
| 724 E KEARSLEY ST | 4321 M L KING AVE |
| 2420 WINONA ST | 3321 HAWTHORNE DR |
| 2414 FLUSHING RD | 3757 WORCHESTER DR |
| 1429 LINCOLN DR | 638 W LORADO AVE |
| 2024 CHURCH ST | 3525 LEITH ST |
| 2527 BROWNELL BLVD | 2110 STANFORD AVE |
| 1714 MISSOURI AVE | 1116 W MOORE ST |
| 132 W JAMIESON ST | 2830 YALE ST |
| 2500 N VERNON AVE | 202 W ALMA AVE |
| 3409 CHURCHILL AVE | 1615 CROMWELL AVE |
| | 3309 CLAIRMONT ST |
| | 719 DELL AVE |
| | 3152 RISEDORPH AVE |
| | 1818 W HOBSON AVE |
| | 1414 RASPBERRY LN |

| Sampled Homes Results | Value |
| --- | --- |
| Number of homes with prominent lead (Yes) | 10 |
| Number of homes without prominent lead (No) | 15 |
| Total number of homes sampled (sample size) | $n = 25$ |
| Sample proportion ($\hat{p}$) of homes with prominent lead levels (Round to 3 decimals) | $\hat{p} = 10/25 = .400$ |

Does your sample proportion suggest that the city may be violating the EPA regulation? Does it prove that the regulation has been violated? Discuss/Explain why or why not.

  • The sample proportion (40%) is far above the 10% limit, so it suggests the city may be violating the regulation. It does not prove a violation, though. This is a single sample of only 25 homes, and because of sampling variability the true proportion for all Flint homes could differ from 0.400.
  1. Using your sampled results, we will now determine an interval of plausible values for the true proportion of homes with prominent lead levels.

    1. Who/What is the population of interest in this study? All homes in Flint, Michigan.

    2. Calculate and describe the sampling distribution for the sample proportions. (Round to 3 decimals.)

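      • Shape: Because $np = (25)(0.4) = 10 \geq 10$ and $n(1-p) = (25)(0.6) = 15 \geq 10$ (using $\hat{p} = 0.4$ as the estimate of $p$), the sampling distribution of $\hat{p}$ is approximately normal.
      • Measure of Center: the mean of the sampling distribution is $p$, estimated by $\hat{p} = .4$
      • Measure of Spread: standard error $\sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.4(1-0.4)}{25}} = .098$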
    3. Calculate a 95% confidence interval for the true proportion of Flint’s homes that have prominent lead levels. Round to 3 decimals.

      • $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = .400 \pm 1.96(.098) = (.208, .592)$
    4. Use your interval to evaluate whether Flint’s water system violates the EPA regulation (no more than 10% of a city’s homes can have prominent lead levels). Clearly explain your reasoning using statistical evidence.

      • I am 95% confident that the true proportion of homes in Flint, Michigan with prominent lead levels is between 20.8% and 59.2%. Because the entire interval lies above 10%, the data give strong evidence that Flint's water system violates the EPA regulation.
    5. Using your sample data, calculate the sample size needed if you wanted to be 95% confident and have no more than 1% margin of error

      • $n = \frac{Z^2 \cdot p \cdot (1-p)}{E^2} = \frac{1.96^2 \cdot 0.4 \cdot (1-0.4)}{0.01^2} = 9219.84$, so at least 9,220 homes are needed (always round up).
    6. In reality, before conducting their study, Dr. Edwards and his team had no idea what proportion of homes would have lead! If they want to guarantee that their 95% interval’s margin of error is no more than 3%, what sample size do they need?

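      • With no prior estimate of $p$, use the most conservative value $p = 0.5$, which maximizes $p(1-p)$: $n = \frac{Z^2 \cdot p \cdot (1-p)}{E^2} = \frac{1.96^2 \cdot 0.5 \cdot (1-0.5)}{0.03^2} = 1067.11$, so they need at least 1,068 homes.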
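
The two sample-size calculations above can be checked with a short Python sketch. The function name `required_sample_size` and its parameters are my own and are not part of the worksheet. The sketch uses the exact normal quantile instead of the rounded 1.96, which gives the same answers after rounding up.

```python
import math
from statistics import NormalDist


def required_sample_size(margin, p_guess=0.5, confidence=0.95):
    """Smallest n for which a one-proportion z-interval has margin of error <= margin.

    p_guess defaults to 0.5, the conservative choice when no estimate of p exists.
    """
    # Critical value z* (about 1.96 for 95% confidence).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n_exact = z**2 * p_guess * (1 - p_guess) / margin**2
    # A fractional home cannot be sampled, so always round up.
    return math.ceil(n_exact)


# 1% margin of error, planning value p = 0.4 taken from the sample of 25 homes.
print(required_sample_size(0.01, p_guess=0.4))  # 9220

# 3% margin of error with no prior information, so p = 0.5.
print(required_sample_size(0.03))  # 1068
```

Using $p = 0.5$ guarantees the margin of error no matter what the true proportion turns out to be, because $p(1-p)$ is largest at 0.5.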
  2. In the actual Virginia Tech study, researchers randomly sampled water from 252 homes in Flint. Of those, 42 had high lead content. Find a 95% confidence interval using this sample (assume all conditions are met). Is this interval wider or narrower than the one you got with your sampled data? Explain/Discuss why do you think this is?

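    • $n = 252$
    • $\hat{p} = 42/252 = .167$
    • Standard error: $\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{0.167(1-0.167)}{252}} = .023$
    • 95% confidence interval (calculator: 1-PropZInt) = $(.121, .213)$
    • This interval is narrower than the one from my sample of 25 homes (width .092 versus .384). A larger sample size gives a smaller standard error, so the margin of error shrinks and the interval narrows.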
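
As a cross-check on the calculator output, here is a minimal Python sketch of the one-proportion z-interval applied to both samples. The function name `one_prop_z_interval` is my own, not something from the worksheet, and the sketch assumes the normal approximation conditions are met.

```python
import math
from statistics import NormalDist


def one_prop_z_interval(successes, n, confidence=0.95):
    """Return (low, high) for a one-proportion z confidence interval."""
    p_hat = successes / n
    # Critical value z* (about 1.96 for 95% confidence).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


# Compare the sample of 25 homes with the 252-home Virginia Tech study.
for label, yes_homes, n in [("sample of 25", 10, 25), ("Virginia Tech", 42, 252)]:
    low, high = one_prop_z_interval(yes_homes, n)
    print(f"{label}: ({low:.3f}, {high:.3f}), width {high - low:.3f}")
```

Both intervals sit entirely above 0.10, but the larger study pins the proportion down much more tightly.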
  3. Due to the confidence interval produced by Dr. Edwards and the Virginia Tech research team, the City of Flint finally acknowledged problems with the water. This began a multi-year effort to correct the city’s water system. In particular, Flint has undergone a massive digging project, replacing lead or galvanized pipes with safer materials (e.g., copper). Choose one of the sampled homes that had contamination and search the address in this current database. Have the lead or galvanized pipes been replaced in the years since the crisis? If so, when? Copy/Paste the textbox from the website that has this information showing the address you’ve selected.

    Seems like they have.

Address: 724 E KEARSLEY ST

The current water service line is made of **COPPER**  
The old line was replaced on June 14, 2021.  
The old water service line was LEAD on the public side and GALVANIZED on the private side.  
For more information, click [here](https://flintpipemap.org/info#q9)

Making Errors

Scenario: Kendall Footwear (a shoe company in Michigan) disposed of chemicals (PFAS) which have leaked into the groundwater. The state’s drinking water limit of 70 parts per trillion (ppt) is considered safe, while anything above 70 is considered dangerous. Officials believe the water in nearby towns may also be unsafe. They take a random sample of 200 households from nearby towns. They find the average PFAS level of the sample is 70.5 ppt and they perform a significance test at the 5% level of significance.

  1. State the appropriate null/alternative hypotheses for performing a significance test using words and symbols.
Null ($H_0$): The true mean PFAS level in the nearby towns' water is within the safe limit, $\mu = 70$ ppt (or $\mu \leq 70$).
Alternative ($H_a$): The true mean PFAS level is above the safe limit, $\mu > 70$ ppt.
  1. After conducting a significance test, a P-value of 0.045 is found. Discuss/explain this value in this context
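    • If the true mean PFAS level were really 70 ppt (the null hypothesis is true), there would be a 4.5% chance of getting a sample mean of 70.5 ppt or higher from a random sample of 200 households purely by chance. The P-value is not the probability that the null hypothesis is true or false.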
  2. Based on the P-value, should the town keep the current water or switch to bottled water? Discuss/Explain reasons.
    • Since the P-value (0.045) is below the significance level (0.05), we reject the null hypothesis. There is convincing evidence that the mean PFAS level is above 70 ppt, so the town should switch to bottled water.
  3. Let’s suppose the decision is wrong. Discuss/explain what type of testing error is made and what would be a consequence of this error?
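    • We rejected the null hypothesis, so if that decision is wrong we made a Type I error: the water was actually safe, but we concluded it was not. The consequence is that the town pays for bottled water it did not need.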
  4. How often would this error occur?
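    • The probability of a Type I error equals the significance level, $\alpha = 0.05$, so it would occur about 5% of the time when the water is actually safe.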
  5. Now suppose the P-value is 0.14. Should the town keep the current water or switch to bottled water? Discuss/Explain reasons.
    • Because the P-value (0.14) is above the significance level (0.05), we fail to reject the null hypothesis. There is not convincing evidence that the mean PFAS level is above 70 ppt, so the town would keep the current water.
  6. Let’s suppose this decision is wrong. Discuss/explain what type of testing error is made and what would be a consequence of this error?
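    • We failed to reject the null hypothesis, so if that decision is wrong we made a Type II error: the water was actually unsafe, but we did not detect it. The consequence is that people keep drinking contaminated water.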
  7. Discuss/explain which type of error is more serious and why?
    • A Type II error is more serious, because people would keep drinking unsafe water and it would affect their health, whereas a Type I error would only mean extra costs for bottled water.
  8. The chemical engineers who are performing the tests have estimated the power of the hypothesis test to be 0.667. Discuss/Explain the meaning of this value (power = 0.667) in the context of this scenario.
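    • Power = 0.667 means that if the true mean PFAS level really is above 70 ppt (at the alternative value the engineers assumed), the test has a 66.7% chance of correctly rejecting the null hypothesis and detecting that the water is unsafe. Equivalently, the probability of a Type II error is $1 - 0.667 = 0.333$.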
  9. What are 2 ways the engineers could increase the power of the hypothesis test? Discuss what considerations (pros and cons) need to be made in each attempt to increase the power?
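    • Two options are increasing the sample size and increasing the significance level (for example, from 5% to 10%). A larger sample costs more time and money for collection and lab testing, but it does not raise the chance of a Type I error. A higher significance level costs nothing extra, but it makes a Type I error (a false alarm that sends the town to bottled water unnecessarily) more likely.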
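
To see how sample size and significance level affect power, here is a rough Python sketch for a one-sided z-test of $H_0: \mu = 70$ against $H_a: \mu > 70$. The scenario does not give a standard deviation or the true mean the engineers assumed, so `sigma = 4.0` and `mu_true = 70.6` below are hypothetical values chosen only for illustration, and the function name `power_one_sided_z` is my own.

```python
import math
from statistics import NormalDist


def power_one_sided_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of a one-sided (upper-tail) z-test when the true mean is mu_true."""
    std_normal = NormalDist()
    # Reject H0 when the z statistic exceeds this critical value.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # How many standard errors the true mean sits above the null value.
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))
    # Probability the z statistic lands in the rejection region.
    return 1 - std_normal.cdf(z_crit - shift)


# Hypothetical inputs (not from the worksheet): sigma and mu_true are assumptions.
sigma, mu_true = 4.0, 70.6

for n, alpha in [(200, 0.05), (400, 0.05), (200, 0.10)]:
    power = power_one_sided_z(70, mu_true, sigma, n, alpha)
    print(f"n={n}, alpha={alpha}: power = {power:.3f}")
```

Under these assumed values, doubling the sample size or raising $\alpha$ both increase power, but only the higher $\alpha$ also increases the Type I error rate.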