In this post we look at the total number of confirmed cases of COVID-19 in the USA.
As of two days ago, the number of confirmed cases stood at more than 1.2 million.
The data has not been checked. There is a suspiciously high value on 26 April. This value appeared in two data sets from two sources.
First look at the bar chart:
Also look at the moving averages:
It would appear that little can be gained from the above graph.
However, the red curve may be close to linear.
This suggests that linear regression may produce a suitable straight line for estimating the total number of cases later.
We warn that extrapolation (estimation of future values) is very unreliable.
We use Excel to generate a straight line (least squares regression line) estimated from 5 April (Day 76) to 6 May (Day 107).
The straight line obtained has the formula y = mx + c where
m ≈ -158.6 and
c ≈ 43,467.4
i.e. y = -158.6x + 43,467.4
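For readers without Excel, the same kind of least squares fit can be sketched in a few lines of Python. This is only an illustration: the function name `least_squares_line` is our own, and the input below is a placeholder generated from the rounded line above, not the real daily case counts.

```python
def least_squares_line(xs, ys):
    """Return (m, c) for the least squares line y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c

# Placeholder input only: points generated from the post's rounded line,
# NOT the real daily case counts (which are not reproduced here).
days = list(range(76, 108))  # Day 76 (5 April) to Day 107 (6 May)
placeholder = [-158.6 * d + 43467.4 for d in days]

m, c = least_squares_line(days, placeholder)
print(round(m, 1), round(c, 1))
```

With the real daily counts in place of `placeholder`, the function should return the same slope and intercept as Excel's regression line, up to rounding.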
This crosses the x-axis at x = 274 (Day 274 = 20 October).
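The crossing point and its calendar date can be checked with a short Python sketch. We assume here that Day 1 is 21 January 2020, which is the start date implied by Day 76 being 5 April; the helper name `day_to_date` is our own.

```python
from datetime import date, timedelta

M = -158.6   # slope from the post (rounded)
C = 43467.4  # intercept from the post (rounded)

# Assumption: Day 1 = 21 January 2020, so that Day 76 = 5 April.
DAY_ONE = date(2020, 1, 21)

def day_to_date(day_number):
    """Convert a day number in the post's numbering to a calendar date."""
    return DAY_ONE + timedelta(days=day_number - 1)

# The line y = M*x + C reaches zero where x = -C / M.
x_intercept = -C / M
zero_day = round(x_intercept)
print(round(x_intercept, 2), zero_day, day_to_date(zero_day))
```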
This line is represented by adding red bars to the above bar chart:
We can use the area of the triangle eventually generated from the red bars to estimate the total number of confirmed cases.
The area of this triangle is
0.5 x (274 - 76) x 31,412.1
i.e. half of the number of days between Day 76 and Day 274, multiplied by the straight-line estimate for 5 April (Day 76).
This gives us approximately 3.11 million cases.
We need to add on the 312,237 existing cases on 5 April to get 3,422,519
which is the estimated total number of confirmed COVID-19 cases.
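The triangle calculation can be reproduced with the following Python sketch, using the rounded figures quoted in this post. The function name `triangle_total` is our own. With rounded inputs the result comes out a few hundred cases below 3,422,519, presumably because the spreadsheet worked with unrounded values.

```python
def triangle_total(start_day, zero_day, daily_at_start, existing_cases):
    """Area under a straight line falling to zero, plus cases already counted."""
    # Triangle area = half of base (days) times height (daily cases at start).
    area = 0.5 * (zero_day - start_day) * daily_at_start
    return area, area + existing_cases

# Rounded inputs taken from the post.
area, total = triangle_total(
    start_day=76,            # 5 April
    zero_day=274,            # 20 October
    daily_at_start=31412.1,  # straight-line estimate for Day 76
    existing_cases=312237,   # confirmed cases up to 5 April
)
print(f"{area:,.1f}", f"{total:,.1f}")
```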
We also note that on 23 April we have a low point on the bar graph of 17,588 confirmed cases.
The cumulative number of cases for 23 April is 842,629.
Should this point end up being the midpoint then the total number of confirmed cases would be 2 x 842,629 = 1,685,258.
We need to add on 10% to allow for a long tail giving 1,853,784 cases.
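The midpoint estimate is simple enough to check in a couple of lines of Python. The function name `midpoint_estimate` and the `tail_fraction` parameter are our own labels for the doubling and the 10% allowance described above.

```python
def midpoint_estimate(cumulative_at_midpoint, tail_fraction=0.10):
    """Double the cumulative count at the assumed midpoint, then add a tail."""
    symmetric_total = 2 * cumulative_at_midpoint
    return symmetric_total, symmetric_total * (1 + tail_fraction)

# 842,629 is the cumulative number of cases on 23 April.
symmetric, with_tail = midpoint_estimate(842629)
print(symmetric, round(with_tail))
```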
Allowing the same 10% for a long tail on the triangle estimate (about 3.76 million), we now have a range of roughly 1.85 million to 3.75 million confirmed COVID-19 cases in the USA for this outbreak.
You may need to add on an extra 5% to the range to include probable cases.
A number of institutions believe the USA outbreak will end early in August.
By the above results, a total of about 3,001,000 cases would be reached on 9 August.
However, if 9 August is near the end of the outbreak (excluding a long low tail), we want the number of new cases on that day to be close to zero (not the roughly 11,400 that the straight line gives).
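The figure of roughly 11,400 can be checked by evaluating the straight line on 9 August. The sketch below again assumes Day 1 is 21 January 2020 and uses the rounded slope and intercept, so it agrees with the post's figure only to the nearest hundred; `date_to_day` is our own helper name.

```python
from datetime import date

M, C = -158.6, 43467.4       # rounded slope and intercept from the post
DAY_ONE = date(2020, 1, 21)  # assumption: Day 1, so that Day 76 = 5 April

def date_to_day(d):
    """Convert a calendar date to the post's day numbering."""
    return (d - DAY_ONE).days + 1

day = date_to_day(date(2020, 8, 9))
predicted = M * day + C  # straight-line estimate of new cases that day
print(day, round(predicted))
```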
Assuming zero new cases on 9 August, and using the triangle area as before, we have a straight-line estimate of 26,494.8 new cases on 6 May and 95 days from 6 May to 9 August.
This gives a total of 1,258,502.8 cases (.5 x 95 x 26,494.8) from 6 May to 9 August.
We need to add on the estimated 1,207,334.9 total number of cases up to 6 May.
This gives a total of 2,465,837.7 cases.
Once a long tail is taken into consideration (10% more), we get 2,712,421.5 cases which we round down to 2.7 million cases.
In this scenario, we estimate the total number of cases to be in the range 2.45 – 2.7 million cases.
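The arithmetic for this scenario can be laid out as a Python sketch. The function name `tapering_total` is our own, and the year 2020 is assumed for the dates. Because the inputs are the rounded figures quoted above, the results differ from the post's by a fraction of a case.

```python
from datetime import date

def tapering_total(daily_now, start, end, cases_so_far, tail_fraction=0.10):
    """Triangle estimate when daily cases fall linearly to zero by `end`."""
    days = (end - start).days
    area = 0.5 * days * daily_now   # cases still to come
    subtotal = area + cases_so_far  # add cases already counted
    return days, area, subtotal, subtotal * (1 + tail_fraction)

days, area, subtotal, with_tail = tapering_total(
    daily_now=26494.8,       # straight-line estimate for 6 May
    start=date(2020, 5, 6),
    end=date(2020, 8, 9),
    cases_so_far=1207334.9,  # estimated total up to 6 May
)
print(days, f"{area:,.1f}", f"{subtotal:,.1f}", f"{with_tail:,.1f}")
```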
My other COVID-19 posts can be found here:
Data for my posts can be found at:
I share my posts at: