FrontierMath
Algebra II · Lesson 4

Regression Models

When data curves, a line won't fit — here's what to use instead.


In 2020, researchers tracking COVID-19 case counts noticed something alarming: the numbers weren't growing by a fixed amount each day, they were roughly doubling. A straight line fit to the first week of data predicted 500 cases by day 14. The actual count was over 8,000. The line wasn't wrong because of bad data — it was wrong because the wrong model was chosen. Picking the right regression model is the difference between a useful prediction and a dangerous one.

You already know linear regression: fit a line to data that grows at a roughly constant rate. But many real situations don't grow that way. Data that curves upward faster and faster often follows an exponential model. Data that rises, peaks, and falls often follows a quadratic model. Data that grows quickly at first and then slows down often follows a power model. Your graphing calculator can find the best-fit equation for each of these, just as it finds a line of best fit.

The three models you need to know are:

$y = ab^x$ (exponential)
$y = ax^2 + bx + c$ (quadratic)
$y = ax^b$ (power)

In every case, $a$, $b$, and $c$ are constants your calculator finds by minimizing the total error between the model and the actual data points. The process on the calculator is identical for all three: enter data into lists, run the regression, read the equation, check the correlation.
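If you want to see what a regression command is doing without a calculator, here is a minimal Python sketch for the power model. It assumes the fit is done by taking logarithms of both $x$ and $y$ and fitting a straight line, which is one common way to do power regression; your calculator's exact method may differ. The data values and the helper name `linear_fit` are made up for illustration and are not part of this lesson's data sets.

```python
import math

# Illustrative values only (not from the lesson): generated from y = 2 * x^0.5
xs = [1, 4, 9, 16]
ys = [2, 4, 6, 8]


def linear_fit(us, vs):
    """Least-squares line v = slope * u + intercept (hypothetical helper)."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    suu = sum((u - mean_u) ** 2 for u in us)
    suv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    slope = suv / suu
    intercept = mean_v - slope * mean_u
    return slope, intercept


# Taking logs turns y = a * x^b into ln(y) = ln(a) + b * ln(x), a straight line.
slope, intercept = linear_fit([math.log(x) for x in xs],
                              [math.log(y) for y in ys])
a = math.exp(intercept)  # undo the log to recover a
b = slope                # the slope of the log-log line is the exponent b

print(f"y = {a:.3f} * x^{b:.3f}")
```

Because the made-up data follows a power rule exactly, the sketch recovers $a = 2$ and $b = 0.5$ up to floating-point rounding. An exponent between 0 and 1 is what produces the "fast at first, then slower" shape.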

The correlation coefficient $r$ measures how well a linear model fits data, and $r^2$ (called the coefficient of determination) measures how well any regression model fits. An $r^2$ value of 1 means the model explains every bit of variation in the data perfectly. An $r^2$ value of 0.94 means the model accounts for 94% of the variation; the remaining 6% is scatter the model doesn't capture. Higher is better, but context matters: a model with $r^2 = 0.97$ that makes no physical sense is still a bad model.
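To make the "fraction of variation explained" idea concrete, here is a small Python sketch of one standard way to compute $r^2$: one minus the ratio of leftover squared error to total squared variation. The function name `r_squared` and the numbers are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def r_squared(actual, predicted):
    """Return 1 - SS_res / SS_tot for paired actual and predicted values."""
    mean_actual = sum(actual) / len(actual)
    # Squared error the model leaves unexplained
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    # Total squared variation of the data around its mean
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


# Hypothetical data: observed values and what some model predicted for them
actual = [2, 4, 6, 8]
predicted = [2.1, 3.9, 6.2, 7.8]

r2 = r_squared(actual, predicted)
print(round(r2, 3))  # 0.995
```

Here the leftover squared error is 0.10 and the total variation is 20, so $r^2 = 1 - 0.10/20 = 0.995$. Some calculators may report $r^2$ for exponential and power regressions on log-transformed data, so a calculator's value can differ slightly from one computed this way.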

Here is a data set showing a city's population (in thousands) over several decades:

| Year (since 1960) | Population (thousands) |
|---|---|
| 0 | 12 |
| 10 | 19 |
| 20 | 30 |
| 30 | 47 |
| 40 | 74 |
| 50 | 116 |

The data grows faster and faster, which suggests an exponential model. Enter the years in L1 and the populations in L2. Run ExpReg on your calculator.

Fitting an exponential model to population data
$y = ab^x$
This is the form the calculator uses for exponential regression. You're looking for the values of $a$ and $b$.
$a \approx 12.03, \quad b \approx 1.046$
These are the values the calculator returns. The base $b$ tells you the growth factor per year.
$y = 12.03(1.046)^x$
Write the full equation by substituting the values back into the model form.
$r^2 \approx 0.9998$
This is extremely close to 1, which means the exponential model fits this data almost perfectly.

The base $1.046$ means the population grows by about 4.6% per year, since $1.046 = 1 + 0.046$. That interpretation comes directly from the model; it's not a separate calculation.
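The exponential fit can also be sketched in Python. This version assumes the regression is done by fitting a straight line to $\ln y$ against $x$, which is how exponential regression is commonly implemented; treat it as an approximation of what ExpReg does, not a guaranteed match. The helper name `linear_fit` is hypothetical.

```python
import math

# Population data from the table: years since 1960, population in thousands
years = [0, 10, 20, 30, 40, 50]
population = [12, 19, 30, 47, 74, 116]


def linear_fit(us, vs):
    """Least-squares line v = slope * u + intercept (hypothetical helper)."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    suu = sum((u - mean_u) ** 2 for u in us)
    suv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    slope = suv / suu
    return slope, mean_v - slope * mean_u


# y = a * b^x becomes ln(y) = ln(a) + x * ln(b), a line in x
slope, intercept = linear_fit(years, [math.log(p) for p in population])
a = math.exp(intercept)  # starting value
b = math.exp(slope)      # growth factor per year

print(f"y = {a:.2f} * {b:.4f}^x")
print(f"growth rate per year: {(b - 1) * 100:.1f}%")
```

The output should land close to the coefficients quoted above, though the last digits may differ with rounding and method. The growth rate is simply $b - 1$ written as a percent.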

Now use the model to make a prediction. Estimate the population 60 years after 1960, meaning in 2020.

Using the model to predict population in 2020
$x = 60$
60 years after 1960 is 2020. Plug that in for $x$.
$y = 12.03(1.046)^{60}$
Substitute $x = 60$ into the regression equation.
$y \approx 12.03(14.86)$
1.046 raised to the 60th power is approximately 14.86. Use your calculator for this.
$y \approx 179 \text{ thousand} \; \checkmark$
The model predicts about 179,000 people in 2020. This is an extrapolation: the prediction goes beyond the data we used to build the model.
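Evaluating the rounded regression equation directly is a quick way to check the arithmetic in a prediction like this one. The sketch below uses the rounded coefficients $a = 12.03$ and $b = 1.046$ from the fit; the variable names are illustrative.

```python
# Rounded coefficients from the exponential regression above
a = 12.03
b = 1.046

x = 60                 # years after 1960, i.e. the year 2020
last_observed_x = 50   # largest x value in the data table

growth = b ** x            # total growth factor over 60 years
prediction = a * growth    # predicted population in thousands

print(f"{b}^{x} = {growth:.2f}")
print(f"predicted population: {prediction:.1f} thousand")

# A prediction outside the range of the data is an extrapolation
kind = "extrapolation" if x > last_observed_x else "interpolation"
print(kind)
```

Small changes in $b$ matter a lot here: because the exponent is 60, rounding the base even in the third decimal place shifts the prediction by several thousand people.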

Not every curved data set is exponential. Consider this data showing the height of a ball (in feet) at various times after being thrown:

| Time (seconds) | Height (feet) |
|---|---|
| 0 | 4 |
| 1 | 28 |
| 2 | 36 |
| 3 | 28 |
| 4 | 4 |

This data rises and then falls, which is the signature of a quadratic model. Run QuadReg on your calculator.

Fitting a quadratic model to height data
$y = ax^2 + bx + c$
Quadratic regression gives you three constants. The negative $a$ value you'll find confirms the parabola opens downward.
$a = -8, \quad b = 32, \quad c = 4$
These are exact values here because this data was generated from a perfect quadratic. In real data you'd get approximations.
$y = -8x^2 + 32x + 4$
Write the full equation. The negative leading coefficient confirms the ball goes up and comes back down.
$r^2 = 1.000 \; \checkmark$
A perfect fit, which makes sense since the data came from a quadratic relationship.
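For comparison, here is a minimal Python sketch of a least-squares quadratic fit on the same height data. It solves the three normal equations with Cramer's rule, which is fine for a 3-by-3 system; this is one way to do it, not necessarily how QuadReg works internally. The helper names `det3` and `quad_fit` are hypothetical.

```python
# Height data from the table: time in seconds, height in feet
times = [0, 1, 2, 3, 4]
heights = [4, 28, 36, 28, 4]


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def quad_fit(xs, ys):
    """Least-squares a, b, c for y = a*x^2 + b*x + c via the normal equations."""
    s = [sum(x ** k for x in xs) for k in range(5)]                # sums of x^k
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]  # sums of y*x^k
    matrix = [[s[4], s[3], s[2]],
              [s[3], s[2], s[1]],
              [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    d = det3(matrix)
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column with the right-hand side
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        coeffs.append(det3(replaced) / d)
    return tuple(coeffs)


a, b, c = quad_fit(times, heights)
print(a, b, c)

# The vertex of the parabola gives the time and height of the peak
peak_time = -b / (2 * a)
peak_height = a * peak_time ** 2 + b * peak_time + c
print(peak_time, peak_height)
```

The fit recovers $a = -8$, $b = 32$, $c = 4$, and the vertex formula $x = -\frac{b}{2a}$ places the peak at 2 seconds and 36 feet, matching the table.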

Here is a visualization of all three model types so you can see how their shapes differ:

[Interactive graph: exponential (red), quadratic (blue), and power (green) model curves]

The red curve is exponential — it accelerates upward without bound. The blue curve is quadratic — it rises, peaks, and falls. The green curve is a power model — it grows quickly at first and then levels off relative to the exponential.

Choosing between models requires both visual judgment and $r^2$ comparison. If two models give similar $r^2$ values, consider the context. A population that's been growing steadily is more likely exponential than quadratic: every quadratic has a turning point, so a quadratic model implies the population either was shrinking before the data began or will eventually start shrinking, which rarely happens on its own.
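The sketch below runs that comparison on the city population data: it fits both an exponential and a quadratic model, computes $r^2$ for each in the original units, and locates the quadratic's turning point. It assumes a log-linear exponential fit and a normal-equations quadratic fit, so the numbers may not match a calculator digit for digit. All helper names are hypothetical.

```python
import math

years = [0, 10, 20, 30, 40, 50]
population = [12, 19, 30, 47, 74, 116]


def r_squared(actual, predicted):
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


def exp_fit(xs, ys):
    """Fit y = a * b^x by least squares on ln(y)."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    slope = (sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
             / sum((x - mean_x) ** 2 for x in xs))
    return math.exp(mean_l - slope * mean_x), math.exp(slope)


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def quad_fit(xs, ys):
    """Fit y = a*x^2 + b*x + c via the normal equations and Cramer's rule."""
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    matrix = [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    d = det3(matrix)
    coeffs = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        coeffs.append(det3(replaced) / d)
    return tuple(coeffs)


a_exp, b_exp = exp_fit(years, population)
a_q, b_q, c_q = quad_fit(years, population)

r2_exp = r_squared(population, [a_exp * b_exp ** x for x in years])
r2_quad = r_squared(population, [a_q * x * x + b_q * x + c_q for x in years])

# Where the fitted parabola changes direction
vertex_x = -b_q / (2 * a_q)

print(f"exponential r^2: {r2_exp:.5f}")
print(f"quadratic   r^2: {r2_quad:.5f}")
print(f"quadratic turning point at x = {vertex_x:.1f}")
```

Both models score above 0.99, so $r^2$ alone does not settle the choice. The fitted parabola opens upward with its turning point close to the start of the data, which would mean the population had been falling just before the records begin. Nothing in the context supports that, so the exponential model is the more sensible pick.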

Practice Questions
Given $y = ab^x$ with $a = 5.2$ and $b = 1.08$, find $y$ when $x = 10$.
Data: $(1, 3),\ (2, 12),\ (3, 27),\ (4, 48)$. Identify the best model type.
$r^2 = 0.73$ (exponential) vs. $r^2 = 0.98$ (power). Which model fits better?
Regents Corner

On Part II and Part III of the Algebra II Regents, regression problems almost always ask you to write the equation, use it to make a prediction, and state what a coefficient means in context. Leaving out any of these three parts costs points. A complete answer names the regression type, writes the full equation with rounded coefficients, performs the prediction calculation, and interprets the result in the units of the original problem.

Students run a regression, get an r-squared value close to 1, and assume the model is correct without checking whether the model type makes sense. A quadratic regression on population data might give r-squared of 0.99 over a 30-year window, but every parabola turns around: outside that window the model predicts either a population that was shrinking before the data began or one that eventually falls and turns negative, which is unrealistic. Always ask whether the model's behavior beyond the data makes physical sense, not just whether it fits the data you have.