import React, { useRef, useState, useEffect } from "react";
import {
  ArrowDown,
  Check,
  BookOpen,
  Target,
  BarChart3,
  Calculator,
  Layers,
  TrendingUp,
} from "lucide-react";
import FrequencyMeanWidget from "../../../components/lessons/FrequencyMeanWidget";
import HistogramBuilderWidget from "../../../components/lessons/HistogramBuilderWidget";
import BoxPlotAnatomyWidget from "../../../components/lessons/BoxPlotAnatomyWidget";
import BoxPlotComparisonWidget from "../../../components/lessons/BoxPlotComparisonWidget";
import DataModifierWidget from "../../../components/lessons/DataModifierWidget";
import Quiz from "../../../components/lessons/Quiz";
import {
  DATA_REP_QUIZ_DATA,
  CENTER_SPREAD_QUIZ_DATA,
  DISTRIBUTIONS_QUIZ_DATA,
} from "../../../utils/constants";
import { Frac } from "../../../components/Math";

interface LessonProps {
  onFinish?: () => void;
}

const DataRepresentationLesson: React.FC<LessonProps> = ({ onFinish }) => {
  // Index of the lesson section currently in view.
  const [activeSection, setActiveSection] = useState(0);
  // One entry per section element, filled in by each section's ref callback.
  const sectionsRef = useRef<(HTMLElement | null)[]>([]);

  const scrollToSection = (index: number) => {
    setActiveSection(index);
    sectionsRef.current[index]?.scrollIntoView({
      behavior: "smooth",
      block: "start",
    });
  };

  // Track which section is on screen and update the active marker to match.
  useEffect(() => {
    const observer = new IntersectionObserver(
      (entries) => {
        entries.forEach((entry) => {
          if (entry.isIntersecting) {
            const index = sectionsRef.current.indexOf(
              entry.target as HTMLElement,
            );
            if (index !== -1) {
              setActiveSection(index);
            }
          }
        });
      },
      { rootMargin: "-20% 0px -60% 0px" },
    );
    sectionsRef.current.forEach((section) => {
      if (section) observer.observe(section);
    });
    return () => observer.disconnect();
  }, []);

  const SectionMarker = ({
    index,
    title,
    icon: Icon,
  }: {
    index: number;
    title: string;
    icon: any;
  }) => {
    const isActive = activeSection === index;
    const isPast = activeSection > index;
    return (
    );
  };

  const allQuizzes = [
    ...DATA_REP_QUIZ_DATA,
    ...CENTER_SPREAD_QUIZ_DATA,
    ...DISTRIBUTIONS_QUIZ_DATA,
  ];

  return (
{/* Section 1: Frequency & Mean */}
{ sectionsRef.current[0] = el; }} className="min-h-screen flex flex-col justify-center mb-24 pt-20 lg:pt-0" >

Frequency Tables & Weighted Mean

When calculating the mean from a table, you must use a{" "} weighted mean. Simply averaging the values in the first column is a common trap! You must multiply each value by its frequency first.

The Weighted Mean Formula

Weighted Mean ={" "} Σ(value × frequency) / Σ(frequency)

For each row, multiply the value by its frequency count. Sum all those products, then divide by the total number of data points (the sum of all frequencies).

Common SAT Trap: Do NOT average the values in the first column directly. That ignores how many times each value appears and will almost always give the wrong answer.
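
The weighted-mean rule above can be sketched in TypeScript. This is a minimal illustration, not the logic of `FrequencyMeanWidget`; the `FrequencyRow` type, the `weightedMean` helper, and the sample table are assumptions made up for this example.

```typescript
// Hypothetical row shape for a frequency table (not taken from the lesson's widgets).
interface FrequencyRow {
  value: number;
  frequency: number;
}

// Sum of (value × frequency) divided by the sum of the frequencies.
function weightedMean(rows: FrequencyRow[]): number {
  const totalCount = rows.reduce((sum, row) => sum + row.frequency, 0);
  if (totalCount === 0) {
    throw new Error("Frequency table has no data points");
  }
  const weightedSum = rows.reduce(
    (sum, row) => sum + row.value * row.frequency,
    0,
  );
  return weightedSum / totalCount;
}

// Illustrative data, invented for this sketch.
const sampleTable: FrequencyRow[] = [
  { value: 1, frequency: 2 },
  { value: 2, frequency: 5 },
  { value: 3, frequency: 3 },
];

// The trap: averaging the first column ignores the frequencies.
const naiveColumnAverage = (1 + 2 + 3) / 3;

// The correct approach: (1×2 + 2×5 + 3×3) / (2 + 5 + 3) = 21 / 10.
const correctMean = weightedMean(sampleTable);
console.log({ naiveColumnAverage, correctMean });
```

For this made-up table the first-column average is 2, while the weighted mean is 21 / 10 = 2.1, so the two approaches disagree even on a small example.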

{/* Section 2: Histograms */}
{ sectionsRef.current[1] = el; }} className="min-h-screen flex flex-col justify-center mb-24" >

Histograms

Histograms group data into bins (intervals). Each bar covers a range of values, and the height of the bar tells you how many data points fall in that range. All bins have equal width on the SAT.

Key Histogram Concepts

Frequency (Count)

The raw number of data points in a bin. Read directly from the y-axis when it is labeled "Frequency" or "Count".

Relative Frequency (Percent)

Each bin's count divided by the total number of data points. Formula:{" "} Relative Frequency = bin count / total count.

SAT Trick: Count vs. Percent Switch

The SAT frequently presents a histogram in one form (frequency) and asks a question that requires the other (relative frequency). Always check the y-axis label carefully.
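
The count-to-percent switch described above can be sketched as follows. The `relativeFrequencies` helper and the bin counts are hypothetical and are not part of `HistogramBuilderWidget`.

```typescript
// Convert raw bin counts into relative frequencies (fractions of the total).
function relativeFrequencies(binCounts: number[]): number[] {
  const total = binCounts.reduce((sum, count) => sum + count, 0);
  if (total === 0) {
    throw new Error("Histogram has no data points");
  }
  return binCounts.map((count) => count / total);
}

// Illustrative counts for three equal-width bins (invented for this sketch).
const exampleCounts = [4, 10, 6]; // 20 data points in total
const exampleShares = relativeFrequencies(exampleCounts);

// Multiply by 100 to express each share as a percent.
const examplePercents = exampleShares.map((share) => share * 100);
console.log(examplePercents);
```

A question that shows counts but asks "what percent of values fall in the second bin?" corresponds to reading `examplePercents[1]` rather than `exampleCounts[1]`.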

{/* Section 3: Box Plots */}
{ sectionsRef.current[2] = el; }} className="min-h-screen flex flex-col justify-center mb-24" >

Anatomy of a Box Plot

A Box Plot visualizes the 5-Number Summary: Min, Q1, Median, Q3, Max. The box itself represents the{" "} IQR (Interquartile Range), which contains the middle 50% of the data.

The 5-Number Summary

{[ { label: "Min", desc: "Smallest value (left whisker)" }, { label: "Q1", desc: "25th percentile (left edge of box)" }, { label: "Median", desc: "50th percentile (line inside box)" }, { label: "Q3", desc: "75th percentile (right edge of box)" }, { label: "Max", desc: "Largest value (right whisker)" }, ].map((item) => (

{item.label}

{item.desc}

))}

Interquartile Range (IQR)

The IQR measures the spread of the middle 50% of the data.

IQR = Q3 − Q1

Outlier Detection Rule

A data point is an outlier if it falls outside these boundaries:

Lower Bound: Q1 − 1.5 × IQR

Upper Bound: Q3 + 1.5 × IQR
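
As a rough sketch of the outlier rule above, the bounds can be computed directly from Q1 and Q3. The function names and the quartile values here are assumptions for illustration; the quartiles are taken as given rather than computed from raw data.

```typescript
interface OutlierBounds {
  iqr: number;
  lower: number;
  upper: number;
}

// Apply the 1.5 × IQR rule to a pair of quartiles.
function outlierBounds(q1: number, q3: number): OutlierBounds {
  const iqr = q3 - q1;
  return {
    iqr,
    lower: q1 - 1.5 * iqr,
    upper: q3 + 1.5 * iqr,
  };
}

// A point is an outlier only if it falls strictly outside the bounds.
function isOutlier(x: number, q1: number, q3: number): boolean {
  const { lower, upper } = outlierBounds(q1, q3);
  return x < lower || x > upper;
}

// Illustrative quartiles (invented): Q1 = 10, Q3 = 20, so IQR = 10.
const exampleBounds = outlierBounds(10, 20); // lower = -5, upper = 35
console.log(exampleBounds, isOutlier(40, 10, 20), isOutlier(35, 10, 20));
```

With these made-up quartiles, 40 lies above the upper bound of 35 and counts as an outlier, while 35 sits exactly on the bound and does not.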

{/* Section 4: Center & Spread (from CenterSpreadLesson) */}
{ sectionsRef.current[3] = el; }} className="min-h-screen flex flex-col justify-center mb-24" >

Measures of Center & Spread

The SAT regularly tests your ability to interpret center (where data clusters) and spread (how far data varies). Knowing which measures are resistant to outliers and which are sensitive to them is critical.

Measures of Center

Mean (Average)

Add all values, divide by how many there are. Sensitive to outliers — one extreme value pulls the mean.

Median

Middle value (sorted)

Sort the data. Middle value if odd count; average of middle two if even count. Resistant to outliers.

Mode

Most frequent value

The value that appears most often. A dataset can have no mode, one mode, or multiple modes. Rare on the SAT.

Worked Example: Mean vs. Median with an Outlier

Dataset: {"{"}5, 7, 8, 9, 10, 11, 95{"}"}

Sum = 5 + 7 + 8 + 9 + 10 + 11 + 95 = 145

Mean = 145 / 7 ≈{" "} 20.7 ← pulled toward 95

Median = 9 ← middle value, not affected by 95
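
The worked example above can be checked with a short sketch. The `mean` and `median` helpers are illustrative names written for this example, not functions from the lesson's codebase; the dataset is the one from the worked example.

```typescript
// Arithmetic mean: sum of the values divided by how many there are.
function mean(values: number[]): number {
  if (values.length === 0) throw new Error("Cannot average an empty dataset");
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Median: middle value of the sorted data, or the average of the middle two.
function median(values: number[]): number {
  if (values.length === 0) throw new Error("Cannot take the median of an empty dataset");
  const sorted = [...values].sort((a, b) => a - b); // copy so the input is not mutated
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Dataset from the worked example, with 95 as the outlier.
const withOutlier = [5, 7, 8, 9, 10, 11, 95];
console.log(mean(withOutlier).toFixed(1), median(withOutlier));
```

The mean works out to 145 / 7, which rounds to 20.7, while the median stays at 9, matching the figures in the worked example.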

Measures of Spread

Measure | Formula | Sensitive to Outliers? | Best Used With
Range | Max − Min | Yes, very sensitive | Simple comparisons
IQR | Q3 − Q1 | No, resistant | Median; skewed data
Standard Deviation | Avg distance from mean | Yes, sensitive | Mean; symmetric data

Skew Direction → Mean vs. Median

Right-Skewed (Positive Skew)

A long tail extends to the right. Mean is pulled right (larger) by high outliers.

Mean > Median > Mode

Left-Skewed (Negative Skew)

A long tail extends to the left. Mean is pulled left (smaller) by low outliers.

Mean < Median < Mode

{/* Section 5: Effects of Data Changes (from CenterSpreadLesson) */}
{ sectionsRef.current[4] = el; }} className="min-h-screen flex flex-col justify-center mb-24" >

Effects of Data Changes

The SAT loves questions that ask what happens to mean, median, and standard deviation after modifying data — without making you recalculate everything. Memorize these rules.

The Rules Table

Operation | Mean | Median | Std Dev / Range
Add constant k to every value | Mean + k | Median + k | No change
Multiply every value by k | Mean × k | Median × k | Multiplied by the absolute value of k (scales too)
Add a high outlier | Increases | Barely changes | Increases
Remove a high outlier | Decreases (toward center) | Barely changes | Decreases

Why Adding k Doesn't Change Spread

Standard deviation measures how far values are from the mean. If you add k to every value, the mean also shifts by k — so every distance from the mean stays the same.

Dataset: {"{"}2, 4, 6{"}"} → Mean = 4, SD ≈ 1.63

Add 10: {"{"}12, 14, 16{"}"} → Mean = 14, SD ≈ 1.63 (unchanged)

Multiply by 2: {"{"}4, 8, 12{"}"} → Mean = 8, SD ≈ 3.27 (doubled)
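
The shift and scale rules above can be verified numerically. This sketch assumes the population form of standard deviation (dividing by n), which is the form that reproduces the 1.63 and 3.27 figures quoted above; the helper name `populationSD` is hypothetical.

```typescript
// Population standard deviation: square root of the mean squared distance
// from the mean (divides by n, not n - 1).
function populationSD(values: number[]): number {
  if (values.length === 0) throw new Error("Cannot compute SD of an empty dataset");
  const avg = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance =
    values.reduce((sum, v) => sum + (v - avg) ** 2, 0) / values.length;
  return Math.sqrt(variance);
}

const original = [2, 4, 6];
const shifted = original.map((v) => v + 10); // add k = 10 to every value
const scaled = original.map((v) => v * 2); // multiply every value by k = 2

console.log(
  populationSD(original).toFixed(2),
  populationSD(shifted).toFixed(2),
  populationSD(scaled).toFixed(2),
);
```

Adding 10 leaves every distance from the mean unchanged, so the standard deviation is identical; multiplying by 2 doubles every distance, so the standard deviation doubles.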

Worked Example 1: What happens when an outlier is removed?

Dataset: {"{"}2, 3, 4, 5, 6, 50{"}"}. Mean ≈ 11.7, Median = 4.5

Remove 50: {"{"}2, 3, 4, 5, 6{"}"}. Mean = 4, Median = 4

Mean decreased significantly. Median barely changed.

Worked Example 2: Scores shifted

Teacher adds 5 bonus points to every student's score.

Class mean was 72 → new mean ={" "} 77

Class median was 74 → new median ={" "} 79

Standard deviation:{" "} no change (spread unchanged)

{/* Section 6: Comparing Distributions */}
{ sectionsRef.current[5] = el; }} className="min-h-screen flex flex-col justify-center mb-24" >

Comparing Distributions

When comparing two datasets, always address two dimensions:{" "} Center and Spread. The SAT often asks you to compare groups using both.

What to Compare

Center

  • Median — use when data is skewed or has outliers; it is resistant to extreme values.
  • Mean — use when data is roughly symmetric; it accounts for all values but is pulled by outliers.

Spread

  • IQR — measures spread of the middle 50%; resistant to outliers. Preferred with median.
  • Standard Deviation (SD) — measures average distance from the mean; sensitive to outliers. Preferred with mean.
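
One way to picture a two-dimension comparison is to compare medians for center and IQRs for spread. The sketch below is only an illustration under that assumption; the `GroupSummary` type, the `compareGroups` helper, and the two groups' numbers are invented and do not come from `BoxPlotComparisonWidget`.

```typescript
// Hypothetical summary of one group, as read off a box plot.
interface GroupSummary {
  label: string;
  q1: number;
  median: number;
  q3: number;
}

interface Comparison {
  higherCenter: string | null; // label of the group with the larger median, null on a tie
  greaterSpread: string | null; // label of the group with the larger IQR, null on a tie
}

function interquartileRange(group: GroupSummary): number {
  return group.q3 - group.q1;
}

// Compare two groups on center (median) and spread (IQR).
function compareGroups(a: GroupSummary, b: GroupSummary): Comparison {
  const pick = (x: number, y: number): string | null =>
    x > y ? a.label : y > x ? b.label : null;
  return {
    higherCenter: pick(a.median, b.median),
    greaterSpread: pick(interquartileRange(a), interquartileRange(b)),
  };
}

// Illustrative groups (made-up numbers).
const groupA: GroupSummary = { label: "A", q1: 60, median: 70, q3: 85 };
const groupB: GroupSummary = { label: "B", q1: 68, median: 75, q3: 80 };
const comparison = compareGroups(groupA, groupB);
console.log(comparison);
```

Here group B has the higher typical value (median 75 versus 70), while group A has the greater variability (IQR 25 versus 12), which shows why center and spread have to be addressed separately.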

SAT Language to Watch For

"Which group has greater variability?" → Compare IQR or SD (larger value = more spread out).
"Which group has a higher typical value?" → Compare medians or means.
"Are the distributions similar in shape?" → Look at whether both are symmetric, skewed left, or skewed right.

{/* Section 7: Quiz */}
{ sectionsRef.current[6] = el; }} className="min-h-screen flex flex-col justify-center" >

Practice Time

{allQuizzes.map((quiz, idx) => (
))}

Topic Mastered!

  );
};

export default DataRepresentationLesson;