The Nature of Statistics - Interactive Slides

The Nature of Statistics

An Introduction to Statistical Thinking

Dr. Samir Orujov

ADA University - School of Business

Fall 2025

Learning Objectives - Part 1

Statistics Basics

Define statistics and distinguish between descriptive and inferential statistics

Populations & Samples

Understand the relationship between populations, samples, and statistical inference

Study Types

Classify studies as descriptive vs. inferential and observational vs. experimental

Learning Objectives - Part 2

Sampling Methods

Master simple random, systematic, cluster, and stratified sampling techniques

Experimental Design

Apply principles of control, randomization, and replication in experiments

Statistical Reasoning

Develop critical thinking skills for analyzing statistical information

What is Statistics?

Two Common Definitions

Definition 1 (Plural): Facts or data, either numerical or nonnumerical, organized and summarized to provide useful information about a particular subject.

Definition 2 (Singular): The science of organizing and summarizing numerical or nonnumerical information.

Examples in Daily Life

  • Unemployment figures
  • Sports statistics
  • Election polls
  • Medical research data

Development of Statistics

Ancient Times: Roman censuses, birth/death records
1857-1936: Karl Pearson - Mathematical foundations
1890-1962: Ronald Fisher - Modern experimental design
Today: Applied across all fields of study

Descriptive Statistics

Definition

Descriptive statistics consists of methods for organizing and summarizing information.

The 1948 Baseball Season Example

The Washington Senators played 153 games, winning 56 and losing 97. They finished seventh in the American League and were led in hitting by Bud Stewart (.279 average).

Key Point: This summarizes what actually happened; no predictions or inferences are made.

Inferential Statistics

Definition

Inferential statistics consists of methods for drawing and measuring the reliability of conclusions about a population based on information obtained from a sample.

The 1948 Presidential Election

The Gallup Poll predicted Truman would win only 44.5% of the vote and lose to Thomas Dewey. However, Truman actually won more than 49% and became president.

Key Point: Using sample data to make predictions about entire populations.

Population vs. Sample

Population

The collection of ALL individuals or items under consideration

→

Sample

That part of the population from which information is obtained

Observational Studies vs. Designed Experiments

Observational Study

Researchers simply observe and take measurements

Data already exists

Can reveal: Association

Cannot establish: Causation

Designed Experiment

Researchers impose treatments and controls

Data created through intervention

Can reveal: Association

Can help establish: Causation

Example: Vasectomies and Prostate Cancer

The Study

Researchers found 113 cases of prostate cancer among 22,000 men who had a vasectomy, compared to 70 cases per 22,000 among men who didn't have a vasectomy.

Result: About 60% elevated risk of prostate cancer for men with vasectomies.
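
The "about 60%" figure can be checked with a short calculation. This is a minimal sketch using only the counts quoted above; the function name `relative_risk` is an illustrative choice, not something from the study.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    return risk_exposed / risk_unexposed


# Counts from the slide: 113 vs. 70 cases per 22,000 men
rr = relative_risk(113, 22_000, 70, 22_000)
excess_percent = (rr - 1) * 100
print(f"Relative risk: {rr:.2f}, excess risk: {excess_percent:.0f}%")
# → Relative risk: 1.61, excess risk: 61%
```

A relative risk near 1.6 describes an association only; by itself it says nothing about why the two groups differ.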

Analysis

Type: Observational Study

What it shows: Association between vasectomies and prostate cancer

What it cannot prove: That vasectomies cause prostate cancer

Why not: Other factors might influence both the decision to have a vasectomy and prostate cancer risk

Example: Folic Acid and Birth Defects

The Study

4,753 women were randomly divided into two groups before conception. One group took daily multivitamins containing 0.8 mg of folic acid, the other received only trace elements.

Result: Major birth defects occurred in 13 per 1000 women taking folic acid vs. 23 per 1000 in the control group.
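
The size of the difference can be expressed two ways: as an absolute difference per 1000 and as a relative reduction. The sketch below uses only the two rates quoted above; `risk_summary` is a hypothetical helper name.

```python
def risk_summary(rate_treated, rate_control):
    """Return (absolute difference, relative reduction) for two rates
    expressed on the same scale, here cases per 1000 women."""
    difference = rate_control - rate_treated
    relative_reduction = difference / rate_control
    return difference, relative_reduction


# Rates from the slide: 13 per 1000 (folic acid) vs. 23 per 1000 (control)
fewer_per_1000, rel_reduction = risk_summary(13, 23)
print(f"{fewer_per_1000} fewer cases per 1000, a {rel_reduction:.0%} relative reduction")
# → 10 fewer cases per 1000, a 43% relative reduction
```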

Analysis

Type: Designed Experiment

What it shows: Association between folic acid and reduced birth defects

What it can suggest: Folic acid may cause reduction in birth defects

Why stronger evidence: Random assignment tends to balance other factors across the two groups, so they are less likely to explain the difference

Simple Random Sampling

Definition

A simple random sample of size n from a population is a sample obtained in such a way that every collection of n members of the population has an equal chance of being selected.

How to Obtain a Simple Random Sample

  1. Number all members of the population
  2. Use a random number table or generator
  3. Select the required sample size
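
The three steps above can be sketched with a random number generator in place of a table. This is a minimal illustration, assuming members are numbered 1 to N; the function name and the seed value are arbitrary choices made here for reproducibility.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Draw a simple random sample of member numbers from 1..population_size.

    random.sample draws without replacement, so every set of
    `sample_size` members is equally likely to be chosen.
    """
    rng = random.Random(seed)                 # seed only makes the demo repeatable
    members = range(1, population_size + 1)   # step 1: number all members
    return sorted(rng.sample(members, sample_size))  # steps 2 and 3


# Example: sample 5 of 50 students
sample = simple_random_sample(50, 5, seed=2025)
print(sample)
```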

Using Random Number Tables

Example: Sample 5 from 50 students

Random numbers: 19223, 95034, 05756, 28713, 96409

Selected numbers (first two digits of each group, or the last two digits when the first two fall outside 01-50): 19, 34, 05, 28, 09

Selected students: #5, #9, #19, #28, #34

Systematic Random Sampling

Procedure

  1. Divide population size by sample size and round down: m = ⌊N/n⌋
  2. Randomly select a number k between 1 and m
  3. Select members numbered k, k+m, k+2m, ...

Example: Sample 15 from 728 students

Step 1: m = ⌊728/15⌋ = 48

Step 2: Randomly select k = 22

Step 3: Select students: 22, 70, 118, 166, 214, 262, ...
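
The procedure can be written out in a few lines. This is a sketch under the slide's own notation (N, n, m, k); the function name and the optional `start` argument are assumptions added so the worked example with k = 22 can be reproduced exactly.

```python
import random


def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Systematic random sample: every m-th member after a random start k."""
    m = population_size // sample_size           # step 1: m = floor(N / n)
    if start is None:
        start = random.Random(seed).randint(1, m)  # step 2: k between 1 and m
    return [start + i * m for i in range(sample_size)]  # step 3: k, k+m, k+2m, ...


# Example from the slide: N = 728, n = 15, k = 22
chosen = systematic_sample(728, 15, start=22)
print(chosen[:6])   # → [22, 70, 118, 166, 214, 262]
print(chosen[-1])   # → 694
```

Because k is at most m, the last selected number never exceeds N.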

Cluster Sampling

Procedure

  1. Divide population into groups (clusters)
  2. Obtain simple random sample of clusters
  3. Use ALL members from selected clusters

Example: Bike Path Survey in Tempe

City divided into 947 blocks (clusters), each with 20 homes. To sample 300 homes:

  • Need 15 clusters (300 ÷ 20 = 15)
  • Randomly select 15 of the 947 blocks
  • Survey ALL homes in selected blocks

Advantage: Reduced travel time for interviewers
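
A minimal sketch of the Tempe example follows. It assumes blocks are numbered 1 to 947 and homes 1 to 20 within each block, and that the target sample size divides evenly by the cluster size; the function name and seed are illustrative.

```python
import random


def cluster_sample(n_clusters, cluster_size, target_sample_size, seed=None):
    """Pick whole clusters at random, then take every member of each one."""
    clusters_needed = target_sample_size // cluster_size  # 300 // 20 = 15
    rng = random.Random(seed)
    # Simple random sample of cluster numbers
    chosen_blocks = sorted(rng.sample(range(1, n_clusters + 1), clusters_needed))
    # Survey ALL homes in each selected block
    homes = [(block, home)
             for block in chosen_blocks
             for home in range(1, cluster_size + 1)]
    return chosen_blocks, homes


blocks, homes = cluster_sample(947, 20, 300, seed=1)
print(len(blocks), "blocks,", len(homes), "homes")   # → 15 blocks, 300 homes
```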

Stratified Sampling

Procedure (Proportional Allocation)

  1. Divide population into subpopulations (strata)
  2. From each stratum, sample proportionally:
    Stratum sample size = (Total sample size) × (Stratum size) / (Population size)
  3. Combine all stratum samples
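
The allocation formula can be sketched as follows. The stratum names and sizes are hypothetical, and the rounding rule (round down, then give leftover units to the strata with the largest fractional parts) is an assumption made here so the stratum sizes always add up to the total; the slides do not specify one.

```python
def proportional_allocation(stratum_sizes, total_sample_size):
    """Stratum sample size = total sample size * stratum size / population size."""
    population_size = sum(stratum_sizes.values())
    exact = {name: total_sample_size * size / population_size
             for name, size in stratum_sizes.items()}
    # Round down first, then hand out any leftover units (assumed rounding rule)
    allocation = {name: int(value) for name, value in exact.items()}
    leftover = total_sample_size - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda name: exact[name] - allocation[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


# Hypothetical strata: class years in a population of 1000 students
strata = {"freshmen": 400, "sophomores": 300, "juniors": 200, "seniors": 100}
print(proportional_allocation(strata, 50))
# → {'freshmen': 20, 'sophomores': 15, 'juniors': 10, 'seniors': 5}
```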

Sampling Methods Comparison

Method        | How It Works                                 | Advantages                               | Disadvantages
Simple Random | Every sample equally likely                  | Unbiased, straightforward                | May be impractical for large populations
Systematic    | Select every mth member after a random start | Easy to implement                        | Cyclical patterns can bias results
Cluster       | Sample entire clusters                       | Cost-effective for scattered populations | Clusters may not represent population
Stratified    | Sample from each stratum                     | Ensures representation of subgroups      | Requires prior knowledge of strata

Principles of Experimental Design

Control

Two or more treatments should be compared to isolate the effect of the variable of interest.

Randomization

Experimental units should be randomly assigned to treatments to avoid selection bias.

Replication

Use sufficient experimental units to ensure reliable results and detect treatment differences.

Experimental Design Terminology

Experimental Unit

The individual or item on which the experiment is performed

Response Variable

The characteristic measured or observed as the outcome

Factor

A variable whose effect on the response variable is of interest

Levels

The possible values of a factor

Treatment

Each experimental condition (combination of factor levels)

Treatment Group

The group receiving the experimental treatment

Control Group

The group receiving a placebo or standard treatment

Completely Randomized Design

Definition

In a completely randomized design, all experimental units are assigned randomly among all treatments.

Example: Golf Ball Driving Distance

40 Golfers

↓ Random Assignment

Brand A: 8 golfers
Brand B: 8 golfers
Brand C: 8 golfers
Brand D: 8 golfers
Brand E: 8 golfers
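
One way to carry out this assignment is to shuffle the list of golfers and cut it into equal groups. The sketch below assumes the number of units divides evenly among the treatments; the golfer labels, function name, and seed are placeholders.

```python
import random


def completely_randomized_design(units, treatments, seed=None):
    """Randomly assign all units among all treatments in equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)                       # random order of all units
    group_size = len(shuffled) // len(treatments)
    return {treatment: shuffled[i * group_size:(i + 1) * group_size]
            for i, treatment in enumerate(treatments)}


golfers = [f"golfer_{i}" for i in range(1, 41)]   # 40 placeholder golfers
assignment = completely_randomized_design(golfers, ["A", "B", "C", "D", "E"], seed=7)
for brand, group in assignment.items():
    print(brand, len(group))                    # each brand gets 8 golfers
```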

Randomized Block Design

Definition

In a randomized block design, experimental units are assigned randomly among treatments separately within each block.

Example: Golf Ball Distance by Gender

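Block 1: Men (20 golfers)

Brand A: 4 golfers
Brand B: 4 golfers
Brand C: 4 golfers
Brand D: 4 golfers
Brand E: 4 golfers

Block 2: Women (20 golfers)

Brand A: 4 golfers
Brand B: 4 golfers
Brand C: 4 golfers
Brand D: 4 golfers
Brand E: 4 golfers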

Advantage: Controls for gender differences that might affect driving distance
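
The blocked version repeats the random assignment separately inside each block. This is a sketch under the same assumptions as the example (two blocks of 20, five brands); the labels, function name, and seed are placeholders.

```python
import random


def randomized_block_design(blocks, treatments, seed=None):
    """Within each block, randomly assign units among treatments in equal groups."""
    rng = random.Random(seed)
    design = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)                   # randomize within this block only
        group_size = len(shuffled) // len(treatments)
        design[block_name] = {
            treatment: shuffled[i * group_size:(i + 1) * group_size]
            for i, treatment in enumerate(treatments)
        }
    return design


blocks = {
    "men": [f"man_{i}" for i in range(1, 21)],
    "women": [f"woman_{i}" for i in range(1, 21)],
}
design = randomized_block_design(blocks, ["A", "B", "C", "D", "E"], seed=7)
print({block: {t: len(g) for t, g in groups.items()}
       for block, groups in design.items()})    # 4 golfers per brand in each block
```

Every brand is tested by the same number of men and women, so a difference between brands cannot be explained by one brand having been assigned mostly men.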

Practice Activity

Statistical Study Classification - 3 minutes

Scenario: A researcher wants to test whether a new teaching method improves student performance. She randomly assigns 200 students to either the new method or traditional method, then compares their test scores.

Questions to consider:

  1. Is this descriptive or inferential?
  2. Is this observational or experimental?
  3. What is the population?
  4. What is the sample?
  5. What are the treatments?

Case Study: AFI Top Films of All Time

The Study

The American Film Institute (AFI) conducted a poll of 1,500 film artists, critics, and historians, asking them to pick their 100 favorite films from a list of 400 films made between 1915 and 2005.

Top 5 Results

  1. Citizen Kane (1941)
  2. The Godfather (1972)
  3. Casablanca (1942)
  4. Raging Bull (1980)
  5. Singin' in the Rain (1952)

Analysis Questions

1. What is the population?

All film artists, critics, and historians who could potentially judge films

2. What is the sample?

The 1,500 individuals who participated in the poll

3. Is the study descriptive or inferential?

Could be either, depending on the statement made about the results

Quiz Question 1

Descriptive vs. Inferential Statistics

A researcher surveys 500 college students about their study habits and reports that 65% of surveyed students study more than 2 hours per day. This is an example of:

Quiz Question 2

Sampling Methods

A pollster wants to survey voters in a city. She divides the city into neighborhoods, randomly selects 10 neighborhoods, and surveys ALL voters in those selected neighborhoods. This is:

Quiz Question 3

Experimental Design

Which of the following is NOT one of the three basic principles of experimental design?

Key Concepts Review

Types of Statistics

Descriptive: Organizing and summarizing data
Inferential: Drawing conclusions about populations from samples

Study Types

Observational: Can show association only
Experimental: Can help establish causation

Sampling Methods

Simple Random: Every sample equally likely
Systematic: Every mth member selected after a random start
Cluster: Random selection of entire groups
Stratified: Sample from each subgroup

Statistical Decision Making Framework

1. Define the Problem

What question are you trying to answer?

↓

2. Determine Study Type

Observational: Observe existing data
Experimental: Manipulate conditions

↓

3. Choose Sampling Method

  • Simple Random
  • Systematic
  • Cluster
  • Stratified

↓

4. Collect and Analyze Data

Apply appropriate statistical methods

↓

5. Draw Conclusions

Interpret results within the context of the study's limitations

Common Statistical Mistakes to Avoid

Confusing Association with Causation

Just because two variables are related doesn't mean one causes the other.

Example: Ice cream sales and drowning deaths both increase in summer, but ice cream doesn't cause drowning.

Sampling Bias

When the sample doesn't represent the population of interest.

Example: Surveying only people with landlines in the age of cell phones.

Small Sample Sizes

Conclusions based on too few observations may not be reliable.

Rule of thumb: Larger samples generally provide more reliable results.

Misinterpreting Statistics

Reporting or reading a statistic without understanding what it actually measures and what its limitations are.

Example: The "average" (mean) may not represent a typical value when the data contain extreme outliers.
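
A small made-up data set shows the effect. The salary figures below are hypothetical and chosen only to include one extreme value.

```python
from statistics import mean, median

# Hypothetical salaries: five typical values and one extreme outlier
salaries = [42_000, 45_000, 47_000, 50_000, 52_000, 1_000_000]

avg = mean(salaries)       # pulled far upward by the single outlier
middle = median(salaries)  # midpoint of the two central values

print(f"Mean:   {avg:,.0f}")     # → Mean:   206,000
print(f"Median: {middle:,.0f}")  # → Median: 48,500
```

Five of the six salaries are below 53,000, yet the mean is 206,000; the median stays close to what most people in the list earn.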

Real-World Applications

Business & Economics

  • Market research and consumer behavior
  • Quality control in manufacturing
  • Financial risk assessment
  • A/B testing for websites

Healthcare & Medicine

  • Clinical trial design
  • Epidemiological studies
  • Drug effectiveness testing
  • Public health surveillance

Social Sciences

  • Political polling and elections
  • Educational assessment
  • Psychological research
  • Social policy evaluation

Technology & Data Science

  • Machine learning algorithms
  • User experience optimization
  • Predictive analytics
  • Big data analysis

Ethics in Statistical Practice

Informed Consent

Participants should understand the study's purpose, procedures, and potential risks before agreeing to participate.

Confidentiality

Protect participant privacy and ensure data security. Aggregate data when possible to prevent identification.

Honest Reporting

Report results accurately, including limitations and potential biases. Don't cherry-pick favorable results.

Avoiding Harm

Consider the potential impact of research on participants and society. Weigh benefits against risks.

Case Study: Misleading Statistics

A company reports "90% customer satisfaction" but only surveyed customers who made repeat purchases, ignoring dissatisfied customers who never returned.

Ethical issue: Misleading representation through biased sampling

The Future of Statistics

Summary and Key Takeaways

Statistics is Everywhere

Statistical thinking is essential for making informed decisions in all areas of life

Two Main Branches

Descriptive statistics summarize data; inferential statistics help us make predictions and decisions

Research Methods Matter

The type of study (observational vs. experimental) determines what conclusions we can draw

Sampling is Crucial

Good sampling methods are essential for reliable statistical inference

Ethics and Responsibility

Statistical practice requires integrity, honesty, and consideration of societal impact

What's Next?

In upcoming lectures, we'll dive deeper into:

  • Data visualization techniques
  • Measures of central tendency and variability
  • Probability theory foundations
  • Statistical inference methods

Interactive Review

Quick Concept Check

Test your understanding of today's key concepts:

Match the Study Type

Scenarios:

  • Researchers observe smoking habits and lung cancer rates
  • Scientists randomly assign patients to receive either a new drug or placebo
  • A report shows the average SAT scores by state
  • A poll predicts election results based on a sample of voters

Categories:

  • Observational Study
  • Designed Experiment
  • Descriptive Statistics
  • Inferential Statistics

Thank You!

The Nature of Statistics - Complete

Statistical Foundations Mastered

Ready for Advanced Statistical Methods

Next Steps in Your Statistical Journey

Practice

Complete textbook exercises 1.1-1.4

Apply

Find examples of statistics in current news

Prepare

Review for upcoming data visualization lecture

Connect

Join the discussion forum for Q&A

Remember: Statistical literacy is not just about numbers - it's about making better decisions based on evidence!

Questions & Discussion

Feel free to reach out during office hours or via email for any clarifications!

Office Hours: D312, by appointment

Email: sorujov@ada.edu.az