Geometric Distribution: Definition, Properties and Applications

The geometric distribution is a discrete probability distribution that gives the probability of observing a given number of failures in repeated Bernoulli trials before the first success. A Bernoulli trial is an experiment with only two possible outcomes: success or failure. In the geometric setting, the trial is repeated until the first success occurs.

More precisely, the geometric distribution models the number of trials needed for the first success in a sequence of independent Bernoulli trials with a constant probability of success. It is fundamental in probability theory and appears in real-world applications such as quality control, reliability testing, and sports analytics.

In this article, we will explore the definition, formula, properties, and practical applications of this distribution.

 

What is Geometric Distribution?

The geometric distribution describes the probability of the first success occurring on the k-th trial in a sequence of independent Bernoulli trials, where each trial has only two possible outcomes: success or failure. It is a discrete probability distribution.

Types

There are two common variations:

  1. Trials-Based (or Shifted): The number of trials required until the first success, including the success trial.
  2. Failures-Based: The number of failures before the first success.

Geometric Distribution Formula

 
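A discrete random variable X is said to have a geometric distribution (failures-based form) if its probability mass function is

    \[ f(x;p) = p q^{x}, \quad x = 0, 1, 2, \ldots \]

where p, the probability of success on each trial, is the only parameter of the distribution and satisfies 0 < p ≤ 1, and q = 1 - p is the probability of failure. For the trials-based form, which counts the trial on which the first success occurs, the probability mass function is

    \[ P(Y=k) = p q^{k-1}, \quad k = 1, 2, 3, \ldots \]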
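
As a minimal sketch, the two forms of the probability mass function can be written in Python as follows. The function names geometric_pmf_failures and geometric_pmf_trials are chosen here for illustration and do not come from any library.

```python
def geometric_pmf_failures(x: int, p: float) -> float:
    """P(X = x), where X counts failures before the first success (x = 0, 1, 2, ...)."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    if x < 0:
        return 0.0
    return p * (1 - p) ** x


def geometric_pmf_trials(k: int, p: float) -> float:
    """P(Y = k), where Y counts trials up to and including the first success (k = 1, 2, ...)."""
    # Y = X + 1, so the trials-based PMF is the failures-based PMF shifted by one.
    return geometric_pmf_failures(k - 1, p)


# The probabilities over a long enough range should add up to (almost) 1.
total = sum(geometric_pmf_failures(x, 0.2) for x in range(200))
print(total)
```

The two functions describe the same experiment; they differ only in whether the successful trial itself is counted.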
 
 

Assumptions

There are three main assumptions.

  • The trials must be independent. 
  • Each experiment can only result in one of two outcomes: success or failure. 
  • The probability of success, denoted by p, is the same for every trial.
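
These assumptions translate directly into a simulation: repeat independent trials with the same success probability until the first success, and count the failures along the way. The sketch below is illustrative only; the function name simulate_failures_before_success, the seed, the value p = 0.2 and the sample size are assumptions made for the example.

```python
import random


def simulate_failures_before_success(p: float, rng: random.Random) -> int:
    """Run independent Bernoulli(p) trials and count failures before the first success."""
    failures = 0
    # rng.random() is uniform on [0, 1), so a trial succeeds with probability p.
    while rng.random() >= p:
        failures += 1
    return failures


rng = random.Random(0)  # fixed seed so the run is reproducible
p = 0.2
samples = [simulate_failures_before_success(p, rng) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)

# The sample mean should land close to the theoretical value (1 - p) / p = 4.
print(sample_mean)
```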

[Figure: Geometric distribution curve]

Properties of Geometric Distribution

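  • The geometric distribution has the memoryless (lack of memory) property: \( P(X \ge m+n \mid X \ge m) = P(X \ge n) \).
  • The mean of the failures-based geometric distribution is \( q/p \); for the trials-based form it is \( 1/p \).
  • The variance is \( q/p^{2} \) in both forms.
  • The moment generating function of the failures-based form is \( M_X(t) = p/(1 - q e^{t}) \), valid for \( q e^{t} < 1 \).
  • For 0 < p < 1, the mean of the failures-based form is smaller than its variance, since \( q/p^{2} > q/p \).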
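
The properties listed above can be checked numerically for the failures-based form. The following is a rough sketch, assuming p = 0.4 and truncating the infinite sums at 500 terms (both choices are arbitrary; the truncated tail is negligible here).

```python
import math

p = 0.4
q = 1 - p
N = 500  # truncation point for the infinite sums

# Failures-based PMF: P(X = x) = p * q**x for x = 0, 1, 2, ...
pmf = [p * q**x for x in range(N)]

mean = sum(x * f for x, f in enumerate(pmf))
variance = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))


def tail(k: int) -> float:
    """P(X >= k), computed by summing the (truncated) PMF."""
    return sum(pmf[k:])


# Memoryless property: P(X >= m + n | X >= m) equals P(X >= n).
m, n = 3, 5
conditional = tail(m + n) / tail(m)
unconditional = tail(n)

print(mean, q / p)            # the two values should agree closely
print(variance, q / p**2)     # the two values should agree closely
print(conditional, unconditional)
```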

Applications

The geometric distribution is applied in various real-world scenarios, including:

  1. Quality Control & Reliability Engineering: Estimating the number of defective products before finding a non-defective one.
  2. Customer Service & Call Centers: Modeling the number of calls before reaching a successful resolution.
  3. Biology & Medicine: Determining the number of trials before a specific event occurs, such as gene mutations or disease outbreaks.
  4. Sports Analytics: Analyzing the number of attempts before a player scores a goal or a team wins a match.
  5. Marketing & Business: Studying customer interactions before making a successful sale.

Binomial Distribution vs. Geometric Distribution

Both the binomial distribution and geometric distribution are discrete probability distributions based on Bernoulli trials, but they differ in their focus and application:

| Feature | Binomial Distribution | Geometric Distribution |
|---|---|---|
| Definition | Models the number of successes in a fixed number of trials | Models the number of trials until the first success |
| Random Variable | Number of successes in n trials | Number of trials required for the first success |
| Probability Mass Function (PMF) | \( P(X=k) = \binom{n}{k} p^{k} q^{n-k},\ k = 0, 1, \ldots, n \) | \( P(X=k) = p q^{k-1},\ k = 1, 2, \ldots \) |
| Parameters | n (number of trials), p (probability of success) | p (probability of success) |
| Example | The number of heads in 10 coin flips | The number of flips until the first heads |
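
To make the contrast concrete, the coin-flip example can be evaluated for both distributions. This is a small sketch assuming a fair coin (p = 0.5); the function names binomial_pmf and geometric_trials_pmf are illustrative.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def geometric_trials_pmf(k: int, p: float) -> float:
    """P(first success occurs on trial k), for k = 1, 2, ..."""
    return p * (1 - p) ** (k - 1)


p = 0.5
# Binomial question: exactly 3 heads in 10 flips.
prob_three_heads_in_ten = binomial_pmf(3, 10, p)
# Geometric question: the first head appears on the 3rd flip.
prob_first_head_on_third = geometric_trials_pmf(3, p)

print(prob_three_heads_in_ten)   # 120/1024 = 0.1171875
print(prob_first_head_on_third)  # 0.125
```

The binomial question fixes the number of flips and counts heads; the geometric question fixes the outcome (the first head) and counts flips.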

The geometric distribution is a crucial probability distribution used to model scenarios where we wait for the first success in a sequence of independent trials. Its memoryless property makes it unique and applicable in various industries, from engineering to sports analytics. Understanding its formula, properties, and real-world applications allows data analysts, statisticians, and business professionals to make informed decisions based on probabilistic modeling.

Problem 1: If a patient is waiting for a suitable blood donor and the probability that the selected donor will be a match is 0.2, then find the expected number of donors who will be tested till a match is found including the matched donor.

Solution:

Given,
p = 0.2
E[X] = 1 / p 
= 1 / 0.2 
= 5

The expected number of donors who will be tested until a match is found, including the matched donor, is 5.

Problem 2: If the probability of breaking the pot in the pool is 0.4, find the expected number of breaks until the first success (counting the successful one) and the corresponding variance and standard deviation.

Solution:

Here,
X ∼ geo(0.4)

Hence,
E[X] = 1/0.4 = 2.5
Var(X) = 0.6/0.4²
= 3.75

Hence, standard deviation (σ) = √3.75 ≈ 1.94
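
The arithmetic in both problems can be double-checked with a few lines of Python. This is only a sketch; the helper names trials_mean and geometric_variance are made up for the example, and the mean uses the trials-based convention (1/p), as in the solutions above.

```python
import math


def trials_mean(p: float) -> float:
    """Expected number of trials up to and including the first success."""
    return 1 / p


def geometric_variance(p: float) -> float:
    """Variance of the geometric distribution, q / p**2 (same for both forms)."""
    return (1 - p) / p**2


# Problem 1: probability of a donor match is 0.2.
expected_donors = trials_mean(0.2)

# Problem 2: probability of success is 0.4.
expected_breaks = trials_mean(0.4)
variance_breaks = geometric_variance(0.4)
std_dev_breaks = math.sqrt(variance_breaks)

print(expected_donors)
print(expected_breaks, variance_breaks, round(std_dev_breaks, 2))
```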
