Discrete Vs Continuous Probability Distribution

Statistics is a critical milestone on the way to understanding data science. In this article, written by our student Jeeva as part of our student article initiative, we discuss the difference between continuous and discrete probability distributions. Before diving into the topic, we encourage all aspiring data scientists who are looking for a full-stack data science course in Bangalore to check out our module. Now let’s jump right into the topic.

Random Variables

One might be aware of the famous equation of physics F = ma, where F = force, m = mass, and a = acceleration. In this equation, F and a are variables, as they can take different values over time, whereas m is a constant. Variables like force are predictable for a given mass and acceleration. In contrast to such ordinary variables, the face that shows when rolling a die is much harder to predict! No matter how carefully the experimental environment is controlled, the outcome of a die roll cannot be predicted with certainty. The amount of rainfall, heads or tails on a coin flip, and stock prices are examples of random variables, as they are difficult to predict and occur randomly.

Random variables are broadly classified into two categories according to the kind of outcomes they can take. For example, the outcome of rolling a die, the outcome of flipping a coin, the number of people coming to a restaurant on a particular day, etc. are discrete random variables. They are discrete because we can count the possible outcomes and assign an exact probability to each one. More on this later. Other variables cannot be counted this way, only measured. For example, the amount of rainfall, the height of a person, etc. The height of a person might not be exactly 6 feet! A finer scale of measurement might give 6.001 feet, and an even finer one might give 6.0010345 feet. There is no definite list of possible values, so the outcomes are not countable. These random variables are called continuous random variables.

So random variables are of two types.

  • Discrete Random Variables
  • Continuous Random Variables
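To make the two types concrete, here is a minimal Python sketch that draws a few samples of each kind using only the standard library. The seed and the 5.0 to 6.5 feet height range are illustrative assumptions, not values from this article.

```python
import random

rng = random.Random(42)  # fixed seed (an arbitrary choice) so the run is repeatable

# Discrete: a die roll can only land on one of six countable values.
die_rolls = [rng.randint(1, 6) for _ in range(10)]

# Continuous: a height in feet can be any real number in a range.
# The 5.0 to 6.5 range is an assumption made for illustration only.
heights = [rng.uniform(5.0, 6.5) for _ in range(10)]

print("die rolls:", die_rolls)
print("heights  :", [round(h, 4) for h in heights])
```

The die rolls repeat the same few whole numbers, while the heights almost never repeat exactly, which is the practical difference between countable and uncountable outcomes.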

Discrete Random Variables

Discrete random variables have countable outcomes and we can assign a probability to each of the outcomes. Let’s take a simple example of a discrete random variable, i.e. flipping a coin. A coin flip can result in two possible outcomes, i.e. Heads or Tails. If our interest is to calculate P(Head), then Head is our event. Since there are only two possibilities, we can also calculate P(Tail). It is 1 – P(Head). Given it’s a fair coin, we know P(Head) = 0.5. So P(Tail) = 1 – 0.5 = 0.5. We can also represent this analysis in the table shown below.

Outcome    Probability
Heads      0.5
Tails      0.5


The above table is known as the probability mass function or PMF in statistics. Let’s take another example where we roll two dice at the same time and our event is the sum of the two dice. The table below shows all possible outcomes for the event.

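Sum of two dice (rows: first die, columns: second die)

        1    2    3    4    5    6
  1     2    3    4    5    6    7
  2     3    4    5    6    7    8
  3     4    5    6    7    8    9
  4     5    6    7    8    9   10
  5     6    7    8    9   10   11
  6     7    8    9   10   11   12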

Here is what the PMF will look like.

Sum    Probability
2      1/36
3      2/36
4      3/36
5      4/36
6      5/36
7      6/36
8      5/36
9      4/36
10     3/36
11     2/36
12     1/36
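The PMF for the sum of two dice can also be built by listing all 36 equally likely outcomes and counting how often each sum appears. The following is a minimal sketch of that idea, assuming fair six-sided dice; the function name two_dice_pmf is made up for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_pmf():
    """Return {sum: probability} for two fair six-sided dice."""
    # Enumerate all 36 ordered pairs (first die, second die) and count each sum.
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    total = sum(counts.values())  # 36 equally likely outcomes
    return {s: Fraction(c, total) for s, c in sorted(counts.items())}


pmf = two_dice_pmf()
for outcome, probability in pmf.items():
    print(outcome, probability)
```

Fraction keeps the probabilities exact and prints them in lowest terms, so 6/36 is shown as 1/6. A valid PMF must add up to 1, which is easy to confirm with sum(pmf.values()).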

One can plot the PMF with the help of a simple bar graph. Here the x-axis of the bar graph shows the outcomes and the y-axis shows the probability associated with each outcome. Here is what it looks like.

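[Figure: bar graph of the PMF for the sum of two dice, with outcomes on the x-axis and probabilities on the y-axis]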
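For a quick look without a plotting library, the same bar graph can be sketched as text. This is only an illustration, assuming two fair dice: the bars are drawn sideways, so the outcomes run down the page instead of along the x-axis, and bar_chart_lines is a hypothetical helper name.

```python
from collections import Counter
from itertools import product

# Count how many of the 36 outcomes produce each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))


def bar_chart_lines(counts):
    """Build one text line per outcome; bar length equals the outcome's count."""
    total = sum(counts.values())
    lines = []
    for outcome in sorted(counts):
        bar = "#" * counts[outcome]
        lines.append(f"{outcome:>2} | {bar} {counts[outcome]}/{total}")
    return lines


print("\n".join(bar_chart_lines(counts)))
```

The longest bar belongs to 7, the most likely sum, and the bars shrink symmetrically towards 2 and 12.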

Discrete random variables are more common than we think. A lot of statistical research has been done to help us predict these random variables. Discrete random variables can be generalized through distributions. A distribution is a fancy word to convey the probability of all possible outcomes in a mathematical model. These distributions act as templates to predict the probability given certain conditions. Here is a list of a few common discrete probability distributions that we will use frequently in data science.

Bernoulli Distribution

Bernoulli distribution is a discrete probability distribution. It expects a binary outcome and a single experiment. It is a simple distribution that helps in understanding more complex ones like the binomial and multinomial distributions. Flipping a coin only one time is a good example of Bernoulli! The coin has binary outcomes, i.e. Heads or Tails, so the first criterion, a binary outcome, is met. We are flipping the coin only one time, so the number of experiments is fixed to one. So it is a Bernoulli distribution. Another example is whether an American has health insurance. Here again, the outcome is binary in nature: the American either has health insurance or doesn’t. The subject is only one American, which means the experiment is conducted only once. So, this is a Bernoulli distribution.

For any distribution, mean and variance are important statistics. The PMF makes the mean and variance calculation pretty straightforward. Multiplying each outcome by its probability and adding up the results gives the mean, also known as the expected value. The name comes from the idea of expectation, as things may or may not happen in probability. Assuming Heads is our event, the PMF will have two probabilities: one for the event, i.e. P, and the other for the complement of the event, i.e. 1 – P.

The occurrence of the event is an experimental success, so Heads is a YES or 1 and Tails is a NO or 0. Multiplying each outcome by its probability and adding gives 1 * P + 0 * (1 – P) = P. This shows that the mean of the Bernoulli distribution is the same as the probability of the event! The variance of a distribution is the squared distance of each outcome from the expected value, weighted by the respective probability and summed. For Bernoulli this is (1 – P)^2 * P + (0 – P)^2 * (1 – P) = P(1 – P). Bernoulli helps in answering Yes or No questions.
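The mean and variance can be computed straight from a PMF. Below is a minimal sketch for the fair-coin Bernoulli case, assuming P = 0.5; mean_and_variance is a hypothetical helper written for this example, and it works for any PMF given as a dictionary of outcome to probability.

```python
def mean_and_variance(pmf):
    """Compute the mean and variance of a PMF given as {outcome: probability}."""
    # Mean: each outcome weighted by its probability, then summed.
    mean = sum(x * p for x, p in pmf.items())
    # Variance: squared distance from the mean, weighted by probability, then summed.
    variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, variance


p = 0.5  # probability of Heads for a fair coin (assumed)
bernoulli_pmf = {1: p, 0: 1 - p}  # 1 = Heads (success), 0 = Tails

mean, variance = mean_and_variance(bernoulli_pmf)
print("mean:", mean, "variance:", variance)
```

For any other value of P the results should match P and P(1 – P) up to floating-point rounding.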

Binomial Distribution

The binomial distribution is an extension of Bernoulli. A discrete probability distribution is binomial if each experiment has a binary outcome, the experiment is repeated a fixed number of times n, and the experiments are independent with the same probability of success P. Simply recording the face shown when rolling a die 4 times is not binomial, because each roll has 6 possible outcomes rather than 2. Counting the Heads when flipping a coin 1000 times is binomial. Here the number of experiments is n = 1000.

If the probability of a random American having health insurance is 0.89, finding the probability that exactly 3 out of 10 Americans have health insurance is a good example. Here P = 0.89, the number of successes is k = 3, and the number of experiments is n = 10. Problems of this kind are easy to solve with a little help from permutations and combinations. Assume the first three Americans we select have health insurance and the next 7 don’t. The first three are successes in our experiment and the other 7 are not. Each of the first three has a probability of 0.89, i.e. P, and each of the seven has a probability of 0.11, i.e. 1 – P. Multiplying these probabilities and counting all possible arrangements of 3 successes among 10 people, we get the equation below.

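P(X = k) = C(n, k) * P^k * (1 – P)^(n – k), where C(n, k) = n! / (k! (n – k)!)

For the health insurance example: P(X = 3) = C(10, 3) * 0.89^3 * 0.11^7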
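The calculation described above, the number of arrangements C(n, k) multiplied by P^k and (1 – P)^(n – k), can be sketched in a few lines of Python. The function name binomial_pmf is chosen for this example; math.comb is available in the standard library from Python 3.8 onwards.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    # comb(n, k) counts the arrangements of k successes among n trials.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Health insurance example: exactly 3 insured people out of 10, with P = 0.89.
prob = binomial_pmf(3, 10, 0.89)
print(f"P(X = 3) = {prob:.3e}")
```

The result is very small, roughly 0.0000165, which makes sense: when 89% of people are insured, finding only 3 insured people among 10 is extremely unlikely.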

The mean of the binomial distribution is n * P and the variance is nP(1 – P). With n = 1 these reduce to P and P(1 – P), so one can observe that Binomial is just an extension of Bernoulli.
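These two formulas can be checked numerically by building the full binomial PMF and computing the mean and variance from it. The sketch below reuses the health insurance numbers (n = 10, P = 0.89) purely as an illustration.

```python
from math import comb

n, p = 10, 0.89  # values taken from the health insurance example

# Full binomial PMF: probability of k successes for every k from 0 to n.
pmf = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

# Mean and variance computed directly from the PMF.
mean = sum(k * prob for k, prob in pmf.items())
variance = sum((k - mean) ** 2 * prob for k, prob in pmf.items())

print("mean from PMF:", round(mean, 6), "| n * P:", round(n * p, 6))
print("variance from PMF:", round(variance, 6), "| nP(1 - P):", round(n * p * (1 - p), 6))
```

Both pairs of numbers should agree up to floating-point rounding.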

Poisson Distribution

Poisson distribution is a discrete probability distribution. When one needs to count the number of discrete events in a continuous time interval, Poisson is a good option. For example, the number of people coming to a restaurant in the next few hours and the number of lottery winners in Bangalore can be modelled with Poisson distributions. Poisson distribution expects occurrences to be random and independent. The number of trains arriving at a railway station is not Poisson, as trains come at scheduled intervals, not randomly! The number of days between two solar eclipses is not Poisson either, as we are measuring the length of a continuous interval between discrete events rather than counting events. This is the opposite scenario of Poisson and we can use the exponential distribution instead.

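The Poisson PMF is P(Y = y) = λ^y * e^(–λ) / y!. P(Y = y) denotes the probability of y occurrences in a specific interval, λ denotes the mean number of occurrences in that interval, and y denotes the number of occurrences we are interested in. Suppose 1 in 5000 light bulbs is defective. Let Y denote the number of defective bulbs in a group of 10,000. What is the chance that at least 3 of them are defective? This question asks for the number of rare, independent events in a large group of fixed size, which plays the same role as the interval, so we can apply the Poisson with λ = 10,000 / 5000 = 2. The answer is P(Y ≥ 3) = 1 – P(Y = 0) – P(Y = 1) – P(Y = 2) = 1 – 5e^(–2), which is about 0.32.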
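The light bulb question can be worked through with a short Python sketch. It assumes the usual Poisson PMF, λ^y * e^(–λ) / y!, with λ = 10,000 / 5000 = 2 expected defective bulbs; the function name poisson_pmf is made up for this example.

```python
from math import exp, factorial


def poisson_pmf(y, lam):
    """Probability of exactly y occurrences when the mean count is lam."""
    return lam**y * exp(-lam) / factorial(y)


# Expected number of defective bulbs in a group of 10,000 at a rate of 1 in 5000.
lam = 10_000 / 5_000

# "At least 3" is the complement of "0, 1 or 2" defective bulbs.
p_at_least_3 = 1 - sum(poisson_pmf(y, lam) for y in range(3))
print(f"P(at least 3 defective) = {p_at_least_3:.4f}")
```

Using the complement avoids adding up an endless list of probabilities for 3, 4, 5 and more defective bulbs.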

We have seen some of the most common discrete probability distributions. In the next article, we will see continuous probability distributions and how they are different. Don’t forget to check our other blogs. Happy learning!