Probability space is defined by sample space , event space , and probabilities .

  1. The sample space is the set of all possible outcomes.
  2. The event space is the set of possible results, each result being a set of outcomes from the sample space.
  3. Probabilities are associated with each event to measure how likely it is for to occur.

Probability distributions use probability space to describe how likely a ๐Ÿช Random Variable is to take on each of its different states. These states belong in the target space .

For random variable and subset , we can associate a probability based on the probability of the particular event corresponding to . Mathematically,

These probabilities can be either discrete or continuous.

  1. If is discrete, we can specify for using the probability mass function.
  2. If is continuous, we instead specify using the cumulative distribution function.

Types of Distributions

Distributions with random variables and can have multiple types.

  1. Joint probability models probabilities for each pair .
  2. Marginal probability for finds probabilities for irrespective of what value is taken for . This is defined as .
  3. Conditional probability considers only instances for to find probabilities for and is written as .

Discrete Probabilities

With a discrete target space, each outcome has its own probability, so our probability mass function

where is the number of events with state and is the total number of events.

Continuous Probabilities

In continuous target space, we need functions to define probabilities rather than explicit fractions. The probability density function is non-negative and satisfies

Using , we can find

This value is captured by the cumulative distribution function,

Probability Rules

Given a joint distribution , we can use the sum rule to find the marginal:

Info

Many computational challenges in probabilistic modeling come from the sum rule. If we have many random variables, itโ€™s computationally expensive to calculate the sum or integral over them.

We can also relate the joint to the conditional and marginal using the product rule,

Using this property, we can derive ๐Ÿช™ Bayesโ€™ Theorem,