Formally, a random variable is a function that assigns a real number to each outcome in the probability space. For a uniform probability space, a discrete random variable can be defined by assigning a real-number value to each outcome; sampling it repeatedly then yields its empirical distribution.
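In code, defining a discrete random variable and sampling it to estimate the empirical distribution might look like the following sketch (the values and probabilities are illustrative, not taken from the original interactive demo):

```python
import random
from collections import Counter

def empirical_distribution(values, probabilities, n_samples=100_000, seed=0):
    """Sample a discrete random variable and estimate its distribution
    from the observed frequencies."""
    rng = random.Random(seed)
    draws = rng.choices(values, weights=probabilities, k=n_samples)
    counts = Counter(draws)
    return {v: counts[v] / n_samples for v in values}

# A random variable mapping a uniform four-outcome space to the reals:
# two outcomes map to 0, one to 1, one to 2.
dist = empirical_distribution([0, 1, 2], [0.5, 0.25, 0.25])
```

With 100,000 samples, each empirical frequency should land very close to the true probability.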
A continuous random variable, by contrast, takes on an uncountably infinite number of possible values, e.g., any value in an interval of the real line.
The Bernoulli distribution is the simplest discrete distribution. It is frequently used to represent binary experiments, such as a coin toss.
The binomial distribution is frequently used to model the number of successes in a specified number of identical binary experiments, such as the number of heads in five coin tosses. The negative binomial distribution could be used, for example, to model the number of heads that are flipped before three tails are observed in a sequence of coin tosses. The geometric distribution can be used, for example, to model the number of times a die must be rolled in order for a six to be observed.
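The binomial and geometric probabilities described above follow directly from their standard formulas; a short sketch with illustrative parameter values:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with success prob p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def geometric_pmf(k, p):
    """P(first success occurs on trial k), e.g. first six on the k-th die roll."""
    return (1 - p)**(k - 1) * p

# Probability of exactly three heads in five fair coin tosses: C(5,3)/2**5.
p_three_heads = binomial_pmf(3, 5, 0.5)
# Probability the first six appears on the third roll of a fair die.
p_six_on_third = geometric_pmf(3, 1/6)
```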
The Poisson distribution has been used to model events such as meteor showers and goals in a soccer match. The uniform distribution is a continuous distribution such that all intervals of equal length on the distribution's support have equal probability. For example, this distribution might be used to model people's full birth dates, where it is assumed that all times in the calendar year are equally likely.
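The Poisson probabilities mentioned above follow the formula P(k) = λ^k e^(−λ) / k!; a minimal sketch, with an assumed (hypothetical) goal rate:

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(exactly k events in an interval whose average event count is lam)."""
    return lam**k * exp(-lam) / factorial(k)

# If a team averages 2.5 goals per match (an assumed rate), the
# probability of exactly two goals in a match:
p_two_goals = poisson_pmf(2, 2.5)
```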
The normal or Gaussian distribution has a bell-shaped density function and is used in the sciences to represent real-valued random variables that are assumed to be additively produced by many small effects. For example the normal distribution is used to model people's height, since height can be assumed to be the result of many small genetic and evironmental factors.
Student's t-distribution, or simply the t-distribution, arises when estimating the mean of a normally distributed population in situations where the sample size is small and population standard deviation is unknown.
It is often used in hypothesis testing and in the construction of confidence intervals. The exponential distribution is the continuous analogue of the geometric distribution.
It is often used to model waiting times. The F-distribution, also known as the Fisher–Snedecor distribution, arises frequently as the null distribution of a test statistic, most notably in the analysis of variance. The gamma distribution is a general family of continuous probability distributions. The exponential and chi-squared distributions are special cases of the gamma distribution.
The beta distribution is a general family of continuous probability distributions bounded between 0 and 1. The beta distribution is frequently used as a conjugate prior distribution in Bayesian statistics.
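A sketch of the conjugate-prior property: with a Beta(a, b) prior on a coin's heads probability, observing h heads and t tails gives a Beta(a + h, b + t) posterior. The counts below are hypothetical:

```python
def beta_posterior(a, b, heads, tails):
    """Update a Beta(a, b) prior with observed coin-flip counts.
    The posterior is again a beta distribution: Beta(a + heads, b + tails)."""
    return a + heads, b + tails

# Start from a uniform prior, Beta(1, 1), then observe 7 heads and 3 tails.
a_post, b_post = beta_posterior(1, 1, heads=7, tails=3)
# The posterior mean of a Beta(a, b) is a / (a + b).
posterior_mean = a_post / (a_post + b_post)
```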
The larger the sample, the better the approximation. This visualization of the central limit theorem was adapted from Philipp Plewa's fantastic visualization. Chapter 3: Probability Distributions. A probability distribution specifies the relative likelihoods of all possible outcomes.
Random Variables. Formally, a random variable is a function that assigns a real number to each outcome in the probability space. There are two major classes of probability distributions: discrete and continuous. Many probability distributions that are important in theory or applications have been given specific names. For any set of independent random variables, the probability density function of their joint distribution is the product of their individual density functions.
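The factorization of a joint density into a product of individual densities can be sketched numerically for two independent standard normal variables:

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and std dev sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

# For independent X and Y, the joint density factorizes into a product.
def joint_pdf(x, y):
    return normal_pdf(x) * normal_pdf(y)
```

At the origin, for example, the joint density is (1/√(2π))² = 1/(2π).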
From Wikipedia, the free encyclopedia. Wikipedia list article.
Probability distributions (list). Discrete: Benford, Bernoulli, beta-binomial, binomial, categorical, hypergeometric, Poisson binomial, Rademacher, soliton, discrete uniform, Zipf, Zipf–Mandelbrot.
Continuous: Cauchy, exponential power, Fisher's z, Gaussian q, generalized normal, generalized hyperbolic, geometric stable, Gumbel, Holtsmark, hyperbolic secant, Johnson's S_U, Landau, Laplace, asymmetric Laplace, logistic, noncentral t, normal (Gaussian), normal-inverse Gaussian, skew normal, slash, stable, Student's t, type-1 Gumbel, Tracy–Widom, variance-gamma, Voigt.
Multivariate, discrete: Ewens, multinomial, Dirichlet-multinomial, negative multinomial; continuous: Dirichlet, generalized Dirichlet, multivariate Laplace, multivariate normal, multivariate stable, multivariate t, normal-inverse-gamma, normal-gamma; matrix-valued: inverse matrix gamma, inverse-Wishart, matrix normal, matrix t, matrix gamma, normal-inverse-Wishart, normal-Wishart, Wishart. Degenerate and singular: degenerate (Dirac delta function), singular (Cantor). Families: circular, compound Poisson, elliptical, exponential, natural exponential, location–scale, maximum entropy, mixture, Pearson, Tweedie, wrapped.
In my first and second introductory posts I covered notation, fundamental laws of probability and axioms.
These are the things that get mathematicians excited. However, probability theory is often useful in practice when we use probability distributions. Probability distributions are used in many fields but rarely do we explain what they are.
Often it is assumed that the reader already knows what they are (I assume this more than I should). For example, a random variable could be the outcome of the roll of a die or the flip of a coin. A probability distribution is a list of all of the possible outcomes of a random variable along with their corresponding probability values. To give a concrete example, here is the probability distribution of a fair 6-sided die.
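The table from the original isn't preserved here, but the distribution of a fair six-sided die is simply:

```python
from fractions import Fraction

# Each face of a fair six-sided die has probability 1/6.
die = {outcome: Fraction(1, 6) for outcome in range(1, 7)}

# The probabilities in a distribution must sum to one.
total = sum(die.values())
```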
To be explicit, this is an example of a discrete univariate probability distribution with finite support. Discrete means the outcomes take separated values: I can have an outcome of 1 or 2, but not 1.5. You can probably guess that when we get to continuous probability distributions this is no longer the case (it gets weird). Univariate means we only have one variable, in this case the outcome of the die roll. In contrast, if we have more than one variable then we say that we have a multivariate distribution. The support is essentially the set of outcomes for which the probability distribution is defined.
So the support in our example is {1, 2, 3, 4, 5, 6}. And since this is not an infinite number of values, it means that the support is finite. In the above example of rolling a six-sided die, there were only six possible outcomes, so we could write down the entire probability distribution in a table.
In many scenarios, the number of outcomes can be much larger and hence a table would be tedious to write down. Worse still, the number of possible outcomes could be infinite, in which case, good luck writing a table for that. To get around the problem of writing a table for every distribution, we can define a function instead. The function allows us to define a probability distribution succinctly. On a very abstract level, a function is a box that takes an input and returns an output.
For the vast majority of cases, the function actually has to do something with the input for the output to be useful. Graphically, our function can be drawn as a box with an arrow in for the input and an arrow out for the output. Now, it would be tedious to draw that diagram for every function that we want to create.
So instead of a diagram, the function can be written compactly in mathematical notation. This is better; however, a name alone still does not tell us what the function is doing, so we define the function mathematically as an explicit rule that maps each input to an output. One of the main takeaways from this is that with a function we can see how we would transform any input.
For example, we could write a function in a programming language that takes a string of text as input and outputs the first letter of that string. Here is an example of this function in the Python programming language.
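The code sample referenced here didn't survive extraction; a minimal stand-in (the function name is my own) might look like:

```python
def first_letter(text: str) -> str:
    """Return the first character of a string."""
    if not text:
        raise ValueError("expected a non-empty string")
    return text[0]

result = first_letter("probability")
```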
Given that one of the main benefits of functions is to allow us to know how to transform any input, we can also use this knowledge to visualise the function explicitly.
One of the most important features of functions is parameters. The reason that parameters are important is that they play a direct role in determining the output: change a parameter and the output can be completely different for the same input. A distribution represents the possible values a random variable can take and how often they occur.
The taller the distribution is around its middle, the more of the data falls within that interval, as shown in the figure. A comparison table showing the differences between discrete and continuous distributions is given here.
In the Bernoulli distribution there is only one trial and only two possible outcomes, i.e., success and failure. A sequence of identical Bernoulli events is called binomial and follows a binomial distribution. In the uniform distribution all the outcomes are equally likely.
The Poisson distribution is used to determine how likely it is that a certain event occurs over a given interval of time or distance. The normal distribution is the distribution that most natural events follow. The main characteristics of the normal distribution are its symmetric, bell-shaped density and the fact that its mean, median, and mode coincide. The chi-squared distribution is frequently used in hypothesis testing.
It is mostly used to test goodness of fit. The exponential distribution is usually observed in events whose likelihood changes considerably early on. The logistic distribution is used to observe how continuous variable inputs can affect the probability of a binary result.
For a continuous random variable we cannot add up the probabilities of individual values to find the probability of an interval, because there are uncountably many of them. A probability distribution is a statistical function that describes all the possible values and likelihoods that a random variable can take within a given range. This range will be bounded between the minimum and maximum possible values, but precisely where a possible value is likely to fall on the probability distribution depends on a number of factors.
These factors include the distribution's mean (average), standard deviation, skewness, and kurtosis. Perhaps the most common probability distribution is the normal distribution, or "bell curve," although several distributions exist that are commonly used.
Typically, the data-generating process of some phenomenon will dictate its probability distribution; for a continuous variable, the function that describes the relative likelihood of each value is called the probability density function. Academics, financial analysts, and fund managers alike may determine a particular stock's probability distribution to evaluate the possible expected returns that the stock may yield in the future.
The stock's history of returns, which can be measured from any time interval, will likely be composed of only a fraction of the stock's returns, which will subject the analysis to sampling error. By increasing the sample size, this error can be dramatically reduced. There are many different classifications of probability distributions.
Some of them include the normal distribution, chi-square distribution, binomial distribution, and Poisson distribution. The different probability distributions serve different purposes and represent different data-generating processes. The binomial distribution, for example, evaluates the probability of an event occurring several times over a given number of trials, given the event's probability in each trial.
Another typical example would be to use a fair coin and figure the probability of that coin coming up heads in 10 straight flips. A binomial distribution is discrete, as opposed to continuous, since only whole-number counts of successes are valid outcomes. The most commonly used distribution is the normal distribution, which is used frequently in finance, investing, science, and engineering.
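For the coin example, the chance of 10 heads in 10 flips falls straight out of the binomial formula; a brief sketch:

```python
from math import comb

n, p = 10, 0.5
# With p = 0.5, every sequence of n flips is equally likely, so
# P(k heads) = C(n, k) / 2**n.
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}
p_all_heads = pmf[10]   # only 1 of the 1024 equally likely sequences
```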
The normal distribution is fully characterized by its mean and standard deviation: the distribution is not skewed, and its kurtosis is fixed (it exhibits no excess kurtosis).
This makes the distribution symmetric, and it is depicted as a bell-shaped curve when plotted. The standard normal distribution has a mean (average) of zero and a standard deviation of 1. Unlike the binomial distribution, the normal distribution is continuous, meaning that all values in its range are possible, not just whole-number counts.
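The standard normal's density and cumulative distribution can be sketched with only the math module, using the error function for the CDF:

```python
from math import erf, exp, pi, sqrt

def std_normal_pdf(x):
    """Density of the standard normal (mean 0, standard deviation 1)."""
    return exp(-0.5 * x * x) / sqrt(2 * pi)

def std_normal_cdf(x):
    """P(X <= x) for a standard normal, via the error function."""
    return 0.5 * (1 + erf(x / sqrt(2)))
```

Symmetry shows up directly: the density is identical at x and -x, and the CDF at zero is exactly one half.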
In fact, because stock prices are bounded below by zero but offer a potentially unlimited upside, the distribution of stock returns has been described as log-normal. This shows up on a plot of stock returns with the tails of the distribution having greater thickness.
Probability distributions are often used in risk management as well, to evaluate the probability and amount of losses that an investment portfolio would incur based on a distribution of historical returns. One popular risk management metric used in investing is value-at-risk (VaR). VaR yields the minimum loss that can occur given a probability and time frame for a portfolio. Alternatively, an investor can get a probability of loss for an amount of loss and time frame using VaR.
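A simplified sketch of historical VaR: take the empirical distribution of past returns and read off the loss at the chosen tail quantile. The returns below are hypothetical, and real implementations handle interpolation and sample-size issues more carefully:

```python
def historical_var(returns, confidence=0.95):
    """Value-at-risk from historical returns: the loss threshold exceeded
    with probability (1 - confidence), reported as a positive number."""
    ordered = sorted(returns)
    index = int((1 - confidence) * len(ordered))
    return -ordered[index]

# Hypothetical daily returns of a portfolio.
returns = [0.01, -0.02, 0.005, -0.03, 0.02, -0.01, 0.015, -0.005, 0.0, 0.01]
var_95 = historical_var(returns, 0.95)
```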
Misuse of and overreliance on VaR has been implicated as one of the major causes of the 2008 financial crisis. As a simple example of a probability distribution, let us look at the number observed when rolling two standard six-sided dice.
Although this may sound like something technical, the phrase probability distribution is really just a way to talk about organizing a list of probabilities.
A probability distribution is a function or rule that assigns probabilities to each value of a random variable. The distribution may in some cases be listed. In other cases, it is presented as a graph. Suppose that we roll two dice and then record the sum of the dice.
Sums anywhere from two to 12 are possible. Each sum has a particular probability of occurring. We can simply list these as follows: P(2) = 1/36, P(3) = 2/36, P(4) = 3/36, P(5) = 4/36, P(6) = 5/36, P(7) = 6/36, P(8) = 5/36, P(9) = 4/36, P(10) = 3/36, P(11) = 2/36, P(12) = 1/36. This list is a probability distribution for the probability experiment of rolling two dice.
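These probabilities can be enumerated directly by counting the 36 equally likely pairs of faces:

```python
from collections import Counter
from fractions import Fraction

# All 36 equally likely (die1, die2) outcomes, grouped by their sum.
counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))
dist = {s: Fraction(counts[s], 36) for s in range(2, 13)}
```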
We can also consider the above as a probability distribution of the random variable defined by looking at the sum of the two dice. A probability distribution can be graphed, and sometimes this helps to show us features of the distribution that were not apparent from just reading the list of probabilities.
The random variable is plotted along the x-axis, and the corresponding probability is plotted along the y-axis. For a discrete random variable, we will have a histogram. For a continuous random variable, we will have a smooth curve. The rules of probability are still in effect, and they manifest themselves in a few ways. Since probabilities are greater than or equal to zero, the graph of a probability distribution must have y-coordinates that are nonnegative.
Another feature of probabilities, namely that one is the maximum that the probability of an event can be, shows up in another way. The graph of a probability distribution is constructed in such a way that areas represent probabilities. For a discrete probability distribution, we are really just calculating the areas of rectangles. In the graph above, the areas of the three bars corresponding to four, five and six correspond to the probability that the sum of our dice is four, five or six.
The areas of all of the bars add up to a total of one. In the standard normal distribution, or bell curve, we have a similar situation. The area under the curve between two z-values corresponds to the probability that our variable falls between those two values. For example, the area under the bell curve between z = -1 and z = 1 is approximately 0.68. There are literally infinitely many probability distributions.
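The area-equals-probability idea can be checked numerically: a simple midpoint Riemann sum over the standard normal density between z = -1 and z = 1 lands near 0.68.

```python
from math import exp, pi, sqrt

def area_under_bell(a, b, steps=100_000):
    """Approximate the area under the standard normal curve between a and b
    with a midpoint Riemann sum."""
    width = (b - a) / steps
    return sum(
        exp(-0.5 * (a + (i + 0.5) * width) ** 2) / sqrt(2 * pi)
        for i in range(steps)
    ) * width

p = area_under_bell(-1, 1)
```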
A list of some of the more important distributions follows. By Courtney Taylor, Professor of Mathematics.
Updated March 06. Binomial distribution: gives the number of successes for a series of independent experiments with two outcomes. Chi-square distribution: used for determining how closely observed quantities fit a proposed model. F-distribution: used in the analysis of variance (ANOVA). Normal distribution: called the bell curve, and found throughout statistics. Last updated on November 14. Probability can be used for more than calculating the likelihood of one event; it can summarize the likelihood of all possible outcomes.
A Gentle Introduction to Probability Distributions
A thing of interest in probability is called a random variable, and the relationship between each possible outcome for a random variable and their probabilities is called a probability distribution. Probability distributions are an important foundational concept in probability and the names and shapes of common probability distributions will be familiar. The structure and type of the probability distribution varies based on the properties of the random variable, such as continuous or discrete, and this, in turn, impacts how the distribution might be summarized or how to calculate the most likely outcome and its probability.
Discover Bayes optimization, naive Bayes, maximum likelihood, distributions, cross entropy, and much more in my new book, with 28 step-by-step tutorials and full Python source code. A random variable is a quantity that is produced by a random process.
In probability, a random variable can take on one of many possible values. A specific value or set of values for a random variable can be assigned a probability. In probability modeling, example data or instances are often thought of as being events, observations, or realizations of underlying random variables. A random variable is often denoted as a capital letter, e.g., X, and values of the random variable are denoted as a lowercase letter and an index, e.g., x1, x2, x3. Upper-case letters like X denote a random variable, while lower-case letters like x denote the value that the random variable takes. The values that a random variable can take are called its domain, and the domain of a random variable may be discrete or continuous. Variables in probability theory are called random variables and their names begin with an uppercase letter.
A discrete random variable has a finite set of states: for example, colors of a car. A random variable that has values true or false is discrete and is referred to as a Boolean random variable: for example, a coin toss. A continuous random variable has a range of numerical values: for example, the height of humans.
The probability of a random variable is denoted as a function using the upper case P or Pr; for example, P(X) is the probability of all values for the random variable X. As a distribution, the mapping of the values of a random variable to a probability has a shape when all values of the random variable are lined up. The distribution also has general properties that can be measured.
Two important properties of a probability distribution are the expected value and the variance. Mathematically, these are referred to as the first and second moments of the distribution. Other moments include the skewness (3rd moment) and the kurtosis (4th moment).
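For a discrete distribution given by a pmf, the first two moments follow directly from their definitions; a sketch over a hypothetical pmf:

```python
# A hypothetical discrete distribution: value -> probability.
pmf = {0: 0.1, 1: 0.3, 2: 0.4, 3: 0.2}

# Expected value (first moment): sum of value * probability.
mean = sum(x * p for x, p in pmf.items())
# Variance (second central moment): expected squared deviation from the mean.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
```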