13.5 Transforming between Distributions

In describing the inversion method, we introduced a technique that generates samples according to some distribution by transforming canonical uniform random variables in a particular manner. Here, we will investigate the more general question of which distribution results when we transform samples from an arbitrary distribution to some other distribution with a function $f$.

Suppose we are given random variables $X_i$ that are already drawn from some PDF $p_x(x)$. Now, if we compute $Y_i = y(X_i)$, we would like to find the distribution of the new random variable $Y_i$. This may seem like an esoteric problem, but we will see that understanding this kind of transformation is critical for drawing samples from multidimensional distribution functions.

The function $y(x)$ must be a one-to-one transformation; if multiple values of $x$ mapped to the same $y$ value, then it would be impossible to unambiguously describe the probability density of a particular $y$ value. A direct consequence of $y$ being one-to-one is that its derivative must either be strictly greater than 0 or strictly less than 0, which, taking $y$ to be increasing, implies that

$$\Pr\{Y \le y(x)\} = \Pr\{X \le x\},$$

and therefore

$$P_y(y) = P_y(y(x)) = P_x(x).$$

This relationship between CDFs leads directly to the relationship between their PDFs. If we assume that $y$'s derivative is greater than 0, differentiating gives

$$p_y(y) \frac{\mathrm{d}y}{\mathrm{d}x} = p_x(x),$$

and so

$$p_y(y) = \left(\frac{\mathrm{d}y}{\mathrm{d}x}\right)^{-1} p_x(x).$$

In general, $y$'s derivative is either strictly positive or strictly negative, and the relationship between the densities is

$$p_y(y) = \left|\frac{\mathrm{d}y}{\mathrm{d}x}\right|^{-1} p_x(x).$$

How can we use this formula? Suppose that $p_x(x) = 2x$ over the domain $[0, 1]$, and let $Y = \sin X$. What is the PDF of the random variable $Y$? Because we know that $\mathrm{d}y/\mathrm{d}x = \cos x$,

$$p_y(y) = \frac{p_x(x)}{|\cos x|} = \frac{2x}{\cos x} = \frac{2 \arcsin y}{\sqrt{1 - y^2}}.$$
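This result can be checked numerically. The following Python sketch is an illustration rather than part of the derivation: it assumes $X$ is drawn from $p_x(x) = 2x$ by inversion ($X = \sqrt{\xi}$ for a uniform $\xi$), applies $Y = \sin X$, and compares the empirical CDF of $Y$ against $P_y(y) = (\arcsin y)^2$, which follows from $P_y(y) = P_x(x)$ with $P_x(x) = x^2$. All function names are hypothetical.

```python
import math
import random


def sample_x(u):
    """Inversion method for p_x(x) = 2x on [0, 1]: P_x(x) = x^2, so x = sqrt(u)."""
    return math.sqrt(u)


def pdf_y(y):
    """Density of Y = sin(X): p_y(y) = 2 asin(y) / sqrt(1 - y^2)."""
    return 2.0 * math.asin(y) / math.sqrt(1.0 - y * y)


def cdf_y(y):
    """P_y(y) = P_x(asin(y)) = asin(y)^2, valid for y in [0, sin(1)]."""
    return math.asin(y) ** 2


def empirical_cdf_y(y, n=200_000, seed=1):
    """Fraction of transformed samples sin(X) that are <= y."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if math.sin(sample_x(rng.random())) <= y)
    return hits / n
```

With enough samples, `empirical_cdf_y(0.5)` should lie close to `cdf_y(0.5)`, and a central finite difference of `cdf_y` should approximate `pdf_y` at the same point.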

This procedure may seem backward: usually we have some PDF that we want to sample from, not a given transformation. For example, we might have $X$ drawn from some $p_x(x)$ and would like to compute $Y$ from some distribution $p_y(y)$. What transformation should we use? All we need is for the CDFs to be equal, or $P_y(y) = P_x(x)$, which immediately gives the transformation

$$y(x) = P_y^{-1}(P_x(x)).$$

This is a generalization of the inversion method, since if $X$ were uniformly distributed over $[0, 1]$ then $P_x(x) = x$, and we have the same procedure as was introduced previously.
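As a small illustration of $y(x) = P_y^{-1}(P_x(x))$, the Python sketch below maps samples with source density $p_x(x) = 2x$ on $[0, 1]$ to an exponential target density $p_y(y) = \lambda e^{-\lambda y}$. The choice of target and the function names are assumptions made for this example only.

```python
import math


def cdf_x(x):
    """CDF of the source density p_x(x) = 2x on [0, 1]: P_x(x) = x^2."""
    return x * x


def cdf_y(y, lam=1.0):
    """CDF of the target exponential density p_y(y) = lam * exp(-lam * y)."""
    return 1.0 - math.exp(-lam * y)


def inv_cdf_y(u, lam=1.0):
    """Inverse of cdf_y; requires 0 <= u < 1."""
    return -math.log(1.0 - u) / lam


def transform(x, lam=1.0):
    """y(x) = P_y^{-1}(P_x(x)); requires 0 <= x < 1."""
    return inv_cdf_y(cdf_x(x), lam)
```

By construction, `cdf_y(transform(x))` equals `cdf_x(x)` up to rounding, which is exactly the CDF-matching condition. If `cdf_x` were replaced by the identity (a uniform source), `transform` would reduce to the ordinary inversion method.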

13.5.1 Transformation in Multiple Dimensions

In the general $n$-dimensional case, a similar derivation gives the analogous relationship between different densities. We will not show the derivation here; it follows the same form as the 1D case. Suppose we have an $n$-dimensional random variable $X$ with density function $p_x(x)$. Now let $Y = T(X)$, where $T$ is a bijection. In this case, the densities are related by

$$p_y(y) = p_y(T(x)) = \frac{p_x(x)}{|J_T(x)|},$$

where $|J_T|$ is the absolute value of the determinant of $T$'s Jacobian matrix, which is

$$\begin{pmatrix} \partial T_1/\partial x_1 & \cdots & \partial T_1/\partial x_n \\ \vdots & \ddots & \vdots \\ \partial T_n/\partial x_1 & \cdots & \partial T_n/\partial x_n \end{pmatrix},$$

where $T_i$ are defined by $T(x) = (T_1(x), \ldots, T_n(x))$.

13.5.2 Polar Coordinates

The polar transformation is given by

$$\begin{aligned} x &= r \cos\theta \\ y &= r \sin\theta. \end{aligned}$$

Suppose we draw samples from some density $p(r, \theta)$. What is the corresponding density $p(x, y)$? The Jacobian of this transformation is

$$J_T = \begin{pmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\ \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{pmatrix} = \begin{pmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta \end{pmatrix},$$

and the determinant is $r(\cos^2\theta + \sin^2\theta) = r$. So $p(x, y) = p(r, \theta)/r$. Of course, this is backward from what we usually want; typically we start with a sampling strategy in Cartesian coordinates and want to transform it to one in polar coordinates. In that case, we would have

$$p(r, \theta) = r \, p(x, y).$$
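To illustrate this last relationship, consider sampling the unit disk uniformly, so that $p(x, y) = 1/\pi$ and therefore $p(r, \theta) = r/\pi$. Under that assumption the marginal density of $r$ is $2r$ and $\theta$ is uniform over $[0, 2\pi)$, so inversion gives $r = \sqrt{\xi_1}$ and $\theta = 2\pi\xi_2$ for uniform $\xi_1, \xi_2$. The Python sketch below is a minimal version of that idea; the uniform-disk target and the function names are chosen here for illustration.

```python
import math
import random


def sample_uniform_disk(u1, u2):
    """Map two uniform numbers in [0, 1) to a point on the unit disk.

    Uses p(r, theta) = r / pi: r = sqrt(u1) inverts the marginal CDF r^2,
    and theta is uniform over [0, 2*pi).
    """
    r = math.sqrt(u1)
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)


def fraction_within(radius, n=100_000, seed=7):
    """Fraction of generated points whose distance from the origin is <= radius."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        x, y = sample_uniform_disk(rng.random(), rng.random())
        if x * x + y * y <= radius * radius:
            inside += 1
    return inside / n
```

If the samples are uniform in area, `fraction_within(0.5)` should approach the area ratio 0.25. Choosing $r = \xi_1$ directly would ignore the factor of $r$ in $p(r, \theta)$ and cluster points near the center.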

13.5.3 Spherical Coordinates

Given the spherical coordinate representation of directions,

$$\begin{aligned} x &= r \sin\theta \cos\phi \\ y &= r \sin\theta \sin\phi \\ z &= r \cos\theta, \end{aligned}$$

the Jacobian of this transformation has determinant $|J_T| = r^2 \sin\theta$, so the corresponding density function is

$$p(r, \theta, \phi) = r^2 \sin\theta \, p(x, y, z).$$
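The determinant $r^2 \sin\theta$ can be verified numerically. The Python sketch below estimates the Jacobian of the spherical-to-Cartesian mapping with central finite differences and evaluates its determinant; the helper names and the step size `h` are assumptions of this example.

```python
import math


def spherical_to_cartesian(r, theta, phi):
    """(r, theta, phi) -> (x, y, z) using the convention in the text."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def jacobian_det(f, p, h=1e-6):
    """Determinant of the 3x3 Jacobian of f at point p, by central differences."""
    rows = []
    for i in range(3):
        hi = list(p)
        lo = list(p)
        hi[i] += h
        lo[i] -= h
        f_hi, f_lo = f(*hi), f(*lo)
        # rows[i][j] approximates d f_j / d p_i. This is the transpose of the
        # Jacobian, which has the same determinant.
        rows.append([(a - b) / (2.0 * h) for a, b in zip(f_hi, f_lo)])
    (a, b, c), (d, e, g), (k, l, m) = rows
    return a * (e * m - g * l) - b * (d * m - g * k) + c * (d * l - e * k)
```

For instance, `jacobian_det(spherical_to_cartesian, (2.0, 0.7, 1.3))` should agree with `2.0**2 * math.sin(0.7)` to within the finite-difference error.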

This transformation is important since it helps us represent directions as points $(x, y, z)$ on the unit sphere. Remember that solid angle is defined as the area of a set of points on the unit sphere. In spherical coordinates, we previously derived

$$\mathrm{d}\omega = \sin\theta \, \mathrm{d}\theta \, \mathrm{d}\phi.$$

So if we have a density function defined over a solid angle $\Omega$, this means that

$$\Pr\{\omega \in \Omega\} = \int_{\Omega} p(\omega) \, \mathrm{d}\omega.$$

The density with respect to $\theta$ and $\phi$ can therefore be derived:

$$\begin{aligned} p(\theta, \phi) \, \mathrm{d}\theta \, \mathrm{d}\phi &= p(\omega) \, \mathrm{d}\omega \\ p(\theta, \phi) &= \sin\theta \, p(\omega). \end{aligned}$$
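As a sanity check, take the uniform density over the full sphere of directions, $p(\omega) = 1/(4\pi)$ (an example chosen for this sketch). The relationship above gives $p(\theta, \phi) = \sin\theta/(4\pi)$, which should integrate to 1 over $\theta \in [0, \pi]$ and $\phi \in [0, 2\pi]$. The Python sketch below checks this with a midpoint rule; the function names and grid resolution are hypothetical.

```python
import math


def pdf_omega(theta, phi):
    """Uniform density over all directions, measured with respect to solid angle."""
    return 1.0 / (4.0 * math.pi)


def pdf_theta_phi(theta, phi):
    """Density with respect to (theta, phi): p(theta, phi) = sin(theta) * p(omega)."""
    return math.sin(theta) * pdf_omega(theta, phi)


def integrate_theta_phi(pdf, n_theta=256, n_phi=256):
    """Midpoint-rule integral of pdf over theta in [0, pi], phi in [0, 2*pi]."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += pdf(theta, phi) * d_theta * d_phi
    return total
```

Integrating `pdf_theta_phi` yields approximately 1, whereas integrating `pdf_omega` over the same $(\theta, \phi)$ rectangle without the $\sin\theta$ factor yields $\pi/2$, which shows why the factor is needed when changing measures.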