Combinatorics
Probability & Statistics
© The scientific sentence. 2010
Probability parameters
1. Distributions
Let's consider again the example of rolling two six-sided dice.
The parent population is:
S = {(1,1),(1,2),(1,3),(1,4),(1,5),(1,6),(2,1),(2,2),(2,3),(2,4),(2,5),(2,6),
(3,1),(3,2),(3,3),(3,4),(3,5),(3,6),(4,1),(4,2),(4,3),(4,4),(4,5),(4,6),
(5,1),(5,2),(5,3),(5,4),(5,5),(5,6),(6,1),(6,2),(6,3),(6,4),(6,5),(6,6)}
All the elements of S have the same probability of occurring: they are
equally likely, each with probability 1/36. If sj is an element of S
(j = 1, ..., 36) and pj = p(sj) is its probability of occurring, each
probability must satisfy the following criteria:
0 ≤ pj ≤ 1
∑ pj = 1
Any collection of numbers pj that satisfies the above criteria is
called a probability distribution. When it is associated with
the parent population, it is called the parent distribution.
In Table 2, the probabilities
{1/36,2/36,3/36,4/36,5/36,6/36,5/36,4/36,3/36,2/36,1/36}
constitute a probability distribution.
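The two criteria can be checked numerically. The sketch below assumes that the probabilities quoted from Table 2 refer to the sum of the two dice (the table itself is not reproduced here); the names S, counts, and sum_distribution are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Parent population: all 36 ordered outcomes of two six-sided dice.
S = list(product(range(1, 7), repeat=2))

# Each element of S is equally likely, with probability 1/36.
p_element = Fraction(1, len(S))

# Assumption: Table 2 lists the sum of the two dice, so group outcomes by sum.
counts = Counter(a + b for a, b in S)
sum_distribution = {total: counts[total] * p_element for total in sorted(counts)}

# Criterion 1: every probability lies between 0 and 1.
assert all(0 <= p <= 1 for p in sum_distribution.values())
# Criterion 2: the probabilities add up to exactly 1.
assert sum(sum_distribution.values()) == 1

for total in sorted(counts):
    print(total, f"{counts[total]}/36")
```

Exact fractions are used instead of floats so that the sum equals 1 without rounding error.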
Note also, from the empirical result (2.2), that if sj occurs nj times
in N trials, the counts nj satisfy the following:
0 ≤ nj ≤ N
∑ nj = N
The relative frequencies fj = nj/N therefore satisfy the criteria for a
probability distribution: 0 ≤ fj ≤ 1 and ∑ fj = 1.
The collection {fj} is the empirical distribution.
The connection between the empirical distribution and the parent
distribution is:
lim (N → ∞) fj = pj
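The limit can be illustrated with a small simulation. This is a rough sketch, assuming the outcome of interest is the sum of the two dice and using Python's pseudo-random generator in place of real rolls; empirical_distribution and parent are hypothetical names introduced for the example.

```python
import random
from collections import Counter

def empirical_distribution(n_trials, rng):
    """Roll two dice n_trials times; return the relative frequency of each sum."""
    counts = Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_trials))
    return {total: counts[total] / n_trials for total in range(2, 13)}

# Parent distribution of the sum: 1/36, 2/36, ..., 6/36, ..., 1/36.
parent = {total: (6 - abs(total - 7)) / 36 for total in range(2, 13)}

rng = random.Random(0)  # fixed seed so that a run can be repeated
for n_trials in (100, 10_000, 100_000):
    f = empirical_distribution(n_trials, rng)
    # Largest gap between an empirical frequency and its parent probability.
    worst_gap = max(abs(f[t] - parent[t]) for t in parent)
    print(n_trials, round(worst_gap, 4))
```

The printed gap should tend to shrink as N grows, although any single run fluctuates.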
This probability distribution is discrete. When the
random variables are continuous, we talk about a
probability density.
2. Distribution parameters
Probability distributions generally have the same profile
as the following curve,
which plots the relative intensity inside a black body
as a function of the wavelength of the radiation that the black body
receives from a gas at temperature T = 5000 K.
On this graph, we define two parameters: the location
and the dispersion.
2.1. Location
The location of a population is represented by three
parameters: the mode, the median, and the mean.
1. The mode is the most probable value.
2. The median divides the area between the curve and the horizontal
axis into two equal regions, one on the left and one on the right.
There is a 50% chance of obtaining a value larger than the median
and a 50% chance of obtaining a smaller one.
If the distribution is discrete, the median is the value
for which 50% of the data lies to the right and 50% of the data
lies to the left.
3. The mean value is also called the average. It is the
expectation value E(X) of a random variable X associated with
an event E, and it relates to the parent (theoretical) probability distribution.
It is written as:
μ = ∑ xi pi = ∑ xi ni / N
where xi is an outcome value, ni is the number of times it occurs,
and N is the number of trials (ni/N is its probability).
From measurements, we obtain a count mi for each xi; then
fi = mi/N is the sample distribution. In this
case, the sample mean x̄ is written as:
x̄ = ∑ xi fi = ∑ xi mi / N
where xi is an outcome value, mi is its observed frequency,
and N is the number of trials (mi/N is its relative frequency).
As we have seen, fi → pi for large N.
In this case, x̄ → μ.
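As an illustration, the three location parameters can be computed for the two-dice example. The sketch assumes the random variable is the sum of the two dice; for a discrete distribution it takes the median to be the smallest value whose cumulative probability reaches 1/2, which is one common convention among several.

```python
from fractions import Fraction

# Parent distribution of the sum of two dice (assumed random variable).
p = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}

# Mode: the value with the largest probability.
mode = max(p, key=p.get)

# Mean: mu = sum of x_i * p_i.
mu = sum(x * px for x, px in p.items())

# Median (convention assumed here): smallest x with cumulative probability >= 1/2.
cumulative = Fraction(0)
median = None
for x in sorted(p):
    cumulative += p[x]
    if cumulative >= Fraction(1, 2):
        median = x
        break

print("mode:", mode, "median:", median, "mean:", mu)
```

Because this distribution is symmetric, the three parameters coincide; for a skewed distribution they generally differ.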
2.2. Dispersion
The dispersion, or variance, shows how widely the data are spread
around the location.
It is denoted by σ² for a parent distribution
and takes the following expression:
σ² = E((x - μ)²).
Its square root, σ, is called the standard deviation. It gives the
typical distance of a value from the mean μ.
For a sample distribution it is denoted by s²
and takes the following expression:
s² = (1/(N-1)) ∑ (xi - x̄)².
Here xi is the result of the i-th measurement among N measurements, and x̄ is the sample mean.
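The sample variance can be sketched directly from this expression. The measurement values below are hypothetical, chosen only for illustration; the result is compared with statistics.variance from the Python standard library, which also divides by N - 1.

```python
import math
import statistics

# Hypothetical measurements, for illustration only.
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
N = len(x)

# Sample mean.
x_bar = sum(x) / N

# Sample variance with the N - 1 denominator.
s2 = sum((xi - x_bar) ** 2 for xi in x) / (N - 1)

# Sample standard deviation.
s = math.sqrt(s2)

# Cross-check against the standard library's sample variance.
assert math.isclose(s2, statistics.variance(x))
print(x_bar, s2, s)
```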
The expression of σ² can be transformed to:
σ² = E((x - μ)²) = E(x² - 2μx + μ²) =
(1/N) ∑ (x² - 2μx + μ²) =
(1/N) {∑ x² - 2μ ∑ x + ∑ μ²}
We have:
(1/N) ∑ μ² = (N/N) μ² = μ²,
(1/N) ∑ x = μ, and
(1/N) ∑ x² = E(x²)
Then:
σ² = E(x²) - 2μ² + μ² =
E(x²) - μ²
σ² = E(x²) - μ²
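The identity can be verified on the two-dice example. The sketch assumes the random variable is the sum of the two dice and computes the expectations as probability-weighted sums, which is equivalent to the (1/N) ∑ form when each of the 36 outcomes is counted once.

```python
from fractions import Fraction

# Parent distribution of the sum of two dice (assumed random variable).
p = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}

# Mean and mean of the squares.
mu = sum(x * px for x, px in p.items())
mean_of_squares = sum(x ** 2 * px for x, px in p.items())

# Variance from the definition E((x - mu)^2).
var_definition = sum((x - mu) ** 2 * px for x, px in p.items())

# Variance from the shortcut E(x^2) - mu^2.
var_shortcut = mean_of_squares - mu ** 2

# With exact fractions the two expressions agree exactly.
assert var_definition == var_shortcut
print(mu, mean_of_squares, var_shortcut)
```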