ST 321 Random
Variables and Histograms
Try the following R/Splus commands:
sample(1:6,8,replace=T) # generate 8
random numbers from discrete uniform 1 to 6
rbinom(8,15,0.4) # generate 8 random numbers from
binomial with n=15 and
# success
probability 0.4
rgeom(8,0.4)+1 # generate 8 random numbers from
geometric with success
# probability
0.4 (note: we need to add 1 to the results
# as Splus uses a
different parametrization)
rpois(8,2.3) # generate 8 random poisson random
numbers with mean 2.3
runif(8,4,5.5) # generate 8 random numbers from
continuous uniform
# ranged from 4 to 5.5
rexp(8,0.6) # generate 8 random exponential
numbers with rate 0.6
rnorm(8,3,1.3) # generate 8 random normal numbers
with mean 3 and
# standard
deviation 1.3
We can use the “hist” command to plot the histogram of any set of random numbers generated. For examples,
x <- rnorm(1000,3,1.3)
hist(x,freq=T)
x <- rbinom(1000,20,0.3)
hist(x,freq=T)
x <- runif(1000,4,5.5)
hist(x,freq=T)
Now try plotting the histograms of other random numbers (e.g., geometric, exponential, etc).
Task 1. (a). Suppose we roll a die three times. Let X be the minimum of the three numbers obtained. The following R/Splus program estimates the probability density function (or probability mass function) of X :
estimatepdf1 <- function(N=10000)
{
freq <- seq(0,0,length=6)
#generate a vector of 6 zeros
for (i in 1:N) {
die1 <- sample(1:6,1,replace=T)
die2 <- sample(1:6,1,replace=T)
die3 <- sample(1:6,1,replace=T)
result <-
min(die1, die2, die3) #so "result" is a number between 1 to 6
freq[result] <-
freq[result]+1
}
freq <- freq/N
cat(freq,
"\n")
}
Run the above program.
(b). Suppose now we roll a die four times and let Y be the second
largest number obtained. Modify the
above program to estimate the probability function of Y. The R/Splus function sort may be useful.
Task 2. (a). (Problem 5.1.31 of the online downloadable text.) “In one of the first studies of the Poisson distribution, von Bortkiewicz considered the frequency of deaths from [mule] kicks in the Prussian army corps. From the study of 14 corps over a 20-year period, he obtained the data”:
|
# of deaths,x |
0 |
1 |
2 |
3 |
4 |
|
# of corps with x deaths in a given year |
144 |
91 |
32 |
11 |
2 |
Fit a Poisson distribution to these data and see if you think that the Poisson distribution is appropriate.
y
<- c(rep(0,144),rep(1,91),rep(2,32),rep(3,11),rep(4,2))
observed <- table(y)
lambda <- mean(y)
expected <- c(dpois(0:3,lambda),1-sum(dpois(0:3,lambda)))*length(y)
plot(0:4,observed,type="h",xlab="Number of
Deaths",ylab="Frequency")
lines(0:4,expected,col=8)
What do the following R/Splus functions do: rep, table, mean, dpois and length? In R/Splus try “help(rep)”.
(b). (Problem 5.1.26 of the same text.) “Feller discusses
the statistics of flying bomb hits in an area in the south of
|
# of hits, x |
0 |
1 |
2 |
3 |
4 |
5 or more |
|
# of squares with x hits |
229 |
211 |
93 |
35 |
7 |
1 |
Fit a Poisson distribution to these data and see if you think that the Poisson distribution is appropriate.
Task 3. Write a computer algorithm that simulates m hypergeometric random variables with parameters N=total number of balls in urn, k=number of red (i.e., N-k blue), and n=number of balls drawn. The R/Splus command for “n choose k” is “choose(n,k)”.