R code for Density Estimation using Shape-Restricted Regression Splines

DENSITY ESTIMATION USING SHAPE-RESTRICTED REGRESSION SPLINES

Non-parametric estimation of probability distributions with shape and smoothness assumptions concerning the density or the hazard function

If monotonicity is imposed, the quadratic B-spline basis functions are used. If the constraints include convexity, the cubic B-spline basis functions are used.
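As a rough illustration of the two bases, the sketch below builds a quadratic and a cubic B-spline basis with bs() from R's standard splines package. The knot positions and the evaluation grid are arbitrary values chosen for illustration, and the estimation code may construct its basis differently.

```r
library(splines)

# Evaluation grid and four interior knots (illustrative values only)
x <- seq(0, 1, length.out = 101)
interior <- c(0.2, 0.4, 0.6, 0.8)

# Quadratic basis (degree 2), the kind used under a monotonicity constraint
quad <- bs(x, knots = interior, degree = 2, intercept = TRUE,
           Boundary.knots = c(0, 1))

# Cubic basis (degree 3), the kind used when convexity is involved
cub <- bs(x, knots = interior, degree = 3, intercept = TRUE,
          Boundary.knots = c(0, 1))

# With an intercept, the column count is interior knots + degree + 1
ncol(quad)  # 7
ncol(cub)   # 8
```

A quadratic spline has a piecewise-linear first derivative, so its sign only needs to be checked at the knots. The same holds for the second derivative of a cubic spline, which is why the degree changes with the constraint.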

For decreasing density estimation, the arguments are the sorted data and the number of interior knots.

The algorithm returns a list ans whose components ans$xpl and ans$theta are the x and y coordinates of the density estimate, together with the knot placement values in ans$knots.

R code for decreasing smooth density estimation using quadratic regression splines

Here is an example implementation, fitting a decreasing density to 50 observations drawn from a half-normal distribution (the absolute values of standard normal draws).

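# Simulate 50 half-normal observations and sort them
xdat <- sort(abs(rnorm(50)))

# Fit a decreasing density with 4 interior knots (dspl comes from the R code referenced above)
ans <- dspl(xdat, 4)

# Set up an empty plot frame, then draw the density estimate
plot(c(0, max(xdat)), c(0, max(ans$theta)), type = "n", xlab = "x", ylab = "density")
lines(ans$xpl, ans$theta)

# Mark the observations and the knots along the x-axis
points(xdat, rep(0, length(xdat)), pch = "|")
points(ans$knots, rep(0, length(ans$knots)), pch = "X", col = 2)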
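One way to sanity-check a returned pair of coordinates is to confirm that the curve is non-increasing and that the area under it is close to 1. The sketch below is only an illustration: trapz is a hypothetical helper, and xpl and theta are stand-ins built from a known exponential density rather than output from dspl. Whether ans$xpl spans the whole support is an assumption, so for a real fit the area may be only approximately 1.

```r
# Hypothetical helper: trapezoid-rule area under the points (x, y), x increasing
trapz <- function(x, y) {
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Stand-ins for ans$xpl and ans$theta (a known decreasing density, not a fit)
xpl <- seq(0, 10, length.out = 1001)
theta <- dexp(xpl)

# Area under the curve; close to 1 for a proper density on this grid
area <- trapz(xpl, theta)

# TRUE when the curve never increases, up to a small numerical tolerance
decreasing <- all(diff(theta) <= 1e-10)
```

With a fitted object, the same two checks would be applied to ans$xpl and ans$theta.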

For hazard function estimation, the arguments are the sorted observations, the censoring indicator vector where del=0 if an observation is censored and del=1 if not, the weight vector, and the number of interior knots.
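The inputs described above can be prepared as in the following sketch, which simulates right-censored data. The Weibull event times, the uniform censoring times, and all variable names are assumptions made for illustration. The name of the hazard estimation function is not given here, so the call itself is left out.

```r
set.seed(1)
n <- 100

# Event times with an increasing hazard (Weibull with shape > 1)
event <- rweibull(n, shape = 2, scale = 1)

# Independent censoring times
cens <- runif(n, 0, 2)

# Observed time is whichever comes first; del = 1 if the event was observed, 0 if censored
obs <- pmin(event, cens)
del <- as.integer(event <= cens)

# Sort the observations and keep the censoring indicators aligned with them
ord <- order(obs)
x <- obs[ord]
del <- del[ord]

# One observation per time point, so every weight is 1
w <- rep(1, n)
```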

The algorithm returns a pair ans$xpl and ans$theta that are the x and y coordinates of the hazard function estimate, and the knot placement values.

R code for increasing smooth hazard estimation using quadratic regression splines

Hint: The observed times should be distinct and not *very* close to one another. Binning might be necessary if adjacent values differ by less than about 1/1000 of the range of values. Use the weight vector to specify how many observations fall in each bin.
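The binning described in the hint could be done along the following lines. bin_times is a hypothetical helper, not part of the estimation code. It snaps each time to the midpoint of a grid cell whose width is 1/1000 of the range and counts the observations per cell, separately for censored and uncensored ones. A cell that contains both kinds yields two rows with the same time, which this sketch does not resolve.

```r
# Hypothetical helper: bin times on a grid of width frac * range and count per bin
bin_times <- function(x, del, frac = 1 / 1000) {
  width <- diff(range(x)) * frac
  ncell <- ceiling(1 / frac)
  # Grid cell index of each time; the maximum is folded into the last cell
  cell <- pmin(floor((x - min(x)) / width), ncell - 1)
  # Represent each cell by its midpoint
  mid <- min(x) + (cell + 0.5) * width
  # Count observations for each (midpoint, censoring indicator) combination
  out <- aggregate(data.frame(w = rep(1, length(x))),
                   by = list(x = mid, del = del), FUN = sum)
  out[order(out$x), ]
}

# Illustrative data: many closely spaced times with about 20% censoring
set.seed(2)
x <- sort(rexp(5000))
del <- rbinom(5000, 1, 0.8)

binned <- bin_times(x, del)
# binned$x: binned times, binned$del: censoring indicator, binned$w: weights
```

The columns of binned would then stand in for the sorted observations, the censoring indicator, and the weight vector.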

meyer@stat.colostate.edu