Plots data points using propesity scores vs. the response, separately for treatment and control groups; points are distinguished by both type and color for the two groups. Also shows (non-linear, loess-based) regression curves for both groups. The loess regresion curves are then used to derive an overall estimate of effect size (based on number and/or location of strata as set by the user). Several other statistics are also provided, for both description and inference. Graphic motivated by a suggestion of R. L. Obenchain.

loess.psa(
  response,
  treatment = NULL,
  propensity = NULL,
  family = "gaussian",
  span = 0.7,
  degree = 1,
  minsize = 5,
  xlim = c(0, 1),
  colors = c("dark blue", "dark green", "blue", "dark green"),
  legend.xy = "topleft",
  legend = NULL,
  int = 10,
  lines = TRUE,
  strata.lines = TRUE,
  rg = TRUE,
  xlab = "Estimated Propensity Scores",
  ylab = "Response",
  pch = c(16, 1),
  ...
)

Arguments

response

Either a numeric vector containing the response of interest in a propensity score analysis, or a three column array containing response, treatment and strata.

treatment

Binary variable of same length as response; 0 for 'control,' 1 for 'treatment.'

propensity

Numeric vector of estimated propensity scores.

family

Passed to loess. Either "gaussian" (default) or "symmetric".

span

Parameter passed to loess governing degree of smoothing. Default = 0.7.

degree

Parameter passed to loess governing degree of polynomials used. Default = 1

minsize

Integer. Determines the minimum number of observations in each stratum treatment group allowed. If one of the treatment groups in a given statum does not meet this minsize, then all observations in this stratum are ignored as far as the effect size calculation is concerned.

xlim

Binary vector (min, max) providing the horizontal axis minimum and maximum. Default is c(0, 1).

colors

List of four colors used for control points, treatment points, control loess line, treatment loess line respectively. Default = c("seagreen3", "goldenrod1", "seagreen4", "goldenrod3").

legend.xy

Coordinates for legend box, see legend. Default = "topleft".

legend

Binary character vector containing the text of the legend. Default is taken from treatment.

int

Integer or ordered vector. If an integer is used, it represents the maximum number of equally sized strata. Alternatively, it may be a vector of cuts of the unit interval. Lower and upper ends need not be included. See examples. Default = 10.

lines

Logical; fitted loess values are plotted by default as points. If true, values are plotted as two lines.

strata.lines

Logical; default = TRUE. Creates light vertical lines that delineate strata.

rg

Logical; if TRUE (default) then rug plots are given for treatment and control propensity score and response distributions.

xlab

X axis label, default = "Estimated Propensity Scores".

ylab

Y axis label, default = "Response".

pch

Character types for plotted points, default = c(16, 1). Note: must be of length 2 to allow different plotting points for each treatment.

...

Optional parameters passed to points command.

Value

In addition to the plot, the function returns a list with the following components:

ATE

Estimated effect size based upon (number of) strata defined by int; that is, this is the Average Treatment Effect, after propensity-based adjustment.

se.wtd

Weighted standard error based on pooling of within-strata variance estimates.

CI.95

Approximate 95% confidence interval for the overall effect size (conditional on the specification of int).

summary.strata

A table with rows corresponding to strata; first two columns show counts (by statum) for both control and treatment; followed by mean differences for all strata. for control and treatment, followed by mean differences for all strata. The weighted average difference yields the effect size noted above.

See also

Author

James E. Helmreich James.Helmreich@Marist.edu

Robert M. Pruzek RMPruzek@yahoo.com

Examples


#Artificial example where ATE should be 1 over all of (0,1).
response1 <- c(rep(1, 100), rep(2, 100), rep(3, 100)) + rnorm(300, 0, .5)
response0 <- c(rep(0, 100), rep(1, 100), rep(2, 100)) + rnorm(300, 0, .5)
response <- c(response1, response0)
treatment <- c(rep(1, 300), rep(0, 300))
propensity <- rep(seq(.01, .99, (.98/299)), 2)
a <- data.frame(response, treatment, propensity)
loess.psa(a, span = .15, degree = 1, int = c(0, .33, .67, 1))

#> $ATE
#> [1] 1.032137
#> 
#> $se.wtd
#> [1] 0.04180836
#> 
#> $CI95
#> [1] 0.9485205 1.1157539
#> 
#> $summary.strata
#>   counts.0 counts.1    means.0  means.1 diff.means
#> 1       98       98 0.01960765 1.071427  1.0518190
#> 2      104      104 0.99046930 1.978835  0.9883661
#> 3       98       98 1.99057420 3.049481  1.0589064
#> 


#Artificial example where estimates are unstable with varying
#numbers of strata. Note: sometimes get empty treatment/strata error.
rr <- c(rnorm(150, 3, .75), rnorm(700, 0, .75), rnorm(150, 3, .75),
     rnorm(150, -3, .75), rnorm(700, 0, .75), rnorm(150, -3, .75))
tt <- c(rep(1, 1000),rep(0, 1000))
pp <- NULL
for(i in 1:1000){pp <- c(pp, rnorm(1, 0, .05) + .00045*i + .25)}
for(i in 1:1000){pp <- c(pp, rnorm(1, 0, .05) + .00045*i + .4)}
a <- data.frame(rr, tt, pp)
loess.psa(a, span=.5, cex = .6)

#> $ATE
#> [1] 2.229908
#> 
#> $se.wtd
#> [1] 0.06595284
#> 
#> $CI95
#> [1] 2.098002 2.361814
#> 
#> $summary.strata
#>    counts.0 counts.1     means.0   means.1 diff.means
#> 1         6      194 -3.62525412 1.8476034  5.4728575
#> 2        37      163 -2.69650526 0.5401060  3.2366113
#> 3        84      116 -1.90239048 0.1103833  2.0127738
#> 4       104       96 -1.19364690 0.1049189  1.2985658
#> 5       104       96 -0.59154453 0.2760897  0.8676342
#> 6        87      113 -0.18171720 0.6520890  0.8338062
#> 7       102       98 -0.04066834 1.2131190  1.2537873
#> 8       117       83 -0.05978540 1.8829426  1.9427280
#> 9       163       37 -0.55506006 2.5953485  3.1504086
#> 10      196        4 -1.91705476 3.3507247  5.2677795
#> 

#Using strata of possible interest as determined by loess lines.
data(lindner)
attach(lindner)
#> The following objects are masked from lindner (pos = 3):
#> 
#>     abcix, acutemi, cardbill, diabetic, ejecfrac, female, height,
#>     lifepres, stent, ves1proc
#> The following objects are masked from lindner (pos = 4):
#> 
#>     abcix, acutemi, cardbill, diabetic, ejecfrac, female, height,
#>     lifepres, stent, ves1proc
#> The following objects are masked from lindner (pos = 5):
#> 
#>     abcix, acutemi, cardbill, diabetic, ejecfrac, female, height,
#>     lifepres, stent, ves1proc
#> The following objects are masked from lindner (pos = 6):
#> 
#>     abcix, acutemi, cardbill, diabetic, ejecfrac, female, height,
#>     lifepres, stent, ves1proc
#> The following objects are masked from lindner (pos = 7):
#> 
#>     abcix, acutemi, cardbill, diabetic, ejecfrac, female, height,
#>     lifepres, stent, ves1proc
lindner.ps <- glm(abcix ~ stent + height + female +
      diabetic + acutemi + ejecfrac + ves1proc,
      data = lindner, family = binomial)
loess.psa(log(cardbill), abcix, lindner.ps$fitted,
         int = c(.37, .56, .87, 1), lines = TRUE)
#> $ATE
#> [1] 0.1421567
#> 
#> $se.wtd
#> [1] 0.06731032
#> 
#> $CI95
#> [1] 0.007536057 0.276777324
#> 
#> $summary.strata
#>   counts.0 counts.1  means.0  means.1  diff.means
#> 1       78       77 9.364167 9.475623  0.11145618
#> 2      206      507 9.400413 9.589262  0.18884893
#> 3       12      113 9.685568 9.599461 -0.08610724
#> 
         abline(v=c(.37, 56, .87))