Package 'factormodel' reference manual

Title:	Factor Model Estimation Using Proxy Variables
Description:	Functions to estimate a factor model using discrete and continuous proxy variables. The function 'dproxyme' estimates a factor model of discrete proxy variables using an EM algorithm (Dempster, Laird, and Rubin (1977) <doi:10.1111/j.2517-6161.1977.tb01600.x>; Hu (2008) <doi:10.1016/j.jeconom.2007.12.001>; Hu (2017) <doi:10.1016/j.jeconom.2017.06.002>). The function 'cproxyme' estimates a linear factor model (Cunha, Heckman, and Schennach (2010) <doi:10.3982/ECTA6551>).
Authors:	Yujung Hwang [aut, cre]
Maintainer:	Yujung Hwang <[email protected]>
License:	GPL-3
Version:	1.0
Built:	2025-03-21 03:33:10 UTC
Source:	https://github.com/yujunghwang/factormodel

cproxyme

Description

This function estimates a linear factor model using continuous proxy variables. The model has the following form:

proxy = intercept + factorloading * (latent variable) + measurement error

The measurement error is assumed to follow a Normal distribution with mean zero and a variance that is estimated.
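To make the model concrete, the sketch below simulates three proxies from a linear factor model of this form and recovers the loadings from ratios of covariances, an identification argument in the spirit of Cunha, Heckman, and Schennach (2010). It uses base R only. All parameter values and variable names are illustrative assumptions, and the sketch is not a description of how cproxyme is implemented internally.

```r
set.seed(123)
n <- 20000

# Latent variable (illustrative values, not taken from the package)
theta <- rnorm(n, mean = 2, sd = 1.5)

# proxy = intercept + factorloading * theta + measurement error
# The anchor (proxy1) has intercept 0 and loading 1.
proxy1 <-  0 + 1.0 * theta + rnorm(n, sd = 0.5)
proxy2 <-  1 + 0.5 * theta + rnorm(n, sd = 0.7)
proxy3 <- -2 + 2.0 * theta + rnorm(n, sd = 1.0)

# With mutually independent errors,
# cov(proxy_j, proxy_k) = loading_j * loading_k * var(theta),
# so covariance ratios identify loadings relative to the anchor.
loading2 <- cov(proxy2, proxy3) / cov(proxy1, proxy3)
loading3 <- cov(proxy3, proxy2) / cov(proxy1, proxy2)

# The anchor normalization implies E[theta] = E[proxy1].
mean_theta <- mean(proxy1)
intercept2 <- mean(proxy2) - loading2 * mean_theta
intercept3 <- mean(proxy3) - loading3 * mean_theta

# Variance of the latent variable and of the anchor's measurement error
var_theta <- cov(proxy1, proxy2) / loading2
varnu1    <- var(proxy1) - var_theta

round(c(loading2 = loading2, loading3 = loading3,
        intercept2 = intercept2, intercept3 = intercept3,
        var_theta = var_theta, varnu1 = varnu1), 2)
```

With a large sample the recovered values should be close to the simulated ones (loadings 0.5 and 2, intercepts 1 and -2, latent variance 2.25).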

Usage

cproxyme(dat, anchor = 1, weights = NULL)

Arguments

`dat`	A data frame of proxy variables, with one proxy variable per column.
`anchor`	The column index of the anchoring proxy variable. Default is 1, meaning the first column of dat is used as the anchoring variable.
`weights`	An optional weight vector.

Value

Returns a list of 5 components:

alpha0: A vector of intercepts in the linear factor model. The k-th entry is the intercept for the k-th proxy variable.
alpha1: A vector of factor loadings. The k-th entry is the factor loading of the k-th proxy variable. The factor loading of the anchoring variable is normalized to 1.
varnu: A vector of measurement error variances. The k-th entry is the variance of the measurement error in the k-th proxy variable. The measurement error is assumed to follow a Normal distribution with mean 0.
mtheta: The mean of the latent variable. It is equal to the mean of the anchoring proxy variable.
vartheta: The variance of the latent variable.
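One way to use these components is to predict the latent variable for an observation from its proxy values. The sketch below computes the posterior mean and variance of the latent variable under the added assumption that the latent variable itself is Normal, which the package documentation does not state. The list est is hypothetical: it only mimics the component names above, and latent_posterior is an illustrative helper, not a package function.

```r
# Hypothetical estimates with the same component names as the returned list
est <- list(
  alpha0   = c(0, 1, -2),
  alpha1   = c(1, 0.5, 2),
  varnu    = c(0.25, 0.49, 1),
  mtheta   = 2,
  vartheta = 2.25
)

# Posterior mean and variance of the latent variable given one row of
# proxies z, ASSUMING the latent variable is Normal(mtheta, vartheta)
# and measurement errors are independent Normal(0, varnu[k]).
latent_posterior <- function(z, est) {
  precision <- 1 / est$vartheta + sum(est$alpha1^2 / est$varnu)
  post_var  <- 1 / precision
  post_mean <- post_var * (est$mtheta / est$vartheta +
                           sum(est$alpha1 * (z - est$alpha0) / est$varnu))
  c(mean = post_mean, var = post_var)
}

# Proxies that are exactly consistent with a latent value of 2
z <- est$alpha0 + est$alpha1 * 2
post <- latent_posterior(z, est)
post
```

Because the proxies in this example agree exactly with the prior mean, the posterior mean equals mtheta, while the posterior variance is smaller than vartheta.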

Author(s)

Yujung Hwang, [email protected]

References

Cunha, F., Heckman, J. J., & Schennach, S. M. (2010): Estimating the technology of cognitive and noncognitive skill formation. Econometrica, 78(3), 883-931. doi:10.3982/ECTA6551
Hwang, Yujung (2021): Bounding Omitted Variable Bias Using Auxiliary Data. Working Paper. doi:10.2139/ssrn.3866876

Examples

dat1 <- data.frame(proxy1=c(1,2,3),proxy2=c(0.1,0.3,0.6),proxy3=c(2,3,5))
cproxyme(dat=dat1,anchor=1)
## you can specify weights
cproxyme(dat=dat1,anchor=1,weights=c(0.1,0.5,0.4))

dproxyme

Description

This function estimates measurement stochastic matrices of discrete proxy variables.
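A measurement stochastic matrix can be pictured with a small simulation. In the sketch below, M[i, j] is the probability of observing the j-th response value of a proxy when the latent type is i, so each row sums to 1. The matrix, type shares, and sample size are illustrative assumptions. The latent type is treated as observed here only to show what the matrix means; dproxyme addresses the case where the type is unobserved, using an EM algorithm.

```r
set.seed(42)
n    <- 10000
sbar <- 2

# Illustrative measurement matrix for one proxy with 3 response values:
# row i = latent type, column j = response value
M <- matrix(c(0.7, 0.2, 0.1,
              0.1, 0.3, 0.6),
            nrow = sbar, byrow = TRUE)
stopifnot(all(abs(rowSums(M) - 1) < 1e-12))

# Draw latent types, then draw each proxy response from the matching row of M
type  <- sample(1:sbar, size = n, replace = TRUE, prob = c(0.4, 0.6))
proxy <- vapply(type,
                function(i) sample(1:3, size = 1, prob = M[i, ]),
                integer(1))

# Conditional response frequencies by type approximate the rows of M
M_hat <- prop.table(table(type, proxy), margin = 1)
round(M_hat, 2)
```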

Usage

dproxyme(
  dat,
  sbar = 2,
  initvar = 1,
  initvec = NULL,
  seed = 210313,
  tol = 0.005,
  maxiter = 200,
  miniter = 10,
  minobs = 100,
  maxiter2 = 1000,
  trace = FALSE,
  weights = NULL
)

Arguments

`dat`	A data frame of proxy variables, with one proxy variable per column.
`sbar`	The number of discrete latent types. Default is 2.
`initvar`	A column index of a proxy variable to initialize the EM algorithm. Default is 1. That is, the proxy variable in the first column of "dat" is used for initialization.
`initvec`	This vector defines how to group the initvar to initialize the EM algorithm.
`seed`	Seed. Default is 210313 (birthday of this package).
`tol`	A tolerance for EM algorithm. Default is 0.005.
`maxiter`	A maximum number of iterations for EM algorithm. Default is 200.
`miniter`	A minimum number of iterations for EM algorithm. Default is 10.
`minobs`	Compute likelihood of a proxy variable only if there are more than "minobs" observations. Default is 100.
`maxiter2`	Maximum number of iterations for "multinom". Default is 1000.
`trace`	Whether to trace EM algorithm progress. Default is FALSE.
`weights`	An optional weight vector

Value

Returns a list of 5 components :

M_param: A list of estimated measurement (stochastic) matrices. The k-th matrix is the measurement matrix of the proxy variable stored in the k-th column of the dat data frame (or matrix). The ij-th element of a measurement matrix is the probability of observing the j-th (largest) proxy response value, conditional on the latent type being i.
M_param_col: A list of column labels of the 'M_param' matrices.
M_param_row: A list of row labels of the 'M_param' matrices. It is simply c(1:sbar).
mparam: A list of multinomial logit coefficients that were used to compute the 'M_param' matrices. These coefficients are useful for computing the likelihood of proxy responses.
typeprob: A type probability matrix of size N-by-sbar. The ij-th entry gives the probability that observation i has type j.
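The sketch below shows a few common things one might do with output of this shape: assign each observation its most likely type, estimate type shares, and compute the implied marginal distribution of a proxy. The matrices are hypothetical values typed in by hand; typeprob and M1 only mimic the documented shapes of typeprob and M_param[[1]] and are not actual dproxyme output.

```r
# Hypothetical type probability matrix (N = 4 observations, sbar = 2 types)
typeprob <- matrix(c(0.9, 0.1,
                     0.3, 0.7,
                     0.5, 0.5,
                     0.2, 0.8),
                   ncol = 2, byrow = TRUE)

# Hypothetical measurement matrix, standing in for M_param[[1]]
M1 <- matrix(c(0.7, 0.2, 0.1,
               0.1, 0.3, 0.6),
             nrow = 2, byrow = TRUE)

# Most likely latent type per observation (ties go to the first type)
assigned_type <- max.col(typeprob, ties.method = "first")

# Estimated share of each type in the sample
type_share <- colMeans(typeprob)

# Implied marginal distribution of the first proxy's response values:
# P(response j) = sum over types i of P(type i) * M1[i, j]
marginal <- as.vector(type_share %*% M1)

assigned_type
type_share
marginal
```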

Author(s)

Yujung Hwang, [email protected]

References

Dempster, Arthur P., Nan M. Laird, and Donald B. Rubin (1977): "Maximum likelihood from incomplete data via the EM algorithm." Journal of the Royal Statistical Society: Series B (Methodological) 39.1 : 1-22. doi:10.1111/j.2517-6161.1977.tb01600.x
Hu, Yingyao (2008): Identification and estimation of nonlinear models with misclassification error using instrumental variables: A general solution. Journal of Econometrics, 144(1), 27-61. doi:10.1016/j.jeconom.2007.12.001
Hu, Yingyao (2017): The econometrics of unobservables: Applications of measurement error models in empirical industrial organization and labor economics. Journal of Econometrics, 200(2), 154-168. doi:10.1016/j.jeconom.2017.06.002
Hwang, Yujung (2021): Identification and Estimation of a Dynamic Discrete Choice Models with Endogenous Time-Varying Unobservable States Using Proxies. Working Paper. doi:10.2139/ssrn.3535098
Hwang, Yujung (2021): Bounding Omitted Variable Bias Using Auxiliary Data. Working Paper. doi:10.2139/ssrn.3866876

Examples

dat1 <- data.frame(proxy1=c(1,2,3),proxy2=c(2,3,4),proxy3=c(4,3,2))
## the default minobs is 100, so lower it for this small example
dproxyme(dat=dat1,sbar=2,initvar=1,minobs=3)
## you can specify weights
dproxyme(dat=dat1,sbar=2,initvar=1,minobs=3,weights=c(0.1,0.5,0.4))


makeDummy

Description

This function creates dummy variables from a discrete variable.

Usage

makeDummy(tZ)

Arguments

`tZ`	An input vector

Value

Returns dZ, a matrix of size length(tZ)-by-card(tZ), where card(tZ) is the number of distinct values in tZ:

The ij-th element of dZ is 1 if tZ[i] is equal to the j-th largest value of tZ, and 0 otherwise. Each row of dZ sums to 1 by construction.
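The construction can be sketched in base R as follows. The function make_dummy_sketch is a hypothetical stand-in written for illustration, not the package source. In particular, it orders columns by ascending distinct value, which is an assumption and may differ from the column order makeDummy uses.

```r
# Hypothetical stand-in for illustration; not the package implementation
make_dummy_sketch <- function(tZ) {
  vals <- sort(unique(tZ))                # distinct values, ascending (assumed order)
  dZ <- outer(tZ, vals, FUN = "==") * 1   # length(tZ)-by-length(vals) matrix of 0/1
  colnames(dZ) <- vals
  dZ
}

dZ <- make_dummy_sketch(c(3, 1, 2, 3))
dZ
rowSums(dZ)   # every row sums to 1
```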

Author(s)

Yujung Hwang, [email protected]

Examples

makeDummy(c(1,2,3))

weighted.cov

Description

This function computes an unbiased sample weighted covariance. It uses only pairwise complete observations.

Usage

weighted.cov(x, y, w = NULL)

Arguments

`x`	An input vector to compute a covariance, cov(x,y)
`y`	An input vector to compute a covariance, cov(x,y)
`w`	A weight vector

Value

Returns an unbiased sample weighted covariance
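For reference, one standard definition of an unbiased weighted covariance normalizes the weights to sum to 1 and divides the weighted cross-product by 1 - sum(w^2), which is the correction stats::cov.wt applies with method = "unbiased". The sketch below implements that definition. Whether weighted.cov uses exactly this formula is an assumption; weighted_cov_sketch is a hypothetical helper, not the package source.

```r
# Hypothetical stand-in for illustration; not the package implementation
weighted_cov_sketch <- function(x, y, w = NULL) {
  if (is.null(w)) w <- rep(1, length(x))
  keep <- complete.cases(x, y, w)       # keep pairwise complete observations
  x <- x[keep]
  y <- y[keep]
  w <- w[keep] / sum(w[keep])           # normalize weights to sum to 1
  mx <- sum(w * x)                      # weighted means
  my <- sum(w * y)
  sum(w * (x - mx) * (y - my)) / (1 - sum(w^2))
}

x <- c(1, 3, 5)
y <- c(2, 3, 1)
w <- c(0.1, 0.5, 0.4)

weighted_cov_sketch(x, y)       # equal weights reproduce cov(x, y)
weighted_cov_sketch(x, y, w)

# Cross-check against stats::cov.wt, which applies the same correction
cov.wt(cbind(x, y), wt = w, method = "unbiased")$cov["x", "y"]
```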

Author(s)

Yujung Hwang, [email protected]

Examples

# If you do not specify weights, 
# it returns the usual unweighted sample covariance 
weighted.cov(x=c(1,3,5),y=c(2,3,1)) 

weighted.cov(x=c(1,3,5),y=c(2,3,1),w=c(0.1,0.5,0.4))

weighted.var

Description

This function computes an unbiased sample weighted variance.

Usage

weighted.var(x, w = NULL)

Arguments

`x`	A vector to compute a variance, var(x)
`w`	A weight vector

Value

Returns an unbiased sample weighted variance
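To see what the bias correction does, the sketch below compares the uncorrected ("ML") and unbiased weighted variances from stats::cov.wt in base R. This is a comparison under the assumption that weighted.var follows the same definition of "unbiased" as cov.wt; the package's exact formula is not stated in this manual.

```r
x <- c(1, 3, 5)
w <- c(0.1, 0.5, 0.4)

# cov.wt expects a matrix or data frame; weights are normalized internally
v_unbiased <- cov.wt(matrix(x), wt = w, method = "unbiased")$cov[1, 1]
v_ml       <- cov.wt(matrix(x), wt = w, method = "ML")$cov[1, 1]

# The two differ by the factor 1 / (1 - sum(w^2)) for weights summing to 1
c(unbiased = v_unbiased, ml = v_ml, ratio = v_unbiased / v_ml)

# With equal weights the unbiased version matches var(x)
v_equal <- cov.wt(matrix(x), method = "unbiased")$cov[1, 1]
c(cov_wt = v_equal, var = var(x))
```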

Author(s)

Yujung Hwang, [email protected]

Examples

## If you do not specify weights, 
## it returns the usual unweighted sample variance
weighted.var(x=c(1,3,5)) 

weighted.var(x=c(1,3,5),w=c(0.1,0.5,0.4))
