Binary regression model with misclassification and Berkson-type measurement error with Student-t distribution
DOI:
https://doi.org/10.21754/iecos.v24i2.2003
Keywords:
Binary regression model, Berkson-type error, misclassification, Student-t distribution
Abstract
In this article, we introduce a regression model for binary data affected by misclassification in the response variable and Berkson-type measurement error in the covariate. The conventional assumption of a normal distribution for the measurement error may inadequately represent atypical observations in the dataset. To address this limitation, our model accounts for misclassification in the response variable and uses the Student-t distribution for the Berkson-type measurement error, which accommodates atypical observations more robustly. We use the cumulative distribution function of the Student-t distribution as the link function. Model parameters are estimated by maximum likelihood. We conduct a Monte Carlo simulation study to assess the impact of measurement error and misclassification. Additionally, we apply the proposed model to a real dataset of survivors of the atomic bombing in Japan, illustrating its use in practice. Our findings highlight the robustness and flexibility of the model in binary regression settings involving measurement error and misclassification.
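As a rough illustration of the structure the abstract describes, a model of this kind is commonly written as below. This is a generic sketch, not the paper's stated parameterization: the symbols $w_i$, $u_i$, $\sigma_u$, $\nu_u$, $\nu$, $\beta_0$, $\beta_1$, $\pi_0$, $\pi_1$ are all assumptions introduced here, as is the assumption that misclassification does not depend on the covariate given the true response.

```latex
% Berkson-type error (assumed form): the true covariate x_i is the
% observed value w_i plus a Student-t distributed error u_i.
x_i = w_i + u_i, \qquad u_i \sim t_{\nu_u}(0, \sigma_u^2)

% True (unobserved) response Y_i, with the Student-t cdf F_\nu as link.
P(Y_i = 1 \mid x_i) = F_{\nu}(\beta_0 + \beta_1 x_i)

% Observed response \tilde{Y}_i with misclassification probabilities
%   \pi_0 = P(\tilde{Y}_i = 1 \mid Y_i = 0), \quad \pi_1 = P(\tilde{Y}_i = 0 \mid Y_i = 1).
P(\tilde{Y}_i = 1 \mid x_i)
  = (1 - \pi_1)\, F_{\nu}(\beta_0 + \beta_1 x_i) + \pi_0 \,\{1 - F_{\nu}(\beta_0 + \beta_1 x_i)\}
  = \pi_0 + (1 - \pi_0 - \pi_1)\, F_{\nu}(\beta_0 + \beta_1 x_i)

% Likelihood: integrate out the unobserved error, with f_u the density of u_i.
p_i = \int \bigl\{ \pi_0 + (1 - \pi_0 - \pi_1)\, F_{\nu}\bigl(\beta_0 + \beta_1 (w_i + u)\bigr) \bigr\}\, f_u(u)\, du,
\qquad
L = \prod_{i=1}^{n} p_i^{\tilde{y}_i} (1 - p_i)^{1 - \tilde{y}_i}
```

Under these assumptions the integral for $p_i$ generally has no closed form, so maximizing $L$ would rely on numerical integration or a similar approximation.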
License
Copyright (c) 2023 Marcos Antonio Alves Pereira, Betsabé Grimalda Blas Achic
This work is licensed under a Creative Commons Attribution 4.0 International License.