Optimal auxiliary-covariate-based two-phase sampling design for semiparametric efficient estimation of a mean or mean difference, with application to clinical trials

Authors
Gilbert, Peter B.; Yu, Xuesong; Rotnitzky, Andrea Gloria
Publication year
2014
Language
English
Resource type
article
Status
published version
Description
To address the objective in a clinical trial of estimating the mean or mean difference of an expensive endpoint Y, one approach employs a two-phase sampling design, wherein inexpensive auxiliary variables W predictive of Y are measured in everyone, Y is measured in a random sample, and the semiparametric efficient estimator is applied. This approach is made efficient by specifying the phase two selection probabilities as optimal functions of the auxiliary variables and measurement costs. While this approach is familiar to survey samplers, it apparently has seldom been used in clinical trials, and several novel results practicable for clinical trials are developed. We perform simulations to identify settings where the optimal approach significantly improves efficiency compared to approaches in current practice. We provide proofs and R code. The optimality results are developed to design an HIV vaccine trial, with the objective of comparing the mean 'importance-weighted' breadth (Y) of the T-cell response between randomized vaccine groups. The trial collects an auxiliary response (W) highly predictive of Y and measures Y in the optimal subset. We show that the optimal design-estimation approach can confer anything from no efficiency gain to a large one (up to 24% in the examples) compared to the approach with the same efficient estimator but simple random sampling, where greater variability in the cost-standardized conditional variance of Y given W yields greater efficiency gains. Accurate estimation of E[Y|W] is important for realizing the efficiency gain, which is aided by an ample phase two sample and by using a robust fitting method.
Affiliation: Gilbert, Peter B. Fred Hutchinson Cancer Research Center; United States. University of Washington; United States
Affiliation: Yu, Xuesong. Fred Hutchinson Cancer Research Center; United States
Affiliation: Rotnitzky, Andrea Gloria. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Torcuato Di Tella; Argentina. Harvard University. Harvard School of Public Health; United States
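The abstract describes the design and the estimator but gives no formulas, and the authors' R code is not reproduced in this record. The following is only a minimal Python sketch of the general idea, under stated assumptions: selection probabilities are taken proportional to the conditional standard deviation of Y given W divided by the square root of the per-subject measurement cost (the standard Lagrangian solution for minimizing E[Var(Y|W)/pi(W)] under an expected-cost budget), and the estimator is the usual augmented inverse probability weighted (AIPW) mean. The function names `selection_probs` and `aipw_mean`, the `budget` parameter, and the toy inputs are hypothetical and do not come from the paper.

```python
import math


def selection_probs(cond_sd, cost, budget):
    """Phase-two selection probabilities (illustrative, not the paper's code).

    Assumes pi_i is proportional to cond_sd_i / sqrt(cost_i), scaled so the
    expected total measurement cost sum(pi_i * cost_i) equals `budget`.
    Probabilities are capped at 1; if capping occurs, the expected cost
    falls below the budget (this sketch does not redistribute the excess).
    """
    raw = [s / math.sqrt(c) for s, c in zip(cond_sd, cost)]
    scale = budget / sum(r * c for r, c in zip(raw, cost))
    return [min(1.0, scale * r) for r in raw]


def aipw_mean(w, y, selected, pi, m_hat):
    """AIPW estimate of E[Y] under two-phase sampling.

    w        : auxiliary values, observed for everyone
    y        : endpoint values; only read where selected[i] is True
    selected : phase-two selection indicators
    pi       : selection probabilities used by the design
    m_hat    : fitted regression function approximating E[Y | W]
    """
    total = 0.0
    for wi, yi, ri, pii in zip(w, y, selected, pi):
        pred = m_hat(wi)
        if ri:
            # Prediction plus inverse-probability-weighted residual.
            total += pred + (yi - pred) / pii
        else:
            # Unselected subjects contribute the prediction only.
            total += pred
    return total / len(w)


# Toy design: two strata of W with conditional SDs 1 and 3, unit costs,
# and a budget of 2 expected endpoint measurements among 4 subjects.
pi = selection_probs(cond_sd=[1.0, 1.0, 3.0, 3.0], cost=[1.0] * 4, budget=2.0)
print(pi)  # [0.25, 0.25, 0.75, 0.75]
```

In this toy design the stratum with three times the conditional standard deviation is sampled three times as often, which is the mechanism by which unequal cost-standardized conditional variances translate into efficiency gains over simple random sampling.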
Subject
AUGMENTED INVERSE PROBABILITY WEIGHTING
EFFICIENT ESTIMATION
EFFICIENT SAMPLING
MISSING DATA
SEMIPARAMETRIC MODEL
TWO-PHASE SAMPLING
Access level
open access
Terms of use
https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
Repository
CONICET Digital (CONICET)
Institution
Consejo Nacional de Investigaciones Científicas y Técnicas
OAI identifier
oai:ri.conicet.gov.ar:11336/89356

Subject classification
https://purl.org/becyt/ford/1.1
https://purl.org/becyt/ford/1
Publication date
2014-03
Handle
http://hdl.handle.net/11336/89356
Citation
Gilbert, Peter B.; Yu, Xuesong; Rotnitzky, Andrea Gloria; Optimal auxiliary-covariate-based two-phase sampling design for semiparametric efficient estimation of a mean or mean difference, with application to clinical trials; John Wiley & Sons Ltd; Statistics In Medicine; 33; 6; 3-2014; 901-917
ISSN
0277-6715
DOI
10.1002/sim.6006
Publisher URL
https://onlinelibrary.wiley.com/doi/abs/10.1002/sim.6006
Format
application/pdf
Publisher
John Wiley & Sons Ltd
Repository contact
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar