Optimal auxiliary-covariate-based two-phase sampling design for semiparametric efficient estimation of a mean or mean difference, with application to clinical trials
- Authors
- Gilbert, Peter B.; Yu, Xuesong; Rotnitzky, Andrea Gloria
- Publication year
- 2014
- Language
- English
- Resource type
- article
- Status
- published version
- Description
- To address the objective in a clinical trial to estimate the mean or mean difference of an expensive endpoint Y, one approach employs a two-phase sampling design, wherein inexpensive auxiliary variables W predictive of Y are measured in everyone, Y is measured in a random sample, and the semiparametric efficient estimator is applied. This approach is made efficient by specifying the phase two selection probabilities as optimal functions of the auxiliary variables and measurement costs. While this approach is familiar to survey samplers, it apparently has seldom been used in clinical trials, and several novel results practicable for clinical trials are developed. We perform simulations to identify settings where the optimal approach significantly improves efficiency compared to approaches in current practice. We provide proofs and R code. The optimality results are developed to design an HIV vaccine trial, with the objective of comparing the mean 'importance-weighted' breadth (Y) of the T-cell response between randomized vaccine groups. The trial collects an auxiliary response (W) highly predictive of Y and measures Y in the optimal subset. We show that the optimal design-estimation approach can confer anywhere between absent and large efficiency gain (up to 24% in the examples) compared to the approach with the same efficient estimator but simple random sampling, where greater variability in the cost-standardized conditional variance of Y given W yields greater efficiency gains. Accurate estimation of E[Y|W] is important for realizing the efficiency gain, which is aided by an ample phase two sample and by using a robust fitting method.
Affiliation: Gilbert, Peter B. Fred Hutchinson Cancer Research Center; United States. University of Washington; United States
Affiliation: Yu, Xuesong. Fred Hutchinson Cancer Research Center; United States
Affiliation: Rotnitzky, Andrea Gloria. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Torcuato Di Tella; Argentina. Harvard University. Harvard School of Public Health; United States
- Subject
- AUGMENTED INVERSE PROBABILITY WEIGHTING; EFFICIENT ESTIMATION; EFFICIENT SAMPLING; MISSING DATA; SEMIPARAMETRIC MODEL; TWO-PHASE SAMPLING
- Access level
- open access
- Terms of use
- https://creativecommons.org/licenses/by-nc-sa/2.5/ar/
- Repository
- CONICET Digital (CONICET)
- Institution
- Consejo Nacional de Investigaciones Científicas y Técnicas
- OAI identifier
- oai:ri.conicet.gov.ar:11336/89356
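The design-estimation approach summarized in the description can be illustrated with a minimal sketch. It is not the authors' R code. It assumes a discrete auxiliary variable W, and it assumes the classical Neyman-type form in which the phase-two selection probability is proportional to the square root of the cost-standardized conditional variance, sqrt(Var(Y|W) / cost(W)), scaled to an expected per-subject budget. The function names `optimal_probs`, `aipw_mean`, and `simulate`, and all numeric settings, are hypothetical.

```python
import math
import random


def optimal_probs(cond_var, cost, p_w, budget):
    """Neyman-type phase-two selection probabilities (assumed form).

    pi(w) is proportional to sqrt(cond_var[w] / cost[w]) and is scaled so that
    the expected per-subject phase-two cost, sum_w p_w[w] * pi(w) * cost[w],
    equals `budget`. Values are capped at 1 without re-solving, which is a
    simplification: after capping, the budget is no longer met exactly.
    """
    raw = {w: math.sqrt(cond_var[w] / cost[w]) for w in cond_var}
    expected_cost = sum(p_w[w] * raw[w] * cost[w] for w in raw)
    scale = budget / expected_cost
    return {w: min(1.0, scale * raw[w]) for w in raw}


def aipw_mean(w, y, r, pi, m):
    """Augmented inverse probability weighted estimate of E[Y].

    Averages m(W) + R / pi(W) * (Y - m(W)) over all phase-one subjects,
    where R indicates phase-two selection and m(W) estimates E[Y | W].
    Y is only read for subjects with R true.
    """
    total = 0.0
    for wi, yi, ri in zip(w, y, r):
        total += m[wi]
        if ri:
            total += (yi - m[wi]) / pi[wi]
    return total / len(w)


def simulate(n=2000, seed=0):
    """Toy two-stratum trial arm; all settings are illustrative assumptions."""
    rng = random.Random(seed)
    p_w = {0: 0.5, 1: 0.5}       # distribution of W
    mean_y = {0: 1.0, 1: 3.0}    # E[Y | W]
    sd_y = {0: 0.5, 1: 2.0}      # SD(Y | W)
    cost = {0: 1.0, 1: 1.0}      # cost of measuring Y, by stratum
    cond_var = {k: sd_y[k] ** 2 for k in sd_y}
    pi = optimal_probs(cond_var, cost, p_w, budget=0.3)

    # Phase one: W for everyone. Phase two: Y observed only where r is True.
    w = [0 if rng.random() < p_w[0] else 1 for _ in range(n)]
    y = [rng.gauss(mean_y[wi], sd_y[wi]) for wi in w]
    r = [rng.random() < pi[wi] for wi in w]

    # Estimate E[Y | W] by stratum means among phase-two subjects.
    m = {}
    for k in p_w:
        vals = [yi for wi, yi, ri in zip(w, y, r) if ri and wi == k]
        m[k] = sum(vals) / len(vals)
    return pi, aipw_mean(w, y, r, pi, m)


if __name__ == "__main__":
    pi, estimate = simulate()
    print(pi, estimate)
```

In this toy setting the higher-variance stratum is sampled four times as often as the lower-variance one, because its conditional standard deviation is four times larger at equal cost. A mean difference between two randomized groups could be sketched by applying the same steps within each group and subtracting the two estimates.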
id |
CONICETDig_0e7a591917f4e086850f7f703e9edaee |
---|---|
oai_identifier_str |
oai:ri.conicet.gov.ar:11336/89356 |
network_acronym_str |
CONICETDig |
repository_id_str |
3498 |
network_name_str |
CONICET Digital (CONICET) |
dc.title.none.fl_str_mv |
Optimal auxiliary-covariate-based two-phase sampling design for semiparametric efficient estimation of a mean or mean difference, with application to clinical trials |
dc.creator.none.fl_str_mv |
Gilbert, Peter B. Yu, Xuesong Rotnitzky, Andrea Gloria |
dc.subject.none.fl_str_mv |
AUGMENTED INVERSE PROBABILITY WEIGHTING EFFICIENT ESTIMATION EFFICIENT SAMPLING MISSING DATA SEMIPARAMETRIC MODEL TWO-PHASE SAMPLING |
purl_subject.fl_str_mv |
https://purl.org/becyt/ford/1.1 https://purl.org/becyt/ford/1 |
dc.date.none.fl_str_mv |
2014-03 |
dc.type.none.fl_str_mv |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion http://purl.org/coar/resource_type/c_6501 info:ar-repo/semantics/articulo |
dc.identifier.none.fl_str_mv |
http://hdl.handle.net/11336/89356 Gilbert, Peter B.; Yu, Xuesong; Rotnitzky, Andrea Gloria; Optimal auxiliary-covariate-based two-phase sampling design for semiparametric efficient estimation of a mean or mean difference, with application to clinical trials; John Wiley & Sons Ltd; Statistics In Medicine; 33; 6; 3-2014; 901-917 0277-6715 CONICET Digital CONICET |
dc.language.none.fl_str_mv |
eng |
dc.relation.none.fl_str_mv |
info:eu-repo/semantics/altIdentifier/doi/10.1002/sim.6006 info:eu-repo/semantics/altIdentifier/url/https://onlinelibrary.wiley.com/doi/abs/10.1002/sim.6006 |
dc.rights.none.fl_str_mv |
info:eu-repo/semantics/openAccess https://creativecommons.org/licenses/by-nc-sa/2.5/ar/ |
dc.format.none.fl_str_mv |
application/pdf application/pdf |
dc.publisher.none.fl_str_mv |
John Wiley & Sons Ltd |
dc.source.none.fl_str_mv |
reponame:CONICET Digital (CONICET) instname:Consejo Nacional de Investigaciones Científicas y Técnicas |
repository.mail.fl_str_mv |
dasensio@conicet.gov.ar; lcarlino@conicet.gov.ar |