Inference of Socioeconomic Status in a Communication Graph

Authors
Fixman, Martín; Berenstein, Ariel; Brea, Jorge; Minnoni, Martín; Sarraute, Carlos
Publication year
2016
Language
English
Resource type
conference paper
Status
published version
Description
In this work, we examine the socio-economic correlations present among users in a mobile phone network in Mexico. First, we find that the distribution of income for a subset of users –for which we have income information given by a large bank in Mexico– follows closely, but not exactly, the income distribution for the whole population of Mexico. We also show the existence of a strong socio-economic homophily in the mobile phone network, where users linked in the network are more likely to have similar income. The main contribution of this work is that we leverage this homophily in order to propose a methodology, based on Bayesian statistics, to infer the socio-economic status for a large subset of users in the network (for which we have no banking information). With our proposed algorithm, we achieve an accuracy of 0.71 in a two-class classification problem (low and high income) which significantly outperforms a simpler method based on a frequentist approach. Finally, we extend the two-class classification problem to multiple classes by using the Dirichlet distribution.
Sociedad Argentina de Informática e Investigación Operativa (SADIO)
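The abstract names the ingredients of the method (homophily, Bayesian statistics, a two-class problem, and a Dirichlet extension to multiple classes) but does not spell out the model. As a minimal sketch only, assuming each unlabeled user is scored from the income classes of their labeled contacts, with a Beta prior in the two-class case and a symmetric Dirichlet prior in the multi-class case, the idea could look like this. The function names, the uniform priors, and the 0.5 threshold are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter


def beta_posterior_high(contact_labels, alpha=1.0, beta=1.0):
    """Posterior mean of P(high income) under a Beta(alpha, beta) prior.

    contact_labels holds the known classes ("high" or "low") of a user's
    contacts. The prior parameters are assumptions made for this sketch.
    """
    high = sum(1 for label in contact_labels if label == "high")
    low = sum(1 for label in contact_labels if label == "low")
    return (alpha + high) / (alpha + beta + high + low)


def classify_two_class(contact_labels, threshold=0.5):
    """Assign "high" when the posterior mean exceeds the (assumed) threshold."""
    if beta_posterior_high(contact_labels) > threshold:
        return "high"
    return "low"


def dirichlet_posterior(contact_labels, classes, prior=1.0):
    """Posterior mean class probabilities under a symmetric Dirichlet prior.

    Generalizes the Beta case: each class starts with `prior` pseudo-counts
    and gains one count per labeled contact in that class.
    """
    counts = Counter(contact_labels)
    total = sum(counts[c] for c in classes) + prior * len(classes)
    return {c: (counts[c] + prior) / total for c in classes}
```

With two "high" contacts and one "low" contact, the uniform Beta prior gives a posterior mean of (1 + 2) / (2 + 3) = 0.6, so the user would be classified as "high". A user with no labeled contacts falls back to the prior mean of 0.5 and is classified as "low" under this threshold.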
Subject
Computer Science
mobile phone network
socio-economic correlations
Access level
open access
Terms of use
http://creativecommons.org/licenses/by-sa/3.0/
Repository
SEDICI (UNLP)
Institution
Universidad Nacional de La Plata
OAI identifier
oai:sedici.unlp.edu.ar:10915/56824

Publication date
2016-09
Pages
95-106
Format
application/pdf
Handle
http://sedici.unlp.edu.ar/handle/10915/56824
Alternate URL
http://45jaiio.sadio.org.ar/sites/default/files/AGRANDA-09.pdf
ISSN
2451-7569
License
Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)