Enhancing The Recommendation of High-impact Rare-event Business News for Professionals with LLM-based Augmentation

Authors: Bivort Haiek, Felipe; Ankolekar, Anupriya
Publication year: 2025
Language: English
Resource type: conference paper
Status: published version
Description: Personalized news recommendation has become an essential tool for professionals around the world to keep track of news events matching their interests and alleviate information overload. Beyond personalization, an essential aspect of useful news recommendations for professional use is that they highlight events that are more significant and of higher impact. However, we find that state-of-the-art recommenders struggle to identify and recommend news about significant events. In this paper, we address this gap as follows. To mitigate the relative scarcity of news about significant events, we use an LLM to create a synthetic dataset of significant news seeded from business-relevant news in the MIND dataset. We train four state-of-the-art recommendation models (MINER, UNBERT, UniTRec, Fastformer) with synthetically enhanced versions of a subset of the MIND dataset. We find that this successfully improves the performance of two of the recommendation models on the MIND-large dataset restricted to news about significant events in terms of the MRR, NDCG@5 and Hit@5 metrics, and the performance of UNBERT on the AUC metric. The contribution of this paper is three-fold: we highlight news significance as an important aspect of useful news recommendation; we demonstrate the use of generative LLMs to create synthetic datasets for training on rare data; and lastly, we demonstrate that augmenting some recommendation models with more significant news improves news recommendation performance on the MIND dataset.
Sociedad Argentina de Informática e Investigación Operativa
Subject: Computer Science
Recommender Systems
LLMs
Augmentation
Access level: open access
Terms of use: http://creativecommons.org/licenses/by-nc-sa/4.0/
Repository
Institution: Universidad Nacional de La Plata
OAI identifier: oai:sedici.unlp.edu.ar:10915/190557


id	SEDICI_0f305a9423b0dc0fe897fe87d9d7be53
oai_identifier_str	oai:sedici.unlp.edu.ar:10915/190557
network_acronym_str	SEDICI
repository_id_str	1329
network_name_str	SEDICI (UNLP)
dc.title.none.fl_str_mv	Enhancing The Recommendation of High-impact Rare-event Business News for Professionals with LLM-based Augmentation Mejorando la recomendación de noticias de negocios sobre eventos raros de alto impacto vía aumentación por LLMs
title	Enhancing The Recommendation of High-impact Rare-event Business News for Professionals with LLM-based Augmentation
dc.creator.none.fl_str_mv	Bivort Haiek, Felipe Ankolekar, Anupriya
author	Bivort Haiek, Felipe
author_facet	Bivort Haiek, Felipe Ankolekar, Anupriya
author_role	author
author2	Ankolekar, Anupriya
author2_role	author
dc.subject.none.fl_str_mv	Ciencias Informáticas Recommender Systems LLMs Augmentation Sistemas de recomendación Aumentación
topic	Ciencias Informáticas Recommender Systems LLMs Augmentation Sistemas de recomendación Aumentación
publishDate	2025
dc.date.none.fl_str_mv	2025-08
dc.type.none.fl_str_mv	info:eu-repo/semantics/conferenceObject info:eu-repo/semantics/publishedVersion Objeto de conferencia http://purl.org/coar/resource_type/c_5794 info:ar-repo/semantics/documentoDeConferencia
format	conferenceObject
status_str	publishedVersion
dc.identifier.none.fl_str_mv	http://sedici.unlp.edu.ar/handle/10915/190557
url	http://sedici.unlp.edu.ar/handle/10915/190557
dc.language.none.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	info:eu-repo/semantics/altIdentifier/url/https://revistas.unlp.edu.ar/JAIIO/article/view/19779 info:eu-repo/semantics/altIdentifier/issn/2451-7496
dc.rights.none.fl_str_mv	info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by-nc-sa/4.0/ Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
eu_rights_str_mv	openAccess
rights_invalid_str_mv	http://creativecommons.org/licenses/by-nc-sa/4.0/ Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.format.none.fl_str_mv	application/pdf 156-170
dc.source.none.fl_str_mv	reponame:SEDICI (UNLP) instname:Universidad Nacional de La Plata instacron:UNLP
reponame_str	SEDICI (UNLP)
collection	SEDICI (UNLP)
instname_str	Universidad Nacional de La Plata
instacron_str	UNLP
institution	UNLP
repository.name.fl_str_mv	SEDICI (UNLP) - Universidad Nacional de La Plata
repository.mail.fl_str_mv	alira@sedici.unlp.edu.ar
