publications
conferences
2024
- Adapting Large Language Models for Underrepresented Languages. Eliseo Bao, Anxo Pérez, and Javier Parapar. In VII Congreso XoveTIC: impulsando el talento científico, 2024
journals
2024
- Explainable depression symptom detection in social media. Eliseo Bao, Anxo Pérez, and Javier Parapar. Health Information Science and Systems, Sep 2024
Users of social platforms often perceive these sites as supportive spaces to post about their mental health issues. Those conversations contain important traces about individuals’ health risks. Recently, researchers have exploited this online information to construct mental health detection models, which aim to identify users at risk on platforms like Twitter, Reddit or Facebook. Most of these models focus on achieving good classification results, ignoring the explainability and interpretability of their decisions. Recent research has pointed out the importance of using clinical markers, such as symptoms, to improve health professionals’ trust in computational models. In this paper, we propose using transformer-based architectures to detect and explain the appearance of depressive symptom markers in users’ writings. We present two approaches: i) training one model to classify and a separate one to explain the classifier’s decision, and ii) unifying the two tasks in a single model. Additionally, for the latter approach, we also investigate the performance of recent conversational LLMs when using in-context learning. Our natural language explanations enable clinicians to interpret the models’ decisions based on validated symptoms, enhancing trust in the automated process. We evaluate our approach using recent symptom-based datasets, employing both offline and expert-in-the-loop metrics to assess the quality of the explanations generated by our models. The experimental results show that it is possible to achieve good classification results while generating interpretable symptom-based explanations.
theses
2024
- M.Sc. Thesis. MindWell: an open-source chatbot to assist in the detection and monitoring of depressive disorders. Eliseo Bao. Feb 2024
Social media users often perceive these platforms as supportive spaces in which to share, discuss and disclose their daily problems, and thus their activity can provide clues to their mental health status. Research on Information Retrieval (IR), Natural Language Processing (NLP) and Machine Learning (ML) has recently used this online information to develop screening models that aim to identify at-risk individuals on platforms such as Twitter, Reddit or Facebook. Recent research has highlighted the importance of using clinical markers, such as validated symptoms, to improve health professionals’ confidence in computational models. This work presents an open-source chatbot designed as an assistant that provides explanations, aligned with validated clinical markers, for the presence of depressive symptoms in social media posts, taking into account the temporality of these symptoms. The aim is to develop a tool that includes the functionality needed to provide these benefits. Following this approach, it is possible to provide professionals with a support tool that relieves them of the tedious and time-consuming task of manually reviewing a subject’s posting history. We evaluated our proposal using expert knowledge to measure the quality of the chatbot’s explanations and their applicability to a real clinical setting, and the results demonstrated the usefulness of the system for generating analyses based on the subject’s feelings.
2022
- B.Sc. Thesis. Ranking of Reddit users using Relevance Models for depressive disorders. Eliseo Bao. Jul 2022
Depressive disorders are one of the most common groups of illnesses in the world. Although effective treatments exist, a lack of resources or the stigma still associated with these disorders means that in many cases the consequences for those suffering from them are devastating. Since the language used by people suffering from these disorders can reveal evidence about their mental health, the aim of this project is to exploit Relevance-Based Language Models for early detection. Specifically, taking the CLEF eRisk collections as a starting point, the goal is to build depression vocabularies. These vocabularies identify terms of weight and relevance in people with depressive tendencies, and must be evaluated and compared against other validated lexicons. In addition, we focus on ranking, i.e., ordering a set of people, based on the texts they have written, according to their possible degree of depression. The project was managed with an agile methodology, which made it possible to adapt it according to the experimental results. Satisfactory results have been achieved, especially in terms of ranking, and new avenues for experimentation and expansion have been opened.