Comparative Analysis of Transformer-Based Language Models for Text Analysis in the Domain of Sustainable Development

dc.contributor.advisor: Marina G Erechtchoukova
dc.contributor.author: Nabil Safwat
dc.date.accessioned: 2023-08-04T15:16:34Z
dc.date.available: 2023-08-04T15:16:34Z
dc.date.issued: 2023-08-04
dc.date.updated: 2023-08-04T15:16:33Z
dc.degree.discipline: Information Systems and Technology
dc.degree.level: Master's
dc.degree.name: MA - Master of Arts
dc.description.abstract: With advancements in Artificial Intelligence, Natural Language Processing (NLP) has gained a lot of attention because of its potential to facilitate complex human-machine interactions, enhance language-based applications, and automate the processing of unstructured texts. This study investigates the transfer learning approach for Transformer-based language models, the abstractive text summarization approach, and their application to the domain of Sustainable Development, with the goal of determining how the SDGs are represented in scientific publications using the text summarization technique. To achieve this, the traditional transfer learning framework was expanded so that: (1) the relevance of textual documents to a specified text can be evaluated, (2) two neural language models, BART and T5, were selected, and (3) eight text similarity measures were investigated to identify the most informative ones. Both the BART and T5 models were fine-tuned on an acquired domain-specific corpus of scientific publications extracted from the Scopus Elsevier database. The relevance of recently published works to an SDG was determined by calculating semantic similarity scores between each model-generated summary and the SDG's description. The proposed framework made it possible to identify the goals that dominated the developed corpus and those that require further attention from the research community.
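The relevance-scoring step described in the abstract can be sketched as follows. This is a minimal illustration, not the thesis's actual pipeline: it assumes bag-of-words cosine similarity as a stand-in for the eight similarity measures studied, and the `most_relevant_sdg` helper and example SDG descriptions are hypothetical.

```python
from collections import Counter
import math

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between two texts over bag-of-words counts.

    A simplified stand-in for the semantic similarity measures used in
    the thesis; real pipelines would compare neural sentence embeddings.
    """
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[w] * vb[w] for w in set(va) & set(vb))
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def most_relevant_sdg(summary: str, sdg_descriptions: dict) -> str:
    """Return the SDG whose description scores highest against a
    model-generated summary (hypothetical helper name)."""
    return max(sdg_descriptions,
               key=lambda g: cosine_similarity(summary, sdg_descriptions[g]))

# Example with invented, abbreviated SDG descriptions:
sdgs = {
    "SDG 6": "ensure clean water and sanitation for all",
    "SDG 13": "take urgent action to combat climate change",
}
best = most_relevant_sdg("access to clean water in rural areas", sdgs)
```

Scoring every publication's summary against each SDG description in this way yields the per-goal relevance counts used to identify dominant and under-represented goals in the corpus.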
dc.identifier.uri: https://hdl.handle.net/10315/41364
dc.language: en
dc.rights: Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.
dc.subject: Information science
dc.subject: Artificial intelligence
dc.subject: Environmental studies
dc.subject.keywords: transformer-based language models
dc.subject.keywords: transfer learning
dc.subject.keywords: semantic similarity
dc.subject.keywords: abstractive text summarization
dc.subject.keywords: sustainable development
dc.subject.keywords: document relevance
dc.title: Comparative Analysis of Transformer-Based Language Models for Text Analysis in the Domain of Sustainable Development
dc.type: Electronic Thesis or Dissertation

Files

Original bundle
Name: Safwat_Nabil_2023_Masters.pdf
Size: 4.83 MB
Format: Adobe Portable Document Format