Using Transformers in SAS® Viya® to Predict Keywords From the Description of Movies and Series


Text-based deep learning has become a focal point, driven by the growing emphasis on generative AI. In this context, transfer learning is crucial because it harnesses pre-trained transformer models such as BERT. These NLP models are smaller and more cost-efficient than large language models, which makes them deployable within existing customer infrastructure. We analyze a real business case within one of Telenor Norway's largest divisions.

The essence of the task is to read text, interpret it, and classify it accordingly. Unlike the traditional classification problems previously demonstrated in SAS, this problem involves multi-class classification. In SAS Viya, we implement the solution with DLPy, the SAS deep learning Python library, and its BERT model. The model outperforms humans, achieving an F1 score of 80% for over 80% of the tagging classes (430). We also compare the model's performance against other SAS models, such as Boolean rules. Extracting key insights requires text preprocessing, and we demonstrate how match codes in SAS Viya make this easy to implement. We also address various data quality issues, including duplicate entries. This paper serves as a blueprint for leveraging transformers and customizing layers. The model is transferable to any text-related use case when combined with SAS Viya's built-in preprocessing capabilities.
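As an illustration of how a result like "an F1 score of 80% for over 80% of the classes" can be read, the sketch below computes a per-class F1 score and then the share of classes that reach a threshold. It is a minimal, plain-Python example and not the evaluation code used in the paper. The tag names, the sample predictions, the helper function names, and the assumption of one tag per description are all hypothetical.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for single-label predictions (an assumption here)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    scores = {}
    for label in set(y_true) | set(y_pred):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def share_of_classes_at_or_above(scores, threshold=0.8):
    """Fraction of classes whose F1 is at least `threshold`."""
    return sum(score >= threshold for score in scores.values()) / len(scores)


# Hypothetical tags for seven descriptions (illustrative data only).
y_true = ["comedy", "comedy", "drama", "drama", "crime", "crime", "crime"]
y_pred = ["comedy", "comedy", "drama", "crime", "crime", "crime", "crime"]

scores = per_class_f1(y_true, y_pred)
share = share_of_classes_at_or_above(scores, threshold=0.8)
print(scores)
print(f"Share of classes with F1 >= 0.8: {share:.2f}")
```

Reporting the share of classes above a threshold, rather than a single averaged F1, shows how many individual tags are predicted reliably, which matters when there are hundreds of classes of very different sizes.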

