About thistleandtweed

thistleandtweed · ‎08-14-2024

Hello experts. I am trying to import the weights of RoBERTa from Hugging Face into SAS Viya, but importing the model via sasctl might take up a lot of time and space. I was thinking of registering the prompts within sasctl, and then creating SAS score code that pulls RoBERTa from Hugging Face through an API that I would host in Python. Would it be possible to do the above, and then register RoBERTa as a model in Model Manager and/or use it in Model Studio as part of my pipeline? Has anyone done this before? Please advise. Thank you!

thistleandtweed · ‎07-15-2024

Hi Harry, thank you for the comprehensive reply. However, I am still facing the same issue as before, where I get an "unable to connect to <cas-host> at <port-number>" error. I tried contacting the person in charge of my account, but they were not able to help much. Is there any other way I can import the CSV file into SAS VA programmatically without having to use swat? I tried sasctl, but I believe that is meant more for Model Manager.

thistleandtweed · ‎07-09-2024

Hi Experts, I currently have a Python ML classification model that can run as a REST API. The idea is to take the labelled output dataset from the ML model and send it to SAS Visual Analytics for visualization and further data exploration in the next phase of the project. However, most of the information online about how to do something like this is quite outdated (Python 3.6-ish), and I'm quite lost on how I could implement it. Could anyone provide guidance on how I can submit the data to SAS Visual Analytics, or share any relevant links or documentation that could assist me in accomplishing this task? Thank you
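
As a minimal sketch of the first half of that workflow, the snippet below turns a JSON response from the classification API into CSV text that could then be loaded into SAS. The field names (`id`, `text`, `label`), the sample payload, and the helper `labelled_json_to_csv` are hypothetical and not taken from the post; the upload step itself is left out because the post does not say which SAS interface is available.

```python
import csv
import io
import json


def labelled_json_to_csv(payload, columns):
    """Convert a JSON array of labelled records into CSV text.

    `payload` is a JSON string such as a REST API might return;
    `columns` fixes the column order and drops any extra fields.
    """
    records = json.loads(payload)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",   # silently skip fields not listed in `columns`
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical API response, for illustration only.
sample_payload = (
    '[{"id": 1, "text": "refund request", "label": "billing"},'
    ' {"id": 2, "text": "reset password", "label": "account"}]'
)
csv_text = labelled_json_to_csv(sample_payload, ["id", "text", "label"])
print(csv_text)
```

Keeping the conversion separate from the upload means the same CSV text can be written to disk or handed to whichever loading route turns out to work.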

thistleandtweed · ‎06-20-2024

Hi! I have managed to come up with a model pipeline in Python for clustering text using DBSCAN, and I wish to import this model into SAS Model Manager through SASCTL for further analysis of the clusters using SAS topic modelling methods (can't use OS for this step). Most of the examples I see online use SASCTL to import supervised learning models. However, my output in this case is a set of cluster labels, and the number of clusters varies widely from one run of the model to the next. So far, I have converted my DBSCAN model into a pickle file, and I have created a score file for my DBSCAN model (which basically uses the silhouette score and the Davies-Bouldin score to evaluate the quality of the clustering). I have also created JSON files for my input variables and output variables. Right now, I'm stuck at the part where I have to write my model properties and the metadata information into a JSON file, as my target values aren't binary. Does anyone have any examples or implementations of importing unsupervised clustering models into SAS Model Manager with SASCTL? It would be really helpful if someone could guide me on how to move on from this step. Thank you in advance!!

EDIT: I managed to import my code (score code + JSON files + pickle file) into Model Manager, but I get this error message for my score code: "The score code for the model could not be found. Details: The score code wrapper could not be generated for the model because the Python source code is not in the correct format."
This is my score code:

    %%writefile ./Python_DBSCAN/DBSCAN_score.py
    import numpy
    import pandas as pd
    import pickle
    import settings
    import spacy
    from sklearn.decomposition import TruncatedSVD
    from sklearn.feature_extraction.text import TfidfVectorizer
    import umap.umap_ as umap
    from sklearn.neighbors import NearestNeighbors
    from matplotlib import pyplot as plt
    from kneed import KneeLocator
    from sklearn.metrics import davies_bouldin_score
    from sklearn.cluster import DBSCAN
    from sklearn.preprocessing import StandardScaler
    from sklearn import metrics

    def computeScore(<my input variables, but im only using one of them for the clustering>):
        try:
            _thisModelFit
        except NameError:
            with open(settings.pickle_path + "/Python_DBSCAN.pickle", 'rb') as _pFile:
                _thisModelFit = pickle.load(_pFile)
        input_list = [[<list of my input variables>]]
        input_df = pd.DataFrame(input_list, columns=[<my input variables>])
        # make pred
        proba = metrics.davies_bouldin_score(_thisModelFit.X_scale, _thisModelFit.labels_)
        return proba
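
For readers unfamiliar with the metric returned by the score code above, here is a rough pure-Python sketch of the Davies-Bouldin index: the average, over clusters, of the worst-case ratio of combined within-cluster scatter to centroid distance (lower is better). The function name `davies_bouldin` and the sample points are illustrative only. Skipping DBSCAN noise points (label -1) is an assumption of this sketch and not something the original score code does.

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin index for labelled points (lower means better separation)."""
    clusters = {}
    for point, label in zip(points, labels):
        if label == -1:          # assumption: ignore DBSCAN noise points
            continue
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    # Centroid of each cluster: coordinate-wise mean of its points.
    centroids = {
        label: tuple(sum(coord) / len(pts) for coord in zip(*pts))
        for label, pts in clusters.items()
    }
    # Scatter: mean distance from each point to its own centroid.
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in pts) / len(pts)
        for label, pts in clusters.items()
    }

    total = 0.0
    for i in clusters:
        # Worst (largest) similarity ratio between cluster i and any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Two tight clusters ten units apart, plus one noise point.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0), (50.0, 50.0)]
labels = [0, 0, 1, 1, -1]
db_index = davies_bouldin(points, labels)
print(db_index)
```

Note that this yields one number for the whole clustering, not one value per scored row.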

thistleandtweed · ‎06-11-2024

Hi! I wanted to ask if anyone has managed to perform silhouette analysis in SAS for clustering results before. I had to group emails together in SAS code based on some predefined conditions, and I did that by conducting dimensionality reduction with LDA and then clustering the topic vectors with k-means clustering. I wish to evaluate the quality of the clustering through silhouette analysis so that I can tweak any hyperparameters if needed. However, I can't find much SAS code online that implements it. I was wondering if anyone had any links or guidance on how I can go about tackling it? Thanks in advance!
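
As a language-neutral reference for what silhouette analysis computes, the sketch below is written in Python, not SAS. For each point i, a(i) is the mean distance to the other points in its own cluster, b(i) is the smallest mean distance to the points of any other cluster, and s(i) = (b(i) - a(i)) / max(a(i), b(i)). The function name `silhouette_scores` and the sample data are illustrative, and assigning 0 to singleton clusters is a common convention assumed here.

```python
import math


def silhouette_scores(points, labels):
    """Return the silhouette coefficient s(i) for every point."""
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)   # convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]
scores = silhouette_scores(points, labels)
mean_silhouette = sum(scores) / len(scores)
print(mean_silhouette)
```

Values near 1 indicate well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest points assigned to the wrong cluster. The pairwise distances make this quadratic in the number of points, so a sample may be needed for large email corpora.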

thistleandtweed · ‎06-06-2024

ok, thank you for the reply!

thistleandtweed · ‎06-04-2024

Hello, I'm dealing with unstructured text data, and I need to conduct unsupervised multi-class classification of it. I managed to create a term-by-document matrix of my corpus by using PROC TEXTMINE with singular value decomposition (SVD). My approach to classifying this data is to conduct k-means clustering and then analyse the clusters to segregate the text into pre-defined topics automatically. However, after viewing the score table of PROC FASTCLUS, I am a bit lost as to how to continue with my evaluation of the results. For reference, this is what my summary table for PROC FASTCLUS looks like right now. Do let me know if I have to provide more information about my table. Sorry if it's a basic question, and thanks in advance!

thistleandtweed · ‎05-30-2024

Hi everyone! I'm currently learning SAS programming, and I wanted to embark on my own project for now. I have access to SAS Viya, so I was thinking of conducting unsupervised classification of emails (multi-class classification) through VDMML and VTA. I was thinking of running the text through VTA, extracting the score code from the categories node, and then processing this data for use in VDMML to train a classification model. However, I'm not sure what kind of pipeline would be suitable for this approach, as most of the current pipelines seem to cater to supervised learning. Any help in this area would be appreciated. Apologies if this is a very basic question, and thank

Online Status	Offline
Date Last Visited	‎10-08-2024 05:26 AM

Transferring RoBERTa model weights into SAS Model Manager or Model Stu...

Re: Calling Python REST API to import data into SAS Visual Analytics

Calling Python REST API to import data into SAS Visual Analytics

Importing Python Clustering Models into SAS through SASCTL and Model M...

Silhouette Analysis in SAS for unsupervised learning

Re: Examining results of PROC FASTCLUS

Examining results of PROC FASTCLUS

Classifying emails with SAS VTA and VDMML
