Training PyTorch deep learning models in Viya & dealing with multi dimensional tabular data - Part I

This is the first installment of a series focused on training custom PyTorch models in Viya. In this part, I will discuss how to train a custom PyTorch model, while the second part will cover the challenges of using multi-dimensional tabular data for these models.

Let me first introduce the use case. My area of focus is showcasing the value of Viya in the healthcare and life sciences industry, especially in deep learning. To that end, I created a proof of concept (POC) that showcases Viya's flexibility in training any PyTorch deep learning model. The POC centers on building a predictive model to detect the presence of a certain kind of infection in human cell samples. The cellular data is high-dimensional and obtained through advanced techniques such as CyTOF (Cytometry by Time-of-Flight). This process generates proteomics data, also referred to as protein signature data, a subtype of omics data that provides highly detailed cellular-level information.

Without delving into the technicalities of CyTOF, it allows us to generate protein signature data for human cell samples. For this use case, we downloaded data from ImmPort for approximately 450 patients, each with cell populations ranging from 50,000 to 500,000 cells. Each patient also has a label indicating whether the patient is infected. Predicting the presence of the infection requires capturing the complex relationships between a patient's cells, which calls for advanced modeling techniques. In our case, we use models based on the Set Transformer architecture, which inherently supports permutation invariance. This simply means that the model's predictions are unaffected by the order in which a patient's cells are presented.
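To make permutation invariance concrete, here is a minimal sketch in plain Python. It uses mean pooling over a set of per-cell feature vectors as a simple stand-in for the model; it is not the Set Transformer itself, and the function name `mean_pool` and the toy values are illustrative assumptions.

```python
import math
import random

def mean_pool(cells):
    """Average each feature over all cells; the result ignores cell order."""
    n_features = len(cells[0])
    # math.fsum returns a correctly rounded sum, so it does not depend on order
    return [math.fsum(cell[j] for cell in cells) / len(cells)
            for j in range(n_features)]

# One hypothetical patient: 5 cells, 3 features per cell
cells = [[0.1, 2.0, 0.5],
         [0.4, 1.5, 0.2],
         [0.9, 0.3, 0.7],
         [0.2, 0.8, 0.1],
         [0.6, 1.1, 0.9]]

# Present the same cells in a different order
shuffled = cells[:]
random.Random(0).shuffle(shuffled)

same = mean_pool(cells) == mean_pool(shuffled)
print(same)
```

A permutation-invariant model behaves the same way at the patient level: reordering the rows of a patient's cell table should not change the prediction.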

The Set Transformer architecture is not natively supported in Viya, which is where the ability to train custom PyTorch models becomes crucial. With the dlModelZoo action set in Viya, you can import and train any PyTorch-based model with minimal adjustments. Once trained, the model can also be deployed in SAS Model Studio. As stated above, this part focuses on the process of training the custom PyTorch model in Viya.

Training custom PyTorch models involves two key steps: coding the model in Python so that it is TorchScript-compatible, and then wrapping it in a class and torchscripting it with a function such as torch.jit.script or torch.jit.trace. Once the model is torchscripted and saved as a .pt file, you can use the dlModelZoo action set to train, score, and export the model as an Astore file.
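The overall flow can be sketched with a toy model before looking at the real one. The names `TinyModel` and `TinyWrapper` are hypothetical, and the simplified `score_one_batch` signature here is only for illustration; it is not the signature that Viya expects. The sketch saves to an in-memory buffer rather than a .pt file on disk.

```python
import io
from typing import List

import torch

class TinyModel(torch.nn.Module):
    """Step 1: a TorchScript-compatible base model (toy example)."""
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(3, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.linear(x))

class TinyWrapper(torch.nn.Module):
    """Step 2: a wrapper that exposes extra methods around the base model."""
    def __init__(self, base: torch.nn.Module):
        super().__init__()
        self.base = base

    # Methods other than forward must be exported to be compiled
    @torch.jit.export
    def score_one_batch(self, x: List[torch.Tensor]) -> torch.Tensor:
        return self.base(x[0]).squeeze(-1)

# Script the wrapper, save it, and load it back
scripted = torch.jit.script(TinyWrapper(TinyModel()))
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)
buffer.seek(0)
reloaded = torch.jit.load(buffer)

batch = torch.rand(4, 3)
probs = reloaded.score_one_batch([batch])
print(probs.shape)
```

The reloaded module no longer needs the Python class definitions, which is what allows it to be handed to another runtime.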

Let’s take a moment to understand torchscripting. In simple terms, torchscripting compiles the model into a serialized form that can run outside the Python environment in which it was defined. There are specific guidelines to follow when coding a model for torchscripting. For example, model classes must inherit only from torch.nn.Module (custom class inheritance is not supported, though there are some workarounds). Additionally, all variables in the code must have static types. Function arguments that are not tensors must be annotated explicitly, because TorchScript assumes a default type of torch.Tensor for unannotated arguments, which is not suitable for every case. You can refer to this link for further examples and details.
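The typing guideline can be illustrated with a small sketch. The module `ScaleRows` and its arguments are hypothetical and exist only to show the annotations; they are not part of the Set Transformer code.

```python
from typing import List

import torch

class ScaleRows(torch.nn.Module):
    # Without annotations, TorchScript would treat `factor` and `n_rows` as Tensors
    def forward(self, x: torch.Tensor, factor: float, n_rows: int) -> List[torch.Tensor]:
        # Annotate containers explicitly; an unannotated empty list is assumed to hold Tensors
        rows: List[torch.Tensor] = []
        for i in range(n_rows):
            rows.append(x[i] * factor)
        return rows

scripted = torch.jit.script(ScaleRows())
rows = scripted(torch.ones(4, 3), 2.0, 2)
print(len(rows))
```

Removing the `float` and `int` annotations would make the compiler expect tensors for those arguments, so the call above would no longer match the compiled signature.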

I will walk through this process using the Set Transformer model as the example. For the sake of brevity, I am not providing code for the individual components of the architecture (like ISAB and PMA). You can get the code for those components from the official GitHub repo for the Set Transformer architecture. The main model definition is as follows. Note that the model is a variant of the Set Transformer model defined in the repo.

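import torch
import torch.nn as nn
from torch import Tensor
from typing import List, Tuple

# ISAB and PMA are the building blocks from the official Set Transformer repo (not shown here)
class SetTransformer(torch.nn.Module):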
    def __init__(self, dim_input, num_outputs, dim_output,
            num_inds=32, dim_hidden=128, num_heads=4, ln=False):
        super(SetTransformer, self).__init__()
        self.enc = nn.Sequential(
                ISAB(dim_input, dim_hidden, num_heads, num_inds, ln=ln),)
        self.dec = nn.Sequential(
                PMA(dim_hidden, num_heads, num_outputs, ln=ln),
                nn.Linear(dim_hidden, dim_output))
        self.sigmoid = nn.Sigmoid()

    def forward(self, X):
        X = self.enc(X)
        X = self.dec(X)
        X = self.sigmoid(X)
        return X

You will notice that the class inherits from the torch.nn.Module class. Once you have defined the base model class, the next step is to define a wrapper class with all the functions needed for training and scoring in Viya. This is very similar to writing train and test functions in PyTorch. You can find the necessary function names and return types on the official documentation page here. The wrapper class definition is below. The two main functions are train_one_batch and score_one_batch, which take one batch of data and return values such as the loss, the metrics, the aggregated loss, and the aggregated metrics (for the epoch). The base model definition governs certain aspects of these functions, such as how the input data is processed and reshaped. However, the output that is returned remains independent of the base model.
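As a rough illustration of the reshaping mentioned above: each sample reaches the wrapper as one flat row, which is reshaped into a (batch, cells, features) tensor before it is passed to the base model. The sketch assumes 27 features per cell, as in the wrapper below, and 200 cells per row, which is inferred from the inputSize of 5400 used later in this post (5400 / 27 = 200). The batch size and the random data are purely illustrative.

```python
import torch

n_features = 27   # features per cell, as in the wrapper
n_cells = 200     # assumed: 5400 input columns / 27 features
batch_size = 4    # arbitrary for this sketch

# One flat row per sample, as it would arrive from a table with 5400 columns
flat = torch.rand(batch_size, n_cells * n_features)

# Same call as in the wrapper: infer the number of cells with -1
cells = flat.reshape(flat.shape[0], -1, n_features)

print(cells.shape)
```

The first 27 columns of a row become the first cell, the next 27 the second cell, and so on, so the column order in the table determines which values end up in which cell.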

# Create wrapper class

class SetTransformerModelWrapped (torch.nn.Module):
    def __init__(self, modelBase, loss_module=None):
        
        super(SetTransformerModelWrapped, self).__init__()
        self.modelBase = modelBase
        self.loss_module = loss_module
        self.num_metrics = len(self.get_metric_names())
        self.register_buffer('total_sample', torch.tensor([0]).long())
        self.register_buffer('total_loss', torch.tensor([0.0]).float())
        self.register_buffer('aggr_loss', torch.tensor([0.0]).float())
        self.register_buffer('aggr_metric',  torch.tensor([0.0]*self.num_metrics).float())
    
    # Post-processes the batch predictions returned by the model into a list of scalar tensors
    def post_process_module (self, predictions: Tensor)->List[Tensor]:
        pred:List[Tensor] = []
        if len(predictions.shape) == 0:
            pred.append(torch.tensor(predictions.item()))
        else:
            for i in range(predictions.shape[0]):
                pred.append(torch.tensor(predictions[i].item()))
        return pred
        
    # Returns accuracy score for a batch.
    def accuracy(self,predictions: Tensor, labels: Tensor):
        return (predictions.to(labels.device) == labels).sum()
    
    # Aggregates loss and metric values for all batches in an epoch
    def aggregate(self, loss: Tensor, metric: Tensor, batchnum: int):
        self.total_sample += batchnum
        self.aggr_loss += loss.to(self.aggr_loss.device) * batchnum
        self.aggr_metric += metric[0:self.num_metrics].to(self.aggr_metric.device)

    # Returns final layer count. I am returning a hard coded value of 1 but you can also count the number programmatically.
    @torch.jit.export
    def get_layer_count(self) -> int:
        
        return 1
    
    # Resets aggregated loss and metrics values to 0 after each epoch.
    @torch.jit.export
    def prepare_run(self):
        if self.modelBase.training:
            #self.aggr_loss.zero_()
            self.aggr_loss = torch.tensor(0.0)
        else:
            #self.aggr_loss = self.aggr_loss + torch.tensor(0.0)
            self.aggr_loss = torch.tensor(0.0)

        self.total_loss.zero_()
        self.total_sample.zero_()
        self.aggr_metric = torch.tensor([0.0]*self.num_metrics)  #nelememt =0
        return self.aggr_loss

    # trains model on one batch
    @torch.jit.export
    def train_one_batch(self,
                        x: List[torch.Tensor], target: List[torch.Tensor]) -> \
            Tuple[Tensor, List[Tensor], Tensor, List[Tensor]]:
            
        data = x[0].reshape(x[0].shape[0],-1,27) # data reshaping to fit base model definition
        probs = self.modelBase(data).squeeze(-1) # get probability score for the samples
        target_class = target[0].double()
        loss = self.loss_module(probs, target_class).to(data.device) # calculate loss
        pred_class = (probs >=0.5).double()
        metric = self.accuracy(pred_class, target_class).reshape([1])
        bt_sz = data.shape[0]
        self.aggregate(loss, metric, bt_sz)
        if self.modelBase.training:
            loss.backward() # backpropagation. Optimizer details are provided in the dlModelZoo train action. See the CASL code later in this post.
        
        return loss, [metric[i]/bt_sz for i in range(self.num_metrics)], self.aggr_loss/self.total_sample, \
                 [self.aggr_metric[i]/self.total_sample for i in range(self.num_metrics)]
        
    # scores model on one batch. This is akin to running model on test data. This code is eventually used in creating the astore file.
    @torch.jit.export
    def score_one_batch(self,
                        x: List[torch.Tensor], target: List[torch.Tensor]) -> \
            Tuple[List[Tensor],Tensor, List[Tensor], Tensor, List[Tensor]]:
        
        data = x[0].reshape(x[0].shape[0],-1,27)
        probs = self.modelBase(data).squeeze(-1)
        if len(target)>0:
            target_class = target[0].double()
            loss = self.loss_module(probs, target_class).to(data.device)
            pred_class = (probs >=0.5).double()
            predictions:List[Tensor] = self.post_process_module(probs)
            metric = self.accuracy(pred_class, target_class).reshape([1])
            self.aggregate(loss, metric, data.shape[0])
            return predictions, loss, [metric[i]/data.shape[0] for i in range(self.num_metrics)], self.aggr_loss/self.total_sample, \
                          [self.aggr_metric[i]/self.total_sample for i in range(self.num_metrics)]
        else:
            
            pred_class = (probs >=0.5).double()
            predictions:List[Tensor] = self.post_process_module(probs)
            
            return predictions, torch.tensor(0.0), [torch.tensor(0.0) for i in range(self.num_metrics)], torch.tensor(0.0), \
                          [torch.tensor(0.0) for i in range(self.num_metrics)]

    # returns the metric name
    @torch.jit.export
    def get_metric_names(self)->List[str]:
        return ["Accuracy"]
    
    # returns the metric value. You can return more than one metric value.
    @torch.jit.export
    def get_metric_values (self, metric_in:List[Tensor])->List[Tensor]:
        Accuracy = metric_in[0]
        values:List[Tensor] = []
        values.append(Accuracy)
        return values

prepare_run resets the loss and metric values after each epoch, and get_metric_values returns the metrics. Once you have defined the wrapper class, the next step is to torchscript an instance of this class. This can be done by using either torch.jit.script or torch.jit.trace. The former is preferred because the latter has some limitations: tracing records only the operations executed for the example input, so data-dependent control flow such as if-else blocks is not captured. There are other differences too, but the rule of thumb should be to use torch.jit.script for torchscripting in Viya.
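The difference between the two approaches can be seen in a small sketch. The function `shift` is a hypothetical example with a data-dependent branch; it is unrelated to the wrapper class.

```python
import warnings

import torch

def shift(x):
    # Data-dependent branch: the path taken depends on the values in x
    if x.sum() > 0:
        return x + 10
    else:
        return x - 10

# Scripting compiles the source, so both branches are preserved
scripted = torch.jit.script(shift)

# Tracing runs the function once and records only the path that was taken
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # trace warns about the tensor-to-bool conversion
    traced = torch.jit.trace(shift, torch.ones(2))  # records only the x + 10 path

negative = -torch.ones(2)
print(scripted(negative))  # follows the else branch
print(traced(negative))    # still adds 10, the branch recorded during tracing
```

Because the wrapper's score_one_batch branches on whether a target is supplied, a traced version would silently keep only one of those paths.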

# Create base model instance
dim_input = 27
num_outputs = 1
dim_output = 1
base_model = SetTransformer(dim_input, num_outputs, dim_output).to(torch.double)

# Create wrapper class instance that uses base model defined above
wrapped_model = SetTransformerModelWrapped(base_model,nn.BCELoss())
# Script the wrapped model
scripted_wrapped_model = torch.jit.script(wrapped_model)
# Save the model to disk and upload it to a location accessible by Viya
torch.jit.save(scripted_wrapped_model, "./scripted_WrappedSetTransformerModel.pt")

After saving the model to disk, you have a model that can be trained in Viya. Before you do that, it's good practice to test the train_one_batch and score_one_batch functions to make sure they return valid results. An example of testing the train_one_batch function is below. In the code below, X and Y stand for test data and labels that fit the model definition.

import torch
import torch.optim as optim

print("\n Train Tests")
# load the model saved to disk
loaded_scripted_wrapped_model = torch.jit.load("./scripted_WrappedSetTransformerModel.pt")

optimizer = optim.Adam(loaded_scripted_wrapped_model.modelBase.parameters(), lr=1e-4)
# Set model training flag to true
loaded_scripted_wrapped_model.train()
for epoch in range (5):
    # For each epoch, reset the aggregated loss and metrics by calling prepare_run
    aggr_loss = loaded_scripted_wrapped_model.prepare_run()
    for i in range(0,2):
        optimizer.zero_grad()
        loss, metric, agg_loss, agg_metric =loaded_scripted_wrapped_model.train_one_batch ([X[i*16:(i+1)*16]],[Y[i*16:(i+1)*16]])

        if epoch == 0:
            print (f"batch {i}  loss={loss.item()} accu={metric[0].item()}" )
        optimizer.step()
    print (f"epoch = {epoch} aggr loss",   "{:.6f}".format(agg_loss.item()), "metric", "{:.6f}".format(agg_metric[0].item()))

To train the model in Viya, you can use CASL, Python, or other supported open source languages. I am going to explain how you can use CASL to train, score, and export the model. Before you do anything, the first step is to make the training and scoring data available as CAS tables. Let's assume the data is loaded into memory and the table names are CVM_DATA_TRAIN, CVM_DATA_VAL, and CVM_DATA_TEST. In the CASL code, the first step is to define a YAML file that provides more information about the model.

proc cas;
/* Define YAML file*/
source _yaml;
sas:
  dlx:
     
      label: "SET_NN"
      dataset:
        type: NumericTable /* All data for our model is in numeric format.*/
      model:
        type: "TORCHSCRIPT" /* For custom models we use TORCHSCRIPT as the type*/
        path: "scripted_WrappedSetTransformerModel.pt" /* Name of the torchscripted model file */
        caslib: "dnfs" /* Library where the physical *.pt file exist*/
        inputSize: 5400 /* Input data size, i.e., number of columns*/
        featureSize: 1 /* output size */
        inputs:
          - label: input_tensor1
            size:
            - 0
        outputs:
          - label: output_tensor1
            size: 
            - 0

endsource;

/* Define the train action set*/
action dlModelZoo.dlmztrain result=results /
    logLevel='DEBUG',
    table='CVM_DATA_TRAIN', /* Train data CAS table name */
    inputs=${&my_var_list}, /* my_var_list is the macro that stores the column names in this format: col1 col2 col3...*/
    targets="label", /* Name of the column in the training dataset that stores the label value 0 or 1*/
    tableDistribution='REPEATED',
    dropLast=True,
    validationTable = 'CVM_DATA_VAL', /* Validation data CAS table name */
    checkpointBest=True, /* Keep the best performing model*/
    modelOut={name="trained_model", replace=True}, /* Name of the CAS table that stores the trained model weights */
    optimizer={
        mode={
            type='synchronous',
            syncFreq=1
    },
        algorithm={ 
            learningRate=1e-4, /* I am not tuning hyperparameters. If you were using autotune functionality, then you would instead provide a learning-rate range, for example. */
            
            method='adam'
        },
        batchSize=50,
        seed=54321,
        maxEpochs=20
    },
    extraOptions={yaml=_yaml, label='SET_NN'};

/* Define the Scoring action set. Data is scored on the trained model from previous step*/
dlModelZoo.dlmzscore /
    logLevel='DEBUG',
    table='CVM_DATA_TEST',
    modelTable='trained_model',
    inputs=${&my_var_list},
    targets="label", 
    tableDistribution='REPEATED',
    tableOut={name="Set_Transformer_Results", replace=True},
    batchSize=50,
    copyVars={"label","study_accession","subject_accession"}, /* You can specify additional columns from score/test dataset to be copied to the results. I am copying the truth label and patient identifier values*/
    extraOptions={yaml=_yaml, label='SET_NN'};

/* Define the export actionset. This will save the model state that can be exported as an Astore file */
dlModelZoo.dlmzExport /
    logLevel='INFO',
    modelTable="trained_model",
    saveState={name="astore_ST200Cells_model_dlmzexp", caslib="casuser", replace=True},
    extraOptions={yaml=_yaml, label='SET_NN'},
	table='CVM_DATA_TEST',
	inputs=${&my_var_list};

run;

Finally, to export the model as an Astore file that you can import in SAS Model Studio you can use proc astore. See an example below.

proc astore;
    download rstore=casuser.astore_ST200Cells_model_dlmzexp 
             store="/nfsshare/data/Mass Cytometry/ASTORE_SetTransformer_model_200cells.sasast";
quit;

In this part, we explored how to train a custom PyTorch model in Viya. The data used in this model is multi-dimensional, meaning each input consists of multiple rows within a CAS table. In part two, I will delve into the challenges of working with multi-dimensional data and related workarounds. Stay tuned!
