Juletip #4: Going generative with Christmas trees

askojuvonen1 · Posted 12-04-2023 01:49 AM

One day in November I was about to build an image classification model with SAS tools. I had already started googling for examples, but because I'm lazy, I gave up and decided to give ChatGPT a chance to shine.

So what I asked first was:

Write me some SAS code for training image classification model

This is the answer I got:

Certainly! To train an image classification model in SAS, you can use the 
deep learning capabilities provided by SAS Viya, which includes a high-level 
deep learning API. Below is an example of SAS code for training an image classification model 
using a deep learning neural network.

Before you use this code, make sure you have SAS Viya installed and configured on your system. 
You also need your dataset of labeled images.

/* Load necessary packages */
%let targetServer = YOUR_TARGET_SERVER;
%let targetPort = YOUR_TARGET_PORT;
%let targetUser = YOUR_USERNAME;
%let targetPassword = YOUR_PASSWORD;

options dlCreateCnnConfig = YES;

/* Connect to the CAS server */
cas mySession sessopts= (caslib='CASUSER',host="&targetServer",port=&targetPort, 
                          user="&targetUser",password="&targetPassword");

/* Load the image data into CAS */
caslib _caslib_ sessopts=(caslib=CASUSER);

data _caslib_.your_image_data;
   /* Use the IMAGEFILE function to load image files into a CAS table */
   /* Specify your image file directory */
   image_bytes = imagefile('/path/to/your/images', '*.jpg', 'Label', 'jpg');
run;

/* Define the deep learning model */
data _caslib_.cnn_model;
   retain caslib 'CASUSER';
   inputTable = '_caslib_.your_image_data';
   modelType = 'CNN';
   target = 'Label';
   target = { "Label" };

   /* Define the convolutional layers */
   modelOpt = modelweights('/path/to/pretrained_weights.weights');
   numConv = 2;
   convLayer = { { 6, 5 }, { 12, 5 } };
   convAct = 'RELU';
   poolLayer = { { 2 }, { 2 } };
   poolAct = 'RELU';

   /* Define the fully connected layers */
   numFC = 2;
   fcLayer = { 100, 50 };
   fcAct = 'RELU';

   /* Define the output layer */
   outputAct = 'SOFTMAX';
   n=3;

   /* Specify the optimization parameters */
   casout = casOut( append=T );
run;

/* Train the deep learning model */
proc deepLearn trainSeed=12345 data=_caslib_.cnn_model;
   targetModel=_caslib_.your_trained_model;
   savestate state=_caslib_.your_model_state;
   casout= { caslib = _caslib_ name = 'cnn_model_training' replace=TRUE };
run;

/* Assess the model's performance */
proc deepLearn assess seed=12345 data=_caslib_.your_image_data;
   targetModel=_caslib_.your_trained_model;
   casout= { caslib = _caslib_ name = 'cnn_model_assessment' replace=TRUE };
run;

/* Terminate the CAS session */
cas mySession terminate;

/* View the assessment results */
proc print data=_caslib_.cnn_model_assessment;
run;

Great! Nice one on the first attempt.

But in my opinion this network was too simple: a CNN with only two convolutional layers.

I knew that the CAS Deep Learning actions include more complex built-in CNN models for image classification.

So, my next request was:

I know there is built-in Resnet50 in SAS. Rewrite the previous code

Oh yes, ChatGPT knew it too:

SAS provides a built-in ResNet50 model that you can use for image classification. 
You can train a ResNet50 model with your own dataset using the deep learning capabilities in SAS Viya. 
Here's how you can modify the previous code to use the ResNet50 model..:

To keep the story short, I won't include the modified code here.

The resulting code apparently did what I needed.

But I wanted to demonstrate the Viya Deep Learning capabilities to my open-source friend.

Could I do the same using Python? I had heard of something called 'dlpy' that makes the Viya deep learning actions available from Python.

So, I blindly asked this:

Can you do the same with dlpy?

Answer:

Certainly! You can use SAS DLPy, a Python library for deep learning in SAS, 
to train an image classification model using the ResNet50 architecture. 
Here is an example of how to do it:

# Import necessary libraries
import dlpy
from dlpy.images import ImageTable
from dlpy.applications import ResNet50
from dlpy.model import Model
from dlpy.layers import Input, Output
from dlpy.utils import display_table
from dlpy import gpuinfo
from dlpy.metrics import accuracy_score
from dlpy.model import Gpu
from swat import CAS

# Initialize a CAS session
s = CAS(hostname='YOUR_TARGET_SERVER', port=YOUR_TARGET_PORT, username='YOUR_USERNAME', password='YOUR_PASSWORD')

# Load the image data into CAS
image_table = ImageTable.load_files(s, path='/path/to/your/images', casout=dict(name='your_image_data', replace=True))

# Define the ResNet50 model
model = ResNet50(s, model_table='ResNet50',
                width=224, height=224, random_flip='HV', random_crop='RESIZETHENCROP',
                offsets=(103.939, 116.779, 123.68), scale=1, random_mutation='none',
                pre_trained_weights=True, pre_trained_weights_file='/path/to/pretrained_weights.weights')

# Specify the training settings
model.print_summary()
model.fit(data=image_table, n_threads=4, max_epochs=100, lr=0.001, gpu=Gpu(devices=[1]), target='Label')

# Evaluate the model on the validation data
model.evaluate(validation_data)

# Display the evaluation results
print(model.valid_res_tbl)

# Disconnect from CAS
s.close()

How cool!

But now I became curious whether this code really works.

And it turned out to be almost ready to run. The most essential fix was the name of the ResNet constructor (ResNet50_Caffe instead of ResNet50).

Then there were some minor fixes and additions: removing unnecessary import statements, removing everything that refers to a GPU (I didn't have one in my environment), dividing the dataset into train and test partitions, and testing and evaluating the model. I might have got those too if I had played with ChatGPT a little longer.
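The train/test partitioning and the evaluation step mentioned above can be illustrated in plain Python. This is only a sketch of the idea, not the DLPy code used for the model: the function names `stratified_split` and `accuracy`, the label strings, and the 20% test share are all assumptions made for illustration (the post only states the approximate class sizes and that 319 images were used for testing).

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Split row indices into train and test parts, class by class.

    Each class contributes about `test_fraction` of its rows to the test
    part, so both parts keep roughly the original class proportions.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def accuracy(true_labels, predicted_labels):
    """Share of predictions that match the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical labels mirroring the approximate class sizes in the post.
labels = ["christmas_tree"] * 600 + ["other_tree"] * 1000
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)
print(len(train_idx), len(test_idx))
```

Splitting per class rather than over the whole table matters here because the two classes are not the same size; a purely random split could leave the smaller class under-represented in the test part.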

BTW, what was my model about?

As Christmas is approaching, I wanted to train a model that classifies pictures of trees: Christmas trees into the first category and non-Christmas trees into the second.

As training material I used ~600 Christmas tree pictures and ~1000 non-Christmas tree pictures from the Open Images Dataset, a rich source of labeled images of all sorts.
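Because the two classes are not the same size, it helps to know what accuracy a trivial model would reach by always guessing the larger class. A small sketch, using the approximate counts above (the function name and label strings are hypothetical):

```python
def majority_baseline_accuracy(class_counts):
    """Accuracy of a model that always predicts the most frequent class."""
    total = sum(class_counts.values())
    if total == 0:
        raise ValueError("class_counts must contain at least one image")
    return max(class_counts.values()) / total


# Approximate class sizes from the post; the label names are made up.
approx_counts = {"christmas_tree": 600, "other_tree": 1000}
baseline = majority_baseline_accuracy(approx_counts)
print(f"{baseline:.1%}")  # 62.5%
```

Any accuracy figure for the real model is best read against this baseline rather than against 50%.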

ResNet50 performed quite well in predicting these image classes!

Testing with 319 images, I got the following accuracy figures: