morglum
Quartz | Level 8

Hi everyone,

 

Is it possible to import a gradient boosting model developed in R into SAS Real-Time Decision Manager (RTDM)?

We can export it either as an xgboost binary file or as a set of if-then-else statements in SQL.

 

thanks

 

 

Accepted Solution
JamesAnderson
SAS Employee
Hi Morglum,
The road to RTDM for non-SAS models is via one of four routes:

1. Have RTDM call the R server in real time via a web service or similar, so that R executes the model and returns the results to RTDM.
2. Convert the model to PMML and import it into SAS, which then converts it to SAS code that can be executed natively in RTDM.
3. Write a Groovy program to call the R model and execute it in RTDM's Java runtime.
4. Hand-craft a SAS program from the If-Then-Else model rules from R.
Option 1 may not be possible, depending on your SLAs, traffic volume and R infrastructure. If it is, then depending on your version of RTDM you can use the SAS DS2, Groovy or Jython programming languages to write the integration to R.
Option 2 requires PMML 4.2, and also means you need SAS Enterprise Miner and SAS Model Manager to handle the conversion and publishing of the model to something that can be deployed in RTDM. GBM is notably missing from the list of model types supported for import via PMML, so it is quite uncertain whether this would work.
Option 3 requires you to be on release 6.3 or higher of RTDM, to install the rJava package on the RTDM servers, and then to code a Groovy module to call and execute R code.
Option 4 requires you to take your model rules and manually turn them into a SAS DS2 program so that it can run natively inside RTDM. Depending on the size of the model and the frequency at which you want to re-train it, this may become untenable.
Regards
James

morglum
Quartz | Level 8
Wow!
Thanks for the very informative reply.

Option 1 - I'll look into how reliable/powerful my R infrastructure is. I think this is the most likely option given my relative knowledge of R versus my complete lack of knowledge of rJava / Groovy / SAS DS2.
Option 2 - Lack of GBM support disqualifies this option.
Option 3 - I'm not sure I understand the workflow of option 3. Does it require an R server, or is the R model exported in some format and imported using rJava/Groovy (which are both new to me)?
Option 4 - The model has a few hundred trees, so we definitely don't want to have to rewrite the program manually. I can generate SQL code in any of access, hive, mysql, mssql, odbc, oracle, postgres, sqlite or teradata. Could any of these be "pasted" inside a PROC DS2 step?

We are able to generate a large table with each possible combination of features (gender = M, education = university, etc.) plus the predicted value for that combination (wages = $40,000). Could such a table simply be imported into RTDM and "left joined" to get the prediction?
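The lookup-table idea above can be sketched in Python as a rough illustration, not as anything RTDM-specific. The names `table_size`, `build_lookup` and `predict` are hypothetical, and the second row of sample data is invented for the example. The sketch assumes every feature is categorical; a continuous feature such as age would have to be binned before it could be part of the key.

```python
from math import prod
from typing import Dict, Iterable, Optional, Sequence, Tuple

Key = Tuple[str, str]  # (gender, education)


def table_size(levels_per_feature: Sequence[int]) -> int:
    """Rows needed to cover every combination: the product of the level counts."""
    return prod(levels_per_feature)


def build_lookup(rows: Iterable[Tuple[str, str, float]]) -> Dict[Key, float]:
    """Index (gender, education, predicted_wages) rows by feature combination."""
    return {(gender, education): wages for gender, education, wages in rows}


def predict(lookup: Dict[Key, float], gender: str, education: str) -> Optional[float]:
    """Return the stored prediction, or None when the combination is missing,
    which is what a left join yields for an unmatched row."""
    return lookup.get((gender, education))


rows = [
    ("M", "university", 40000.0),   # combination quoted in the post
    ("F", "high school", 30000.0),  # hypothetical row for illustration only
]
lookup = build_lookup(rows)
```

Because the row count is the product of the number of levels of each feature, adding features or finer bins grows the table quickly, which is why the level of detail has to be matched to what the lookup mechanism can handle.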


JamesAnderson
SAS Employee
For Option 1, you are still going to need to do some DS2/Jython/Groovy programming to create the necessary web service calls from RTDM to the R server. There are examples of this on the net to make it easier, but it might not be something you want to entertain.
Option 3 means embedding your R code inside a Java/Groovy program and then calling that program inside RTDM. I'm not familiar with the R tools that facilitate this, but suffice to say there would be a fair amount of Java/Groovy programming involved. However, from what I can tell, this allows you to use your R model directly ("cut and paste" the code) inside Java, and hence doesn't require an R server at execution time.
Option 4 is probably the one that requires the least amount of "foreign" programming, but you will need to learn a little bit of SAS DS2 to write the wrapper around your exported code. Presumably your exported/converted code uses CASE statements in SQL. To use these in DS2 inside RTDM you would need to change them to IF-THEN-ELSE, so perhaps a find-and-replace macro can automate this for you so that you can paste the code into RTDM. Certainly doable.
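The find-and-replace conversion described above could be sketched roughly as follows in Python. This is a minimal sketch under stated assumptions, not the output format of any particular R exporter: it assumes each tree is exported as a flat `CASE WHEN <condition> THEN <number> ... ELSE <number> END` expression with no nested CASE, that the keywords WHEN, THEN and ELSE never appear inside string literals, and that the conditions only use operators that read the same in SQL and DS2 (such as `=`, `<`, `>=`, `AND`, `OR`). The function name `case_to_ds2` and the default target variable are hypothetical.

```python
import re

_NUMBER = r"[-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"

# One "WHEN <condition> THEN <number>" branch of a flat CASE expression.
_WHEN = re.compile(
    r"WHEN\s+(?P<cond>.+?)\s+THEN\s+(?P<val>" + _NUMBER + r")",
    re.IGNORECASE | re.DOTALL,
)
# Optional trailing "ELSE <number> END".
_ELSE = re.compile(r"ELSE\s+(?P<val>" + _NUMBER + r")\s+END\s*$", re.IGNORECASE)


def case_to_ds2(case_sql: str, target: str = "predicted_wages") -> str:
    """Turn one flat SQL CASE expression into a DS2 IF / ELSE IF chain.

    Each branch adds its value to `target`, so the chains generated for
    several trees can be pasted one after another to accumulate a sum.
    """
    text = case_sql.strip()
    if not re.match(r"CASE\b", text, re.IGNORECASE):
        raise ValueError("expected the expression to start with CASE")
    branches = [(m.group("cond"), m.group("val")) for m in _WHEN.finditer(text)]
    if not branches:
        raise ValueError("no WHEN ... THEN branches found")
    lines = []
    for i, (cond, val) in enumerate(branches):
        keyword = "if" if i == 0 else "else if"
        lines.append(f"{keyword} ({cond}) then {target} = {target} + ({val});")
    else_match = _ELSE.search(text)
    if else_match:
        lines.append(f"else {target} = {target} + ({else_match.group('val')});")
    return "\n".join(lines)


example = (
    "CASE WHEN gender = 'Male' AND age_years > 50 THEN 1.5 "
    "WHEN gender = 'Female' THEN -2.5 ELSE 0.0 END"
)
ds2_rules = case_to_ds2(example)
```

Since every branch adds to the target, the target variable would need to be set to zero before the first generated chain. Any SQL-only syntax in the conditions (for example `<>`, `IN (...)` or `IS NULL`) would need its own translation step, which this sketch does not attempt.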
Regarding the lookup table, how big is it (rows and columns) and what version of RTDM are you using ?
HTH,
James
morglum
Quartz | Level 8
Thanks James,

Converting the CASE statements (you guessed right) to IF-THEN-ELSE sounds like a promising approach. Are they normal "data step" statements, or are PROC DS2 if-then-else statements different?

Regarding the lookup table, it could be anywhere from 1,000 rows and 100 columns to 1M rows and 200 columns. I'd adapt the level of detail to what RTDM can handle. Do I understand from this that RTDM can work with lookup tables? I'll ask what version of RTDM we have.

thanks!
Simon

JamesAnderson
SAS Employee

The IF-THEN-ELSE statements are the same as in a normal DATA step, but the code that wraps around them in DS2 is different from the DATA step - I've included a mock-up below.

As for the lookups, for a very small table we can use the Cross Table node inside a diagram, but for the sizes you mention it's too large. We can look up data from a table stored in a data set or database using a Data Process inside the diagram. This is a high-performing query and can use a very large table (depending on the database). But for this I would probably go with coded rules.

Cheers

James

 

package GBM_Model/overwrite=yes;

   /* declare a logger */
   declare package logger m_logger('morglum.GBM_Model');

   /* set logging methods */
   method trace(varchar(32767) msg);
      m_logger.log(2, msg);
   end;

   method debug(varchar(32767) msg);
      m_logger.log(3, msg);
   end;

   method err(varchar(32767) msg);
      m_logger.log(6, msg);
   end;

   method isTraceEnabled() returns int;
      return m_logger.islevelactive(2);
   end;

   method isDebugEnabled() returns int;
      return m_logger.islevelactive(3);
   end;

   /* execute method which gets called by RTDM - the inputs here need to be
      defined in the RTDM Activity and values passed from the decision flow */
   method execute( char gender
                 , char education
                 , int age_years
                 , in_out double predicted_wages
                 );

      /* log messages */
      if isDebugEnabled() then
         debug('Entered Execute Method');

      /* optional trace messages - maybe not great if there are 100s of inputs */
      if isTraceEnabled() then
      do;
         /* || concatenates the label and the value into one message */
         trace('Input gender:' || gender);
         trace('Input education:' || education);
         trace('Input age_years:' || age_years);
      end;

      /* Model IF-THEN-ELSE rules */
      if (gender EQ 'Male' and education = 'university' and age_years > 50) then
      /* with only one output variable the do block is not needed;
         the assignment can follow THEN directly */
      do;
         /* set outputs */
         predicted_wages = 45001.10;
      end;
      else if (gender EQ 'Female' and education = 'university' and age_years > 50) then
      do;
         /* set outputs */
         predicted_wages = 45001.10;
      end;
      else
      /* final block */
      do;
         predicted_wages = 0.0;
      end;
      /* End of Model IF-THEN-ELSE rules */

      /* log messages */
      if isDebugEnabled() then
         debug('Finished Execute Method');

      if isTraceEnabled() then
      do;
         trace('Output predicted_wages:' || predicted_wages);
      end;
   end;

endpackage;

 

morglum
Quartz | Level 8
Thanks James!

This looks great - I'll ask my local SAS RTDM experts if we can get this to work.

I'm stoked!
ampdietrick
SAS Employee

While working with another customer to use the sample code provided, we had to make a few modifications to make it work in an RTDM 6.5 SAS Process.

 

Here is the revised sample code:

package GBM_Model/overwrite=yes;
   /* declare a logger */
   declare package logger m_logger('morglum.GBM_Model');
   /* set logging methods */
   method trace(varchar(32767) msg);
      m_logger.log(2, msg);
   end;
   method debug(varchar(32767) msg);
      m_logger.log(3, msg);
   end;
   method err(varchar(32767) msg);
      m_logger.log(6, msg);
   end;
   method isTraceEnabled() returns int;
      return m_logger.islevelactive(2);
   end;
   method isDebugEnabled() returns int;
      return m_logger.islevelactive(3);
   end;
   /* execute method which gets called by RTDM - the inputs here need to be defined in the RTDM Activity and 
      values passed from the decision flow */
   method execute( char(10) gender,
                   char(30) education,
                   int age_years,
                   in_out double predicted_wages
                  ); 
      /* log messages */
      if isDebugEnabled() then
         debug('Entered Execute Method');

      /* optional trace messages - maybe not great if there are 100s of inputs */
      if isTraceEnabled() then
      do;
         trace('Input gender:' || gender);
         trace('Input education:' || education);
         trace('Input age_years:' || age_years);
      end;

      /* Model IF-THEN-ELSE rules */
      if (gender EQ 'Male' and education = 'university' and age_years > 50) then
      /* with only one output variable the do block is not needed; the assignment can follow THEN directly */
      do;
         /* set outputs */
         predicted_wages = 45001.10 ;
      end;
      else if (gender EQ 'Female' and education = 'university' and age_years > 50) then
      do;
         /* set outputs */
         predicted_wages = 45002.10 ;
      end;
      else
         /* final block */
      do;
         predicted_wages = 0.0;
      end;
      /* End of Model IF-THEN-ELSE rules */

      /* log messages */
      if isDebugEnabled() then
         debug('Finished Execute Method');


      if isTraceEnabled() then
      do;
         trace('Output predicted_wages:'|| predicted_wages);
      end;

   end;

endpackage;
run;
quit;

/* code to test with your Federation Server via SAS Studio or EG */
proc ds2 nolibs NOPROMPT="driver=remts;server=your-fed-server;port=24141;protocol=bridge;uid=your-user;pwd=your-pw;conopts=(driver=ds2;conopts=(DSN=BASE_DSN))";
data _null_;
   method run();
      declare package GBM_Model GBMM();
      declare double pw;
      GBMM.execute('Male', 'university', 51, pw);
      put pw=;
      GBMM.execute('Female', 'university', 64, pw);
      put pw=;
      GBMM.execute('Male', 'high school', 51, pw);
      put pw=;
   end;
enddata;
run;
quit;
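
As a cross-check for the test step above, the rules in the revised package can be mirrored in a few lines of Python to see which values the three calls should produce. This is only an illustrative mirror of the sample rules, and the function name `predict_wages` is hypothetical; it says nothing about how DS2 itself evaluates the code.

```python
def predict_wages(gender: str, education: str, age_years: int) -> float:
    """Python mirror of the IF-THEN-ELSE rules in the revised execute method."""
    if gender == "Male" and education == "university" and age_years > 50:
        return 45001.10
    if gender == "Female" and education == "university" and age_years > 50:
        return 45002.10
    return 0.0  # final ELSE block


# Same three input combinations as the GBMM.execute calls in the test step.
expected = [
    predict_wages("Male", "university", 51),
    predict_wages("Female", "university", 64),
    predict_wages("Male", "high school", 51),
]
print(expected)  # [45001.1, 45002.1, 0.0]
```

If the `put pw=;` lines in the DS2 test show different values, the first things to compare are the character lengths declared for the inputs and the exact spelling of the literals in the rules.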
morglum
Quartz | Level 8

thanks for this update!

 

I am a typical Base SAS user, and my RTDM person has never used PROC DS2 either.

My question for today is: where does this code go? I've followed a PROC DS2 tutorial (https://www.lexjansen.com/sesug/2017/HOW-190.pdf) where you run PROC DS2 inside SAS EG, but where would the code live in the context of SAS RTDM? Is there a guide somewhere on how to call PROC DS2 from SAS RTDM?

 

thanks

Simon

 

 

 

JamesAnderson
SAS Employee
Hi Simon,
Take a look at the RTDM User Guide (page 95 of the current user guide for RTDM 6.5). You will need to create a SAS Process, which is done in the Definitions workspace, in the Decision Definitions section within CI Studio. When you get to the "SAS Code" part of the process definition, copy in the code starting from the "package" keyword, i.e. you don't need to wrap the code in PROC DS2. Once you have created the process definition, you can add the process as a node in the diagram within a decision campaign, so that when RTDM executes the decision campaign it executes the DS2 code and runs your model.
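If the package is developed and tested inside a PROC DS2 step first, the part to paste into the "SAS Code" field can be cut out with a small helper. The Python sketch below is one hypothetical way to do that (the name `extract_package` is made up); it assumes the `package` statement starts on its own line and that `endpackage;` appears only once, as the closing statement.

```python
import re


def extract_package(ds2_source: str) -> str:
    """Return the text from the 'package' keyword through 'endpackage;',
    dropping any PROC DS2 / RUN / QUIT wrapper around it."""
    match = re.search(
        r"^\s*(package\b.*?\bendpackage\s*;)",
        ds2_source,
        re.IGNORECASE | re.DOTALL | re.MULTILINE,
    )
    if match is None:
        raise ValueError("no package ... endpackage; block found")
    return match.group(1)


wrapped = (
    "proc ds2;\n"
    "package demo/overwrite=yes;\n"
    "   method execute(in_out double y);\n"
    "      y = 1;\n"
    "   end;\n"
    "endpackage;\n"
    "run;\n"
    "quit;\n"
)
to_paste = extract_package(wrapped)
```

The extracted text starts at `package` and ends at `endpackage;`, which matches the instruction above to leave out the PROC DS2 wrapper.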
Cheers
James
