Forecaster
Obsidian | Level 7

I have a dataset with 3,000 customers, and each customer has 120 observations. I am trying to build a predictive model for each customer.

Several SAS procedures, such as PROC GLMSELECT and PROC GLM, support BY-group processing. Does SAS Enterprise Miner have an option for BY-group processing?

If so, could you please post an example or an illustration of how this could be done? I checked the SAS EM user notes without success.

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
rayIII
SAS Employee

Hi. Please take a look at the group processing nodes in Enterprise Miner, specifically "Start Groups" and "End Groups".

"...the group processing facility can be used to:

analyze more than one target variable in the same process flow

define group variables such as GENDER or JOB, in order to obtain separate analyses for each level of the group variable or variables.

use cross validation techniques to test the stability of predictive models

specify index looping, or how many times the flow following the node should loop

resample the data set to create bagging and boosting models"

http://support.sas.com/resources/papers/proceedings10/123-2010.pdf

Hope this helps.

Ray


5 REPLIES
jakarman
Barite | Level 11

The product assumes that it should analyze the data and find the groups itself (binning).
What is your goal with group processing? Do you already know the result you want and need a model that produces it?

---->-- ja karman --<-----
M_Maldonado
Barite | Level 11

Dear Forecaster,
The only node that I can think of that does something similar is the Survival Node.

If you specify the option Data Format as Fully Expanded, you will have the analysis done per ID variable as in your example (multiple rows for each customer ID as long as you define customer with a Role of ID).

If you are analyzing (or forecasting) a time series, you might have the data already in the right format to use the Time Series nodes in Enterprise Miner 13.1.

In general, the rest of the nodes expect a summary of all inputs and a target for each customer.

Just recently I was talking with a customer about an idea for a Feature Engineering node that would summarize variables per ID. All inputs would be summarized as frequencies, counts, and sums to assist you in the task of collapsing a dataset like yours (120 × 3,000 rows) into a summary of the 120 rows for each customer ID. Would this be useful to you for future releases?
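
As a rough illustration of the kind of per-ID summary described here, the sketch below uses PROC MEANS with a CLASS statement. It is not the proposed node, only one way the collapsing step might be written in SAS code. The data set name work.customer_long and the variables id, sales, and x1 are assumptions made for illustration, and the sketch assumes the goal is one summary row per customer.

```sas
/* Assumed input: work.customer_long, one row per customer per period,
   with columns id, sales, and x1 (hypothetical names). */
proc means data=work.customer_long noprint nway;
    class id;                       /* one output row per customer ID */
    var sales x1;
    output out=work.customer_summary (drop=_type_ _freq_)
        n(sales)    = n_obs         /* count of nonmissing sales values */
        sum(sales)  = sales_sum
        mean(sales) = sales_mean
        sum(x1)     = x1_sum
        mean(x1)    = x1_mean;
run;
```

With 3,000 customers and 120 observations each, work.customer_summary would hold 3,000 rows, which is the one-row-per-customer shape that most modeling nodes expect.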

All feedback is truly appreciated!

Thanks,

Miguel Maldonado

Enterprise Miner R&D

Forecaster
Obsidian | Level 7

Miguel, thank you very much for your response. I am trying to build a neural network regression on time series data for each customer. SAS/STAT and SAS/ETS have excellent facilities for this kind of per-group modeling, such as PROC GLMSELECT, PROC GAM, and PROC ADAPTIVEREG.

For example:

proc glmselect data = input;
    by id;    /* input must be sorted by id */
    model sales = x1 - x10 / selection = none;
    score data = output out = pred;    /* output must also contain id, sorted */
run;

I wish we had similar features in SAS EM, for example in PROC NEURAL.

A future release incorporating this would be extremely helpful for time series data mining and forecasting problems.

You could also read this blog post, which echoes my appreciation of BY-group processing in SAS procedures: Learning R has really made me appreciate SAS | randyzwitch.com. It would be great to have this in SAS EM too!

Much appreciated

Regards

jakarman
Barite | Level 11

Forecaster, I do not understand why you are referring to SAS/STAT and doing old-style coding with PROC statements.

Eminer takes a graphical approach using nodes (a node can use procs).

The latest version (13.1) of Eminer also supports time series through new nodes, for when you are digging into ETS.

Since every node can be a modeling task, another new node is relevant here: the Open Source Integration node.

The Open Source Integration node enables you to write code in the R language inside of SAS Enterprise Miner. The Open Source Integration node makes SAS Enterprise Miner data and metadata available to your R code and returns R results to SAS Enterprise Miner.

In addition to training and scoring supervised and unsupervised R models, the Open Source Integration node allows for data transformation and data exploration.

 

As the modelling process itself is automated by Eminer using "model nodes", I think you need to look at Eminer differently than you have so far.

It works one level above programming in R, just as it works one level above programming in SAS.

When you have different customers, each delivering their own data, the resulting Eminer model could be different for every customer. Because of that, you cannot use the old BY approach of PROC statements; it is the Miner project itself that may need adjustments somewhere.
That said, it is rather easy to duplicate a Miner project and make changes afterwards.

---->-- ja karman --<-----