BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ldavis020
Fluorite | Level 6

Hello! 

I am tying to create a new variable that applies a set of rules to an ordered variable in my dataset. We apply the rules to each row of a variable successively. If rule 1 fails, we move on to rule 2, if rule 2 fails we move on to rule 3 if rule 3 fails we move on to rule 4 and if rule 4 fails we stop applying rules at that row and move on to the next group.

My data consists of weekly counts by group. The idea is to apply the rules successively to the main row plus the rows above and below for each group. 

The rules go like this: 

  • Rule 1 applies when count => 3 (easy)
  • Rule 2 applies when 2/3 successive rows containing the main row are => 2 (i.e. main row + 2 rows above OR main row + 1 row above and 1 row below OR main row + 2 rows below)
  • Rule 3 applies when 4/5 successive rows containing the main row are => 1 (i.e. main row + 4 rows above OR main row + 3 rows above and 1 row below OR main row + 2 rows above and 2 rows below.. etc. etc.)
  • Rule 4 applies when 8/8 successive rows contain the main row are => 0 (i.e. main row + 7 rows above OR main row + 6 rows above and 1 row below OR main row + 5 rows above and 2 rows below.. etc. etc.)

My main issue here is that I'm not sure how to indicate to sas that I want to call the rows above/below my main row.. 

 

Here is an example of my data. I want to apply the rule to the count column and then create a new column called "rule" that applies the rules.

data have;
input group count rule;
datalines; 1 14.81 1 1 69.33 1 1 17.70 1 1 7.5 1 1 3.1 1 1 5.41 1 1 1.75 2 1 1.97 2 1 3.12 1 1 2.66 2 1 0.83 2 1 1.06 3 1 0.60 4 1 -0.30 0 1 1.75 0 1 -0.8 0 ; run;

 My SAS skills are proficient enough to get the job done but still fairly rudimentary, so any help or direction would appreciated 🙂 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

It is easier to look back than ahead in SAS code.

One way to approach is to roll the values out into an array and then scan that to apply your rules.

If I assume that for RULE 2 you actually meant to scan the 2 weeks before and 2 weeks after the current week for run lengths of 2 or more weeks then this code replicates rule 1 and rule2.  The other rules are just simple modifications to the rule 2 rule.

data want;
  array counts[102] ;
  do week=1 by 1 until(last.group);
     set have ;
     by group week;
     counts[week]=count;
  end;
  maxweek=week;
  do week=1 to maxweek ;
     count=counts[week];
     rule=.;
     if 3 <= count then rule=1; 
     if missing(rule) then do;
       runlength=0;
       do check=max(1,week-2) to min(maxweek,week+2) until(runlength>1);
          if 2 <= counts[check] then runlength+1;
          else runlength=0;
       end;
       if runlength > 1 then rule=2;
     end;
     if missing(rule) then do;
       * similar logic for rule 3 ;
     end;
     output;
  end;
  keep group week rule ;
run;

View solution in original post

5 REPLIES 5
PeterClemmensen
Tourmaline | Level 20

You want this to work by group right? Even though you only have one group in your data?

ldavis020
Fluorite | Level 6

Yes sorry, I want to apply the rules by group. So there would be more lines in the data with additional groups. 

Tom
Super User Tom
Super User

It is easier to look back than ahead in SAS code.

One way to approach is to roll the values out into an array and then scan that to apply your rules.

If I assume that for RULE 2 you actually meant to scan the 2 weeks before and 2 weeks after the current week for run lengths of 2 or more weeks then this code replicates rule 1 and rule2.  The other rules are just simple modifications to the rule 2 rule.

data want;
  array counts[102] ;
  do week=1 by 1 until(last.group);
     set have ;
     by group week;
     counts[week]=count;
  end;
  maxweek=week;
  do week=1 to maxweek ;
     count=counts[week];
     rule=.;
     if 3 <= count then rule=1; 
     if missing(rule) then do;
       runlength=0;
       do check=max(1,week-2) to min(maxweek,week+2) until(runlength>1);
          if 2 <= counts[check] then runlength+1;
          else runlength=0;
       end;
       if runlength > 1 then rule=2;
     end;
     if missing(rule) then do;
       * similar logic for rule 3 ;
     end;
     output;
  end;
  keep group week rule ;
run;
ldavis020
Fluorite | Level 6

Thanks @Tom -- this seems to work with the example data! I haven't applied it to my own dataset yet. 

Just to clarify what's happening here. You're essentially creating a new variable that represents the value for each week in each group (e.g. variable count23 = value for week 23 in each group). Then we can go scan across each of the variables and apply the rules instead of scanning through one variable vertically. 

 

Tom
Super User Tom
Super User

So essentially it is making a temporary array of all of the values for this group.

The first DO loop loads the array.  Then the second pulls the individual values back out and so the rule checking can be down with the values in the array.

 

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1655 views
  • 1 like
  • 3 in conversation