I am trying to create the following rules. We always check for High first: within each group, if two values in a row fall between 11 and 20, I would like to assign "High", regardless of whether those two high values are at the top, middle, or bottom of that group. If there is no High pair, then we check whether there are two consecutive values between 1 and 10 and assign "Low".
I have two categories: High (11 through 20) and Low (1 through 10).
Here's the sample dataset:
GROUP | VALUE | seq_id |
1 | 12 | 1 |
1 | 4 | 2 |
1 | 6 | 3 |
1 | 5 | 4 |
1 | 13 | 5 |
4 | 12 | 1 |
4 | 13 | 2 |
4 | 1 | 3 |
4 | 2 | 4 |
5 | 19 | 1 |
5 | 2 | 2 |
5 | 3 | 3 |
5 | 14 | 4 |
5 | 20 | 5 |
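For anyone who wants to run the code in this thread, the sample data above could be read in with a DATA step along these lines. This is only a sketch: the dataset name HAVE is an assumption, chosen because the reply's code reads from HAVE, and the rows are already sorted by GROUP, which the BY statement there requires.

```sas
/* Sample data from the table above; the name HAVE is assumed */
data HAVE;
  input GROUP VALUE SEQ_ID;
  datalines;
1 12 1
1 4 2
1 6 3
1 5 4
1 13 5
4 12 1
4 13 2
4 1 3
4 2 4
5 19 1
5 2 2
5 3 3
5 14 4
5 20 5
;
run;
```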
Here's the desired output:
GROUP | VALUE | seq_id | output |
1 | 12 | 1 | |
1 | 4 | 2 | Low |
1 | 6 | 3 | |
1 | 5 | 4 | |
1 | 13 | 5 | |
4 | 12 | 1 | High |
4 | 13 | 2 | |
4 | 1 | 3 | |
4 | 2 | 4 | |
5 | 19 | 1 | |
5 | 2 | 2 | |
5 | 3 | 3 | |
5 | 14 | 4 | High |
5 | 20 | 5 | |
In group 1, the value for seq_id 1 is 12 and for seq_id 5 is 13, so shouldn't the output for group 1 be High?
I'm not sure which row you want the OUTPUT value on, so this returns one row per group. Just merge it back onto the original data as wanted.
This works:
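data WANT;
  set HAVE;
  by GROUP;
  length OUTPUT $8;
  retain OUTPUT;
  /* Call LAG once per row, unconditionally, so each queue always holds the previous row */
  PREV_GROUP = lag(GROUP);
  PREV_VALUE = lag(VALUE);
  /* Reset the label at the start of each group */
  if first.GROUP then OUTPUT = ' ';
  /* Two consecutive values above 10 in the same group: High, overriding an earlier Low */
  if PREV_GROUP = GROUP and PREV_VALUE > 10 and VALUE > 10 then OUTPUT = 'High';
  /* Two consecutive values of 10 or less: Low, only if nothing has been assigned yet */
  else if PREV_GROUP = GROUP and PREV_VALUE <= 10 and VALUE <= 10 and OUTPUT = ' ' then OUTPUT = 'Low';
  /* Write one row per group */
  if last.GROUP then output;
  keep GROUP OUTPUT;
run;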
GROUP | OUTPUT |
1 | Low |
4 | High |
5 | High |
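If the label is wanted on the detail rows, one way to merge it back might look like the sketch below. It assumes HAVE is sorted by GROUP and that WANT is the one-row-per-group result shown above; the output dataset name WANT_ROWS is made up for illustration.

```sas
/* Sketch: copy each group's OUTPUT label onto every row of that group */
data WANT_ROWS;
  merge HAVE WANT;
  by GROUP;
run;
```

This repeats the label on every row of the group. If it should appear on one particular row only, blank it out on the other rows in the same step, using whatever condition identifies the row you want.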