Hello I have the following dataset but I am really stuck on how to do the hierarchy for few weeks actually.
Here's what are the groupings:
> 20 = Too High
16-20 = High
10-15 = Low
1-5 = Normal
I would like to do a "Check" on Each ID (i.e. Group).
Here is the dataset
ID | value | seq_id |
1 | 3 | 1 |
1 | 1 | 2 |
1 | 2 | 3 |
1 | 4 | 4 |
1 | 1 | 5 |
2 | 20 | 1 |
2 | 21 | 2 |
2 | 5 | 3 |
3 | 1 | 1 |
3 | 6 | 2 |
3 | 7 | 3 |
3 | 1 | 4 |
3 | 4 | 5 |
3 | 21 | 6 |
3 | 17 | 7 |
3 | 1 | 8 |
3 | 2 | 9 |
4 | 6 | 1 |
4 | 9 | 2 |
4 | 12 | 3 |
4 | 13 | 4 |
5 | 1 | 1 |
5 | 6 | 2 |
5 | 7 | 3 |
5 | 14 | 4 |
5 | 20 | 5 |
6 | 11 | 1 |
6 | 12 | 2 |
6 | 25 | 3 |
6 | 16 | 4 |
6 | 21 | 5 |
Here's the output I am looking for:
ID | value | seq_id | output |
1 | 3 | 1 | Normal |
1 | 1 | 2 | |
1 | 2 | 3 | |
1 | 4 | 4 | |
1 | 1 | 5 | |
2 | 20 | 1 | Too High |
2 | 21 | 2 | |
2 | 5 | 3 | |
3 | 1 | 1 | |
3 | 6 | 2 | |
3 | 7 | 3 | |
3 | 1 | 4 | |
3 | 4 | 5 | |
3 | 21 | 6 | Too High |
3 | 17 | 7 | |
3 | 1 | 8 | |
3 | 2 | 9 | |
4 | 6 | 1 | |
4 | 9 | 2 | |
4 | 12 | 3 | High |
4 | 13 | 4 | |
5 | 1 | 1 | |
5 | 6 | 2 | Low |
5 | 7 | 3 | |
5 | 14 | 4 | |
5 | 19 | 5 | |
6 | 11 | 1 | |
6 | 12 | 2 | |
6 | 25 | 3 | Too High |
6 | 16 | 4 | |
6 | 21 | 5 |
|
We can't answer your question because it's not clear. Which variable matches the 'group' variable you mention? How do the other variables come Into play.
Hello. Group = ID. I would like to group them each ID. For example if group ID = 1, I would like to see if there are two consequtive 'too high' numbers. If no, I would like to see if there are two conesequitve 'high' numbers. If yes, then I would like to assign that 'high' to the first value of 'high'. If it is not HIGH, then I would like to see if there are two consequtives numbers that are 'low' category, and so on.
There's always going to be two continouous values of group but I would like to start by too high, high, low, normal.
Thank you!
data have;
infile cards dlm='09'x;
input ID value seq_id;
cards;
1 3 1
1 1 2
1 2 3
1 4 4
1 1 5
2 20 1
2 21 2
2 5 3
3 1 1
3 6 2
3 7 3
3 1 4
3 4 5
3 21 6
3 17 7
3 1 8
3 2 9
4 6 1
4 9 2
4 12 3
4 13 4
5 1 1
5 6 2
5 7 3
5 14 4
5 20 5
6 11 1
6 12 2
6 25 3
6 16 4
6 21 5
;
run;
proc format library=work;
value labelfmt
1 = 'too high'
2 = 'high'
3 = 'low'
4 = 'normal'
;
run;
data want (keep=id output);
set have;
by id;
retain flag;
if first.id then flag = ' ';
lagval = lag(value);
if value > 20 then substr(flag,1,1) = 'x';
else if not first.id
then do;
if 16 <= value <= 20 and 16 <= lagval <= 20 then substr(flag,2,1) = 'x';
else if 10 <= value <= 20 and 10 <= lagval <= 20 then substr(flag,3,1) = 'x';
else substr(flag,4,1) = 'x';
end;
if last.id
then do;
pos = index(flag,'x');
output = put(pos,labelfmt.);
output;
end;
run;
Note that in your example, group 4 was incorrectly labeled "high", as values 12 & 13 fall into your "low" category.
PS although I used your variable name "output", I recommend to not use SAS keywords as variable names.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.