I am trying to compute an average within each ID group but with constraints.
For example, I would like to start taking average of two cells that are continuously ABOVE 25 and assign it a value.
- if the average is between 26-50 then 'Medium'
- if the average is between 51-60 then 'High'
If the average needs to start for the FIRST value after 25.
Here's the dataset:
ID | MAXVEL | seq_id |
A | 15 | 1 |
A | 27 | 2 |
A | 29 | 3 |
A | 52 | 4 |
B | 60 | 1 |
B | 26 | 2 |
B | 40 | 3 |
B | 2 | 4 |
B | 9 | 5 |
B | 45 | 6 |
C | 60 | 1 |
C | 12 | 2 |
C | 60 | 3 |
C | 50 | 4 |
C | 12 | 5 |
Output:
For example, starting point for A group will be at 27 as 15 is less than 26
ID | MAXVEL | seq_id | Average | Output |
A | 15 | 1 | ||
A | 27 | 2 | 28 | Medium |
A | 29 | 3 | ||
A | 52 | 4 | ||
B | 60 | 1 | 43 | Medium |
B | 26 | 2 | ||
B | 40 | 3 | ||
B | 2 | 4 | ||
B | 9 | 5 | ||
B | 45 | 6 | ||
C | 60 | 1 | ||
C | 12 | 2 | ||
C | 60 | 3 | 55 | High |
C | 50 | 4 | ||
C | 12 | 5 |
<
Thank you! 🙂
Any help would be greatly appreciated!
data have;
input ID$ MAXVEL seq_id;
cards;
A 15 1
A 27 2
A 29 3
A 52 4
B 60 1
B 26 2
B 40 3
B 2 4
B 9 5
B 45 6
C 60 1
C 12 2
C 60 3
C 50 4
C 12 5
;
data want;
length cnt $10;
set have;
by id;
retain cnt val;
if first.id then do;cnt='';val=.;end;
if maxvel >25 then check='1';
if check ne '' then cnt=cats(check,cnt);
if check='' then cnt='';
if cnt in ('1','11') then do;
val+maxvel;
end;
seq=lag(seq_id);
if check='' then val=.;
if cnt='11' then val=val/2;
if val in (26:50) then out='Medium';
else if val in (51:60) then out='High';
run;
proc sort data=want;
by id seq_id ;
run;
data want2(drop=_cnt check seq cnt );
merge want(in=a rename=(cnt=_cnt) drop=val out) want(in=b keep=id val out seq cnt rename=(seq=seq_id) where=(cnt='11'));
by id seq_id;
if a;
run;
data have;
input ID$ MAXVEL seq_id;
cards;
A 15 1
A 27 2
A 29 3
A 52 4
B 60 1
B 26 2
B 40 3
B 2 4
B 9 5
B 45 6
C 60 1
C 12 2
C 60 3
C 50 4
C 12 5
;
data want;
length cnt $10;
set have;
by id;
retain cnt val;
if first.id then do;cnt='';val=.;end;
if maxvel >25 then check='1';
if check ne '' then cnt=cats(check,cnt);
if check='' then cnt='';
if cnt in ('1','11') then do;
val+maxvel;
end;
seq=lag(seq_id);
if check='' then val=.;
if cnt='11' then val=val/2;
if val in (26:50) then out='Medium';
else if val in (51:60) then out='High';
run;
proc sort data=want;
by id seq_id ;
run;
data want2(drop=_cnt check seq cnt );
merge want(in=a rename=(cnt=_cnt) drop=val out) want(in=b keep=id val out seq cnt rename=(seq=seq_id) where=(cnt='11'));
by id seq_id;
if a;
run;
Hello. Thank you for the suggestion. I tested it and it works. Really appreciate it.
Another quick question - is there an easy fix if I have the following?
Link: https://communities.sas.com/t5/SAS-Data-Management/Starting-point-average/m-p/376087
It's almost similar to this one but there is one other constraint.
Really appreciate it!
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.