I am trying to compute an average within each ID group but with constraints.
For example, I would like to start taking average of two cells that are continuously ABOVE 25 and assign it a value.
- if the average is between 26-50 then 'Medium'
- if the average is between 51-60 then 'High'
If the average needs to start for the FIRST value after 25.
Here's the dataset:
| ID | MAXVEL | seq_id |
| A | 15 | 1 |
| A | 27 | 2 |
| A | 29 | 3 |
| A | 52 | 4 |
| B | 60 | 1 |
| B | 26 | 2 |
| B | 40 | 3 |
| B | 2 | 4 |
| B | 9 | 5 |
| B | 45 | 6 |
| C | 60 | 1 |
| C | 12 | 2 |
| C | 60 | 3 |
| C | 50 | 4 |
| C | 12 | 5 |
Output:
For example, starting point for A group will be at 27 as 15 is less than 26
| ID | MAXVEL | seq_id | Average | Output |
| A | 15 | 1 | ||
| A | 27 | 2 | 28 | Medium |
| A | 29 | 3 | ||
| A | 52 | 4 | ||
| B | 60 | 1 | 43 | Medium |
| B | 26 | 2 | ||
| B | 40 | 3 | ||
| B | 2 | 4 | ||
| B | 9 | 5 | ||
| B | 45 | 6 | ||
| C | 60 | 1 | ||
| C | 12 | 2 | ||
| C | 60 | 3 | 55 | High |
| C | 50 | 4 | ||
| C | 12 | 5 |
<
Thank you! 🙂
Any help would be greatly appreciated!
data have;
input ID$ MAXVEL seq_id;
cards;
A 15 1
A 27 2
A 29 3
A 52 4
B 60 1
B 26 2
B 40 3
B 2 4
B 9 5
B 45 6
C 60 1
C 12 2
C 60 3
C 50 4
C 12 5
;
data want;
length cnt $10;
set have;
by id;
retain cnt val;
if first.id then do;cnt='';val=.;end;
if maxvel >25 then check='1';
if check ne '' then cnt=cats(check,cnt);
if check='' then cnt='';
if cnt in ('1','11') then do;
val+maxvel;
end;
seq=lag(seq_id);
if check='' then val=.;
if cnt='11' then val=val/2;
if val in (26:50) then out='Medium';
else if val in (51:60) then out='High';
run;
proc sort data=want;
by id seq_id ;
run;
data want2(drop=_cnt check seq cnt );
merge want(in=a rename=(cnt=_cnt) drop=val out) want(in=b keep=id val out seq cnt rename=(seq=seq_id) where=(cnt='11'));
by id seq_id;
if a;
run;
data have;
input ID$ MAXVEL seq_id;
cards;
A 15 1
A 27 2
A 29 3
A 52 4
B 60 1
B 26 2
B 40 3
B 2 4
B 9 5
B 45 6
C 60 1
C 12 2
C 60 3
C 50 4
C 12 5
;
data want;
length cnt $10;
set have;
by id;
retain cnt val;
if first.id then do;cnt='';val=.;end;
if maxvel >25 then check='1';
if check ne '' then cnt=cats(check,cnt);
if check='' then cnt='';
if cnt in ('1','11') then do;
val+maxvel;
end;
seq=lag(seq_id);
if check='' then val=.;
if cnt='11' then val=val/2;
if val in (26:50) then out='Medium';
else if val in (51:60) then out='High';
run;
proc sort data=want;
by id seq_id ;
run;
data want2(drop=_cnt check seq cnt );
merge want(in=a rename=(cnt=_cnt) drop=val out) want(in=b keep=id val out seq cnt rename=(seq=seq_id) where=(cnt='11'));
by id seq_id;
if a;
run;
Hello. Thank you for the suggestion. I tested it and it works. Really appreciate it.
Another quick question - is there an easy fix if I have the following?
Link: https://communities.sas.com/t5/SAS-Data-Management/Starting-point-average/m-p/376087
It's almost similar to this one but there is one other constraint.
Really appreciate it!
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.