I have dataset with multiple observations per subject.
Have:
ID | t | v |
1 | 30 | 2.95 |
1 | 22 | 5.11 |
1 | 23 | 2.33 |
1 | 21 | 4.3 |
1 | 28 | 2.28 |
I want to create a dataset to keep only one record per subject where the variable 't' closest to value 24.
want: | ||
ID | t | v |
1 | 23 | 2.33 |
Any help to do this in sas?
Thanks
Simplest way I can think of is to create an offset var, then sort by offset and take n=1:
data want; set have; offset=ifn(t<24,24-t,t-24); run; proc sort data=want; by offset; run; data want; set want (obs=1); run;
Not tested as not typing in test data - post as a datastep next time.
or this way:
data have;
input ID t v;
datalines;
1 30 2.95
1 22 5.11
1 23 2.33
1 21 4.3
1 28 2.28
;
run;
proc sql;
create table want as
select id, t, v
from have
group by id
having min(abs(24-t))=abs(24-t)
;
quit;
Since you indicate there are multiple IDs:
data int;
set have;
offset = abs(t-24);
run;
proc sort data=int;
by id offset;
run;
data want;
set int;
by id;
if first.id;
drop offset;
run;
For larger datasets, this might be more performant than doing it with SQL.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.