Hi,
I am working with a dataset that looks like this. Each participant was scored multiple times and age at exam was recorded.
id visit age score
1 1 80 30
1 2 85 28
1 3 86 25
2 1 84 25
2 2 87 24
3 1 90 28
3 2 95 28
I want to create a new variable that reports the score (score_new) closest to age 85.
Particularly, I want the score at age 85. If there is no value for this, then I want the score at an age +/- 2 years, using the score closest to age 85. If there is no value for this either, then I want the score recorded after age 85, again using the score closest to 85. If there is a tie (e.g., a score at 84 and 86), then I want to keep the younger score. Remaining can be missing.
So, I would like to end with a final data set like this:
id visit age score score_new
1 1 80 30 .
1 2 85 28 28
1 3 86 25 .
2 1 84 25 25
2 2 86 24 .
3 1 90 28 28
3 2 95 28 .
Any help would be appreciated. Happy to explain further. I am new to arrays, so replies with detail would be best.
Looking forward to your feedback and solving this!
Thanks.
Ok. Try
proc sql;
create table want as
select a.*,
case when coalesce(rank, -1)=min(rank) then score else . end as newScore
from
have as a left join
closenessRanks as b on a.age=b.age
group by id
order by id, visit;
select * from want;
quit;
That should fix it.
Here is a simple way:
data closenessRanks;
input age rank;
datalines;
85 0
84 1
86 2
83 3
87 4
88 5
89 6
90 7
91 8
92 9
93 10
94 11
95 12
96 13
97 14
98 15
99 16
;
data have;
input id visit age score;
datalines;
1 1 80 30
1 2 85 28
1 3 86 25
2 1 84 25
2 2 87 24
3 1 90 28
3 2 95 28
;
proc sql;
create table want as
select a.*,
case when rank=min(rank) then score else . end as newScore
from
have as a left join
closenessRanks as b on a.age=b.age
group by id
order by id, visit;
select * from want;
quit;
Thank you so much! This works, but there's one small issue to fix. For individuals where none of the scores match one of the ranked ages, the original score is being used as newscore. Instead, I would like these to be missing. How can I change this? Thank you.
Ok. Try
proc sql;
create table want as
select a.*,
case when coalesce(rank, -1)=min(rank) then score else . end as newScore
from
have as a left join
closenessRanks as b on a.age=b.age
group by id
order by id, visit;
select * from want;
quit;
That should fix it.
Hi PG,
I found one case when this doesn't work -- when the first visit is missing, and there are values for subsequent visits. Code ends up not pulling any value for new score, when it should. Can't figure out how to change this. Would appreciate a solution for this. Thank you in advance!
The problem could be with the way you read your data. Compare the results with and without the infile statement in
data closenessRanks;
input age rank;
datalines;
85 0
84 1
86 2
83 3
87 4
88 5
89 6
90 7
91 8
92 9
93 10
94 11
95 12
96 13
97 14
98 15
99 16
;
data have;
infile datalines missover;
input id visit age score;
datalines;
1 1 80 30
1 2 85 28
1 3 86 25
2 1 84 25
2 2 87 24
3 1 90 28
3 2 95 28
4 1 79 88
4 2 80 99
5 1 80
5 2 85 28
5 3 86 25
;
proc sql;
create table want as
select a.*,
case when coalesce(rank, -1)=min(rank) then score else . end as newScore
from
have as a left join
closenessRanks as b on a.age=b.age
group by id
order by id, visit;
select * from want;
quit;
Maybe I understood what you are talking about.
data have;
input id visit age score;
cards;
1 1 80 30
1 2 85 28
1 3 86 25
2 1 84 25
2 2 87 24
3 1 90 28
3 2 95 28
;
run;
proc sql;
create table temp1 as
select *
from have
group by id
having sum(age=85)=0 and age in (83 84 86 87)
order by id,age;
create table temp2 as
select *
from have
group by id
having sum(age in (83 84 85 86 87))=0 and age gt 87
order by id,age;
quit;
data key1;
set temp1;
by id;
if first.id;
run;
data key2;
set temp2;
by id;
if first.id;
run;
data key;
set have(where=(age=85)) key1 key2;
keep id age score;
rename score=new_score;
run;
data want;
merge have key;
by id age;
run;
Thank you very much!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.