BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
pamplemouse22
Calcite | Level 5

Hi, 

 

I am working with a dataset that looks like this. Each participant was scored multiple times and age at exam was recorded. 

 

id visit age score

1 1 80 30      

1 2 85 28

1 3 86 25

2 1 84 25 

2 2 87 24

3 1 90 28

3 2 95 28

 

I want to create a new variable that reports the score (score_new) closest to age 85.

 

Particularly, I want the score at age 85. If there is no value for this, then I want the score at an age +/- 2 years, using the score closest to age 85. If there is no value for this either, then I want the score recorded after age 85, again using the score closest to 85. If there is a tie (e.g., a score at 84 and 86), then I want to keep the younger score. Remaining can be missing. 

 

So, I would like to end with a final data set like this:

 

id visit age score score_new

1 1 80 30 .        

1 2 85 28 28

1 3 86 25 .

2 1 84 25 25 

2 2 86 24 .

3 1 90 28 28

3 2 95 28 .

 

Any help would be appreciated. Happy to explain further. I am new to arrays, so replies with detail would be best. 

 

Looking forward to your feedback and solving this!

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Ok. Try

 

proc sql;
create table want as
select a.*,
    case when coalesce(rank, -1)=min(rank) then score else . end as newScore
from 
    have as a left join 
    closenessRanks as b on a.age=b.age
group by id
order by id, visit;
select * from want;
quit;

That should fix it.

PG

View solution in original post

8 REPLIES 8
PGStats
Opal | Level 21

Here is a simple way:

 

data closenessRanks;
input age rank;
datalines;
85 0
84 1
86 2
83 3
87 4
88 5
89 6
90 7
91 8
92 9
93 10
94 11
95 12
96 13
97 14
98 15
99 16
;

data have;
input id visit age score;
datalines;
1 1 80 30      
1 2 85 28
1 3 86 25
2 1 84 25 
2 2 87 24
3 1 90 28
3 2 95 28
;

proc sql;
create table want as
select a.*,
    case when rank=min(rank) then score else . end as newScore
from 
    have as a left join 
    closenessRanks as b on a.age=b.age
group by id
order by id, visit;
select * from want;
quit;
PG
pamplemouse22
Calcite | Level 5

Thank you so much! This works, but there's one small issue to fix. For individuals where none of the scores match one of the ranked ages, the original score is being used as newscore. Instead, I would like these to be missing. How can I change this? Thank you. 

 

 

PGStats
Opal | Level 21

Ok. Try

 

proc sql;
create table want as
select a.*,
    case when coalesce(rank, -1)=min(rank) then score else . end as newScore
from 
    have as a left join 
    closenessRanks as b on a.age=b.age
group by id
order by id, visit;
select * from want;
quit;

That should fix it.

PG
pamplemouse22
Calcite | Level 5
Yes, it does! Thank you.
pamplemouse22
Calcite | Level 5

Hi PG, 

I found one case when this doesn't work -- when the first visit is missing, and there are values for subsequent visits. Code ends up not pulling any value for new score, when it should. Can't figure out how to change this. Would appreciate a solution for this. Thank you in advance! 

PGStats
Opal | Level 21

The problem could be with the way you read your data. Compare the results with and without the infile statement in

 

data closenessRanks;
input age rank;
datalines;
85 0
84 1
86 2
83 3
87 4
88 5
89 6
90 7
91 8
92 9
93 10
94 11
95 12
96 13
97 14
98 15
99 16
;

data have;
infile datalines missover;
input id visit age score;
datalines;
1 1 80 30      
1 2 85 28
1 3 86 25
2 1 84 25 
2 2 87 24
3 1 90 28
3 2 95 28
4 1 79 88
4 2 80 99
5 1 80 
5 2 85 28
5 3 86 25
;

proc sql;
create table want as
select a.*,
    case when coalesce(rank, -1)=min(rank) then score else . end as newScore
from 
    have as a left join 
    closenessRanks as b on a.age=b.age
group by id
order by id, visit;
select * from want;
quit;
PG
Ksharp
Super User

Maybe I understood what you are talking about.

 

data have;
input id visit age score;
cards;
1 1 80 30      
1 2 85 28
1 3 86 25
2 1 84 25 
2 2 87 24
3 1 90 28
3 2 95 28
;
run;
proc sql;
create table temp1 as
 select *
  from have
   group by id
    having sum(age=85)=0 and age in (83 84 86 87)
     order by id,age;
create table temp2 as
 select *
  from have
   group by id
    having sum(age in (83 84 85 86 87))=0 and age gt 87
     order by id,age;
quit;
data key1;
 set temp1;
 by id;
 if first.id;
run;
data key2;
 set temp2;
 by id;
 if first.id;
run;
data key;
 set have(where=(age=85)) key1 key2;
 keep id age score;
 rename score=new_score;
run;

data want;
 merge have key;
 by id age;
run;
pamplemouse22
Calcite | Level 5

Thank you very much!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 4125 views
  • 0 likes
  • 3 in conversation