BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
portneren
Calcite | Level 5

I am not that advanced in SAS programming and have currently run into some problems regarding how to call a specific value closest to a given value. The problem consits of two "subproblems"

 

We are looking at outcomes after a time period of 12, 26, 52, 104, 156, 208 and 260 weeks, we have acceptet an interval of +/- 4 weeks for the time points as the visit date is not always the exact date corresponding to the time period. This give rise to the first problem that some patients might multiple visits in the time frame. 

Here we want the datapoint closest to the time point. E.g. in the example provided for ID = 1, visit 2 (8 weeks), visit 3 (12 weeks), and visit 4 (14 weeks) all satisfy the condition of 12 weeks +/- 4 weeks. The one we would like to be our 12 weeks visit should be visit 3. 

 

idvisitDuration_weeksscore1score2
1101010
12875
131275
141455
152322
1626..
173122
2101012
2210..
231455
242633

 

Now the second problem is that we are looking at two different outcomes: score1 and score2. For some visits score1 and score2 is not recorded, in this scenario, if multiple visits are in the time period, we want to call the visit which have both scores recorded. and if only one then score1>score2. E.g. in the example provided for ID = 2 we want visit 3 to be the one called. 

 

 

I can code this with lead and lag functions (atleast the first part), however this is very time consuming and syntax heavy. Is there a smartere way to accomplish this?

1 ACCEPTED SOLUTION

Accepted Solutions
s_lassen
Meteorite | Level 14

I would start by finding the records in the correct intervals and giving them a rank by scores available and a calculated time difference:

data ordered;
  set have;
  do scheduled=12, 26, 52, 104, 156, 208,260;
    if scheduled-4<=Duration_weeks<=scheduled+4 then do;
      rank=missing(score1)*2+missing(score2); /* Score1 has higher priority, high rank is bad */
      diff=abs(scheduled-Duration_weeks);
      output;
      end;
    end;
run;

Then, it is very simple:

proc sort;
  by id scheduled rank diff;
run;

data want;
  set ordered;
  by id scheduled rank diff;
  if first.scheduled;
run;

If I read your specification correctly, you want to prioritize RANK over DIFF. If not, switch the two in the BY statements.

View solution in original post

3 REPLIES 3
PeterClemmensen
Tourmaline | Level 20

I need a bit more information than this.. Does this mean you want 7 different data sets? One for each value of 12, 26, 52, 104, 156, 208 and 260?

 

 

s_lassen
Meteorite | Level 14

I would start by finding the records in the correct intervals and giving them a rank by scores available and a calculated time difference:

data ordered;
  set have;
  do scheduled=12, 26, 52, 104, 156, 208,260;
    if scheduled-4<=Duration_weeks<=scheduled+4 then do;
      rank=missing(score1)*2+missing(score2); /* Score1 has higher priority, high rank is bad */
      diff=abs(scheduled-Duration_weeks);
      output;
      end;
    end;
run;

Then, it is very simple:

proc sort;
  by id scheduled rank diff;
run;

data want;
  set ordered;
  by id scheduled rank diff;
  if first.scheduled;
run;

If I read your specification correctly, you want to prioritize RANK over DIFF. If not, switch the two in the BY statements.

portneren
Calcite | Level 5

Thank you, this is exactly what I needed - it works perfect. 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 1595 views
  • 0 likes
  • 3 in conversation