BookmarkSubscribeRSS Feed
BlairWitty
Calcite | Level 5

Hello all,

 

I am working with clinical patient data for 375 infants. Each infant has multiple rows representing different medical visits in which certain measurements were taken (i.e. height and weight), each person has a different number of rows (total data set rows is about 17,000). Every patient has a unique ID, however these are not consecutive values, they are medical record numbers. I have sorted the data first by this patient id, then by the date of patient visit.

 

I need to extract the values from measurement fields when a patient's age is equal to 36 weeks, but because of the nature of this data many patients do not have a measurement taken at the precise time point I want, so I then need to extract the measurement from the visit when their age was closest to the desired age.

 

I am attempting to use a DO Loop to test the age variable for closeness to 36 weeks and then extract the values, but it is not working.

 

My goal in the below code is to compare age_in_weeks for each patient to 36 weeks, if the age is precisely equal to that value then I want the weight at that visit to be the value for a new variable weight_36_wks. If the age is NOT 36 weeks, I want to compare the age to +/- 1 week. If that age is still not captured I want the age_in_week_counter to accumulate by 1 and then do the comparison again.

 

When I run the code the accumulation piece does not seem to be working, does it matter that my index variables are not consecutive values? The counter adds one and then never adds again.

data test;
set metrics;
age_in_wk_counter = 1;
DO i = patient_id;
DO j = age_in_wks;
IF j = 36 then weight_36_wks = wt_grams;
ELSE IF j + age_in_wk_counter = 36 then weight_36_wks = wt_grams;
ELSE IF j - age_in_wk_counter = 36 then weight_36_wks = wt_grams;
ELSE weight_36_wks = .

age_in_wk_counter +1;
output;
end;
end;
run;
2 REPLIES 2
Kurt_Bremser
Super User
proc sql;
create table want as
  select
    patient_id,
    weight as wt_grams,
    age_in_wks,
    abs(36-age_in_wks) as dist
  from metrics
  group by patient_id
  having calculated dist = min(calculated dist)
;
quit;

Untested; for tested code, please supply example data in usable form (data step with datalines).

RichardDeVen
Barite | Level 11

SQL is very good for your task. 

 

Here is an alternative in DATA Step.

 

Q: Would it be more physiologically appropriate to compute an interpolated weight from the weights bounding week 36 if week 36 is not present ?

 

Example:

 

code - sample data generator

Spoiler
data have;
  call streaminit(1234);

  do patid = 1 to 375;
    length age_wks 8;
    wt_grams = rand('integer', 2250, 7500);
    do age_wks = 1 to 50;

* average wt gain from https://www.healthline.com/health/baby/baby-weight-gain;
wt_grams + ifn(age_wks<=16,rand('integer',160,180), ifn(age_wks<=26,rand('integer',113,150), ifn(age_wks<=52,rand('integer',57-113), .))); * skip data 45% of the time; if rand('uniform') < 0.45 then continue; * skip week 36 every 7th patient; if mod(patid,7) = 0 and age_wks = 36 then continue; * skip week 35 every 14th patient; if mod(patid,14) = 0 and age_wks = 35 then continue; if patid=20 and age_wks <= 40 then continue; output; end; end; run;

code - select data from last record approaching week 36, overridden by first record after week 36 if that is closer.

* create view with new categorical variable used in subsequent BY processing;

data have_v / view=have_v;
  set have;

  length wk_zone $6;
  wk_zone = ifc(age_wks<=36,'<=36','>36');
run;

* process each patients data, retaining the closest to week 36 information;

data want_36;
  set have_v;
  by patid wk_zone;

  retain wk36_weight wk36_age;

  if first.patid then call missing(wk36_weight, wk36_age);

  if last.wk_zone and wk_zone = '<=36' then do;
    wk36_weight = wt_grams;
    wk36_age = age_wks;
  end;

  if first.wk_zone and wk_zone = '>36' then do;
    if missing(wk36_age) or age_wks-36 < 36-wk36_age then do;
      wk36_weight = wt_grams;
      wk36_age = age_wks;
    end;
  end;

  if last.patid;

  keep patid wk36:;
run;

Results

RichardADeVenezia_0-1603237709091.png

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1314 views
  • 0 likes
  • 3 in conversation