BookmarkSubscribeRSS Feed
BlairWitty
Calcite | Level 5

Hello all,

 

I am working with clinical patient data for 375 infants. Each infant has multiple rows representing different medical visits in which certain measurements were taken (i.e. height and weight), each person has a different number of rows (total data set rows is about 17,000). Every patient has a unique ID, however these are not consecutive values, they are medical record numbers. I have sorted the data first by this patient id, then by the date of patient visit.

 

I need to extract the values from measurement fields when a patient's age is equal to 36 weeks, but because of the nature of this data many patients do not have a measurement taken at the precise time point I want, so I then need to extract the measurement from the visit when their age was closest to the desired age.

 

I am attempting to use a DO Loop to test the age variable for closeness to 36 weeks and then extract the values, but it is not working.

 

My goal in the below code is to compare age_in_weeks for each patient to 36 weeks, if the age is precisely equal to that value then I want the weight at that visit to be the value for a new variable weight_36_wks. If the age is NOT 36 weeks, I want to compare the age to +/- 1 week. If that age is still not captured I want the age_in_week_counter to accumulate by 1 and then do the comparison again.

 

When I run the code the accumulation piece does not seem to be working, does it matter that my index variables are not consecutive values? The counter adds one and then never adds again.

data test;
set metrics;
age_in_wk_counter = 1;
DO i = patient_id;
DO j = age_in_wks;
IF j = 36 then weight_36_wks = wt_grams;
ELSE IF j + age_in_wk_counter = 36 then weight_36_wks = wt_grams;
ELSE IF j - age_in_wk_counter = 36 then weight_36_wks = wt_grams;
ELSE weight_36_wks = .

age_in_wk_counter +1;
output;
end;
end;
run;
2 REPLIES 2
Kurt_Bremser
Super User
proc sql;
create table want as
  select
    patient_id,
    weight as wt_grams,
    age_in_wks,
    abs(36-age_in_wks) as dist
  from metrics
  group by patient_id
  having calculated dist = min(calculated dist)
;
quit;

Untested; for tested code, please supply example data in usable form (data step with datalines).

RichardDeVen
Barite | Level 11

SQL is very good for your task. 

 

Here is an alternative in DATA Step.

 

Q: Would it be more physiologically appropriate to compute an interpolated weight from the weights bounding week 36 if week 36 is not present ?

 

Example:

 

code - sample data generator

Spoiler
data have;
  call streaminit(1234);

  do patid = 1 to 375;
    length age_wks 8;
    wt_grams = rand('integer', 2250, 7500);
    do age_wks = 1 to 50;

* average wt gain from https://www.healthline.com/health/baby/baby-weight-gain;
wt_grams + ifn(age_wks<=16,rand('integer',160,180), ifn(age_wks<=26,rand('integer',113,150), ifn(age_wks<=52,rand('integer',57-113), .))); * skip data 45% of the time; if rand('uniform') < 0.45 then continue; * skip week 36 every 7th patient; if mod(patid,7) = 0 and age_wks = 36 then continue; * skip week 35 every 14th patient; if mod(patid,14) = 0 and age_wks = 35 then continue; if patid=20 and age_wks <= 40 then continue; output; end; end; run;

code - select data from last record approaching week 36, overridden by first record after week 36 if that is closer.

* create view with new categorical variable used in subsequent BY processing;

data have_v / view=have_v;
  set have;

  length wk_zone $6;
  wk_zone = ifc(age_wks<=36,'<=36','>36');
run;

* process each patients data, retaining the closest to week 36 information;

data want_36;
  set have_v;
  by patid wk_zone;

  retain wk36_weight wk36_age;

  if first.patid then call missing(wk36_weight, wk36_age);

  if last.wk_zone and wk_zone = '<=36' then do;
    wk36_weight = wt_grams;
    wk36_age = age_wks;
  end;

  if first.wk_zone and wk_zone = '>36' then do;
    if missing(wk36_age) or age_wks-36 < 36-wk36_age then do;
      wk36_weight = wt_grams;
      wk36_age = age_wks;
    end;
  end;

  if last.patid;

  keep patid wk36:;
run;

Results

RichardADeVenezia_0-1603237709091.png

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 779 views
  • 0 likes
  • 3 in conversation