it seems to me that you tried to filled up missing demographic info for patient, and for each patient it can only one value if not missing. For instance, a patient (same CPI) can not be both male or female at the same time. therefore my simple Proc SQL approach should have a shot:
DATA HAVE;
INPUT CPI :$3. (EnrollDate VisitDate) (:MMDDYY10.) (BP1 Race Gender) (:$10.);
CARDS;
123 1/1/2001 1/1/2001 120/80 White Female
123 1/1/2001 2/1/2001 119/85 . .
123 9/9/2009 9/9/2009 115/75 White .
123 9/9/2009 10/9/2009 . . .
456 2/2/2002 2/2/2002 118/85 Black Male
456 2/2/2002 3/2/2002 110/82 Black Male
789 3/3/2003 3/3/2003 135/81 Hispanic .
789 3/3/2003 4/3/2003 132/83 Hispanic Male
789 3/3/2003 5/3/2003 . . Male
789 3/3/2003 6/3/2003 128/81 . .
;
proc sql;
create table want as
select CPI, EnrollDate FORMAT=MMDDYY10., VisitDate FORMAT=MMDDYY10., BP1,
COALESCEC(RACE, MAX(RACE)) AS RACE,
COALESCEC(GENDER, MAX(GENDER)) AS GENDER
FROM HAVE
GROUP BY CPI
;
QUIT;
Note: coalescec() is abundant in your case, Max() alone is sufficient .
Haikuo
For demographic variables I would probably create a DM data with one obs per CPI and not LOCF those variables. Here is UPDATE method LOCF and NOCB(next obs carried back?). This method has the advantage of LOCFing a lot of variables without a lot of code. There is also the assumption that any completely missing visits have been added to the data.
data _null_,
Can you explain, please, syntax of operation array f first:; ?
I mean for what first:; stands?
Katie,
Here is an alternative approach that you might want to try:
data have;
input CPI (EnrollDate VisitDate) (:mmddyy10.) (BP1 Race Gender) ($);
cards;
123 1/1/2001 1/1/2001 120/80 White Female
123 1/1/2001 2/1/2001 119/85 . .
123 9/9/2009 9/9/2009 115/75 White .
123 9/9/2009 10/9/2009 . . .
456 2/2/2002 2/2/2002 118/85 Black Male
456 2/2/2002 3/2/2002 110/82 Black Male
789 3/3/2003 3/3/2003 135/81 Hispanic .
789 3/3/2003 4/3/2003 132/83 Hispanic Male
789 3/3/2003 5/3/2003 . . Male
789 3/3/2003 6/3/2003 128/81 . .
;
data want (drop=_:);
length _race _gender $8;
do until (last.cpi);
set have;
by cpi;
race=coalescec(race,_race);
gender=coalescec(gender,_gender);
_race=race;
_gender=gender;
end;
do until (last.cpi);
set have (drop=race gender);
by cpi;
output;
end;
run;
Thank you all so much for your help. I believe I've gotten it to work!
Thanks again!
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.