Hello,
I have a dataset with multiple “comments” fields which largely remain empty. I want to create a QA dataset which basically dumps each of these fields when populated, and reformats the data vertically.
Dataset looks like this:
PID, comment1, comment2, comment3
111,, ‘Not completed on this date.’
112,,,
113, ‘Comment comment’, ‘LTFU’,
114, ‘LTFU,, ‘No result as of 1/1/2018.’
I want to create a dataset which looks like this:
Comment,PID,FieldValue
Comment1, 113, ‘Comment comment’
Comment1, 114, ‘LTFU’
Comment2, 113, ‘LTFU’
Comment3, 111, ‘Not completed on this date.’
Comment3, 114, ‘No result as of 1/1/2018.’
As more of an sql-programmer, my back-up way to do this would be:
proc sql feedback;
create table want as
select
‘comment1’ as Comment,
PID,
Comment1 as fieldvalue
from work.have
where comment1<>’’
UNION ALL
select
‘comment2’ as Comment,
PID,
Comment2 as fieldvalue
from work.have
where comment2<>’’
…etc…
But there must be a more efficient (and no doubt blindingly basic) way to do this...
Thank you for your input/patience.
Mildly quick and easy would be to use arrays:
data want;
set have;
array comm {3} comment1-comment3;
do k=1 to 3;
if comm{k} > ' ' then do;
comment = vname(comm{k});
FieldValue = comm{k};
output;
end;
end;
keep comment PID FieldValue;
run;
Use proc transpose. Add a WHERE= condition on the output dataset.
proc transpose data=have
out=want (rename=(col1=comment) where=(not missing(comment)))
;
by pid ;
var comment1-comment3 ;
run;
Thank you both for your very quick and much more palatable solutions!
Quick follow-up question: Do my variables have to have numeric suffixes for these solutions to work? I.e., I used varnames "Comment1, Comment2, etc." as an example, but really they have names like "medication_spec", "ethnicity_spec", etc.
Thank you again! I can already see multiple used for both these sets of code in my work.
No.
You can use any list of variables in an ARRAY or VAR statement.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.