Loop through several similar columns in dataset, output only if populated, and stack data vertically

Reply
New Contributor
Posts: 2

Loop through several similar columns in dataset, output only if populated, and stack data vertically

Hello,

 

I have a dataset with multiple “comments” fields which largely remain empty. I want to create a QA dataset which basically dumps each of these fields when populated, and reformats the data vertically.

 

Dataset looks like this:

 

PID, comment1, comment2, comment3

111,, ‘Not completed on this date.’

112,,,

113, ‘Comment comment’, ‘LTFU’,

114, ‘LTFU,, ‘No result as of 1/1/2018.’

 

I want to create a dataset which looks like this:

 

Comment,PID,FieldValue

Comment1, 113, ‘Comment comment’

Comment1, 114, ‘LTFU’

Comment2, 113, ‘LTFU’

Comment3, 111, ‘Not completed on this date.’

Comment3, 114, ‘No result as of 1/1/2018.’

 

As more of an sql-programmer, my back-up way to do this would be:

 

proc sql feedback;

create table want as

select

                ‘comment1’ as Comment,

                PID,

                Comment1 as fieldvalue

from work.have

where comment1<>’’

 

UNION ALL

 

select

                ‘comment2’ as Comment,

                PID,

                Comment2 as fieldvalue

from work.have

where comment2<>’’

 

…etc…

 

But there must be a more efficient (and no doubt blindingly basic) way to do this...

 

Thank you for your input/patience.

Super User
Posts: 6,785

Re: Loop through several similar columns in dataset, output only if populated, and stack data vertic

Mildly quick and easy would be to use arrays:

 

data want;

set have;

array comm {3} comment1-comment3;

do k=1 to 3;

   if comm{k} > ' ' then do;

      comment = vname(comm{k});

      FieldValue = comm{k};

      output;

   end;

end;

keep comment PID FieldValue;

run;

Super User
Super User
Posts: 8,125

Re: Loop through several similar columns in dataset, output only if populated, and stack data vertic

Use proc transpose.  Add a WHERE= condition on the output dataset.

proc transpose data=have 
  out=want (rename=(col1=comment) where=(not missing(comment)))
;
  by pid ;
  var comment1-comment3 ;
run;
New Contributor
Posts: 2

Re: Loop through several similar columns in dataset, output only if populated, and stack data vertic

Thank you both for your very quick and much more palatable solutions!

 

Quick follow-up question: Do my variables have to have numeric suffixes for these solutions to work? I.e., I used varnames "Comment1, Comment2, etc." as an example, but really they have names like "medication_spec", "ethnicity_spec", etc.

 

Thank you again! I can already see multiple used for both these sets of code in my work.

Super User
Super User
Posts: 8,125

Re: Loop through several similar columns in dataset, output only if populated, and stack data vertic

No.

You can use any list of variables in an ARRAY or VAR statement.

Ask a Question
Discussion stats
  • 4 replies
  • 77 views
  • 0 likes
  • 3 in conversation