BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
saslove
Quartz | Level 8

I have a dataset that looks like this 

 

currentstudy Study1ID     Study2ID        Study3ID           Study4ID

001                  180            780                    278                   560

002                                    790                                             520

003                                                              231

 

As you can see, each patient can be enrolled on multiple other studies. 

 

What I need is a dataset that looks like below.

Basically, concatenate the column names/study names of the other studies that same patient is enrolled in.

 

currentstudy  Other_enrolled_studies 

001                 Study1,Study2,Study3,Study4

002                 Study2,Study4

003                 Study 3

 

and so on for several hundred patients across multiple studies. 

 

Thanks very much,

Please help. 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
data want;
set have;

length studies $200;

array study(*) study: ;

do i=1 to dim(study);

if not missing(study(i)) then do;
      studies = catx(", ", studies, vname(study(i)));
end;

run;

1. Declare an array of all studys

2. Loop through and find non missing entries

3. Find variable name (vname)

4. append to list of 'studies'

 

 


@saslove wrote:

I have a dataset that looks like this 

 

currentstudy Study1ID     Study2ID        Study3ID           Study4ID

001                  180            780                    278                   560

002                                    790                                             520

003                                                              231

 

As you can see, each patient can be enrolled on multiple other studies. 

 

What I need is a dataset that looks like below.

Basically, concatenate the column names/study names of the other studies that same patient is enrolled in.

 

currentstudy  Other_enrolled_studies 

001                 Study1,Study2,Study3,Study4

002                 Study2,Study4

003                 Study 3

 

and so on for several hundred patients across multiple studies. 

 

Thanks very much,

Please help. 


 

View solution in original post

6 REPLIES 6
Reeza
Super User
data want;
set have;

length studies $200;

array study(*) study: ;

do i=1 to dim(study);

if not missing(study(i)) then do;
      studies = catx(", ", studies, vname(study(i)));
end;

run;

1. Declare an array of all studys

2. Loop through and find non missing entries

3. Find variable name (vname)

4. append to list of 'studies'

 

 


@saslove wrote:

I have a dataset that looks like this 

 

currentstudy Study1ID     Study2ID        Study3ID           Study4ID

001                  180            780                    278                   560

002                                    790                                             520

003                                                              231

 

As you can see, each patient can be enrolled on multiple other studies. 

 

What I need is a dataset that looks like below.

Basically, concatenate the column names/study names of the other studies that same patient is enrolled in.

 

currentstudy  Other_enrolled_studies 

001                 Study1,Study2,Study3,Study4

002                 Study2,Study4

003                 Study 3

 

and so on for several hundred patients across multiple studies. 

 

Thanks very much,

Please help. 


 

saslove
Quartz | Level 8

That worked very well. Thanks Reeza

Jagadishkatam
Amethyst | Level 16

An alternative way

 

data want;
set have;
array var1(*) Study1ID Study2ID Study3ID Study4ID;
array var2(*)$ Study1IDc Study2IDc Study3IDc Study4IDc;
do i = 1 to dim(var1);
if var1(i) ne . then var2(i)=vname(var1(i));
newvar=catx(',',of var2(*));
end;
run;
Thanks,
Jag
Reeza
Super User

@Jagadishkatam wrote:

An alternative way

 

data want;
set have;
array var1(*) Study1ID Study2ID Study3ID Study4ID;
array var2(*)$ Study1IDc Study2IDc Study3IDc Study4IDc;
do i = 1 to dim(var1);
if var1(i) ne . then var2(i)=vname(var1(i));
newvar=catx(',',of var2(*));
end;
run;

@Jagadishkatam should the newer line be outside of the loop?

Jagadishkatam
Amethyst | Level 16

@Reeza , thanks for your respons. I tested the code on sashelp.class and seem like it is working. Appreciate your thoughts.

 

data want;
set sashelp.class;
array var1(*) age height weight ;
array var2(*)$ agec heightc weightc;
do i = 1 to dim(var1);
if var1(i) ne . then var2(i)=vname(var1(i));
newvar=catx(',',of var2(*));
end;
run;
Thanks,
Jag
Reeza
Super User
Technically it works, because it gets replaced at each iteration of the I loop, including the last, which is the same as doing it after the loop is finished. It's just not really logical to do it when it's not needed. If the data set is small this has no impact, if the data set is larger it's an inefficiency that slows things down.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 4099 views
  • 2 likes
  • 3 in conversation