Hello,
I have a dataset that has subject IDs as one variable then a few other variables. I need to duplicate those variables and IDs so that I have 5 copies of each. I also need to create a new variable, for example a variable named Color and it has 5 options, Blue, Red, Green, Yellow, and Purple and I need one row for each subject for each color. So it would look something like:
ID Name Date Color
1 Sam 1/1/2020 Blue
1 Sam 1/1/2020 Red
1 Sam 1/1/2020 Green
1 Sam 1/1/2020 Yellow
1 Sam 1/1/2020 Purple
and then repeat for each other subject and their name and date etc.
I thought I could do this with an array but I can't figure out how.
I was thinking something like:
data new;
set date;
array colors(5) $ Blue $ Red $ Green $ Yellow $ Purple;
do i = 1 to 5;
but I don't know, maybe you can only use a list of variables in arrays and not the values for a variable?
If anyone has ideas on how to do this properly I'd really appreciate it.
Thank you!
The initial value of array statement is specified in ().
data have;
length id 8 name $10 date 8 color $10;
input ID name date DDMMYY10.;
format date DDMMYY10.;
datalines;
1 Sam 1/1/2020
2 Tom 1/2/2020
;
run;
data want(drop=i);
array acl{5} $10 _temporary_ ('Blue','Red','Green','Yellow','Purple') ;
set have;
do i=1 to 5;
color=acl{i};
output;
end;
run;
The initial value of array statement is specified in ().
data have;
length id 8 name $10 date 8 color $10;
input ID name date DDMMYY10.;
format date DDMMYY10.;
datalines;
1 Sam 1/1/2020
2 Tom 1/2/2020
;
run;
data want(drop=i);
array acl{5} $10 _temporary_ ('Blue','Red','Green','Yellow','Purple') ;
set have;
do i=1 to 5;
color=acl{i};
output;
end;
run;
You can try the following way:
data have;
length id 8 name $10 date 8;
input ID name date DDMMYY10.;
format date DDMMYY10.;
datalines;
1 Sam 1/1/2020
2 Tom 1/2/2020
run;
data colors;
input color $ @@;
datalines;
Blue Red Green Yellow Purple
run;
data want;
set have;
do i=1 to nobs;
set colors point=i nobs=nobs;
output;
end;
run;
This is one of those tasks where it's worth remembering the utility of using comma-separated "ranges" in a DO loop:
data have;
length id 8 name $10 date 8;
input ID name date DDMMYY10.;
format date DDMMYY10.;
datalines;
1 Sam 1/1/2020
2 Tom 1/2/2020
run;
data want;
set have ;
length color $10;
do color= 'Blue','Red','Green','Yellow','Purple';
output;
end;
run;
I said comma-separated ranges instead of comma-separated values, because constructs like the below are valid when using numeric do loop indexes:
do i=1,3,6,9 to 11,15 to 12 by -1,4,2;
...
end;
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.