Hello,
I have a dataset that has subject IDs as one variable then a few other variables. I need to duplicate those variables and IDs so that I have 5 copies of each. I also need to create a new variable, for example a variable named Color and it has 5 options, Blue, Red, Green, Yellow, and Purple and I need one row for each subject for each color. So it would look something like:
ID Name Date Color
1 Sam 1/1/2020 Blue
1 Sam 1/1/2020 Red
1 Sam 1/1/2020 Green
1 Sam 1/1/2020 Yellow
1 Sam 1/1/2020 Purple
and then repeat for each other subject and their name and date etc.
I thought I could do this with an array but I can't figure out how.
I was thinking something like:
data new;
set date;
array colors(5) $ Blue $ Red $ Green $ Yellow $ Purple;
do i = 1 to 5;
but I don't know, maybe you can only use a list of variables in arrays and not the values for a variable?
If anyone has ideas on how to do this properly I'd really appreciate it.
Thank you!
The initial value of array statement is specified in ().
data have;
length id 8 name $10 date 8 color $10;
input ID name date DDMMYY10.;
format date DDMMYY10.;
datalines;
1 Sam 1/1/2020
2 Tom 1/2/2020
;
run;
data want(drop=i);
array acl{5} $10 _temporary_ ('Blue','Red','Green','Yellow','Purple') ;
set have;
do i=1 to 5;
color=acl{i};
output;
end;
run;
The initial value of array statement is specified in ().
data have;
length id 8 name $10 date 8 color $10;
input ID name date DDMMYY10.;
format date DDMMYY10.;
datalines;
1 Sam 1/1/2020
2 Tom 1/2/2020
;
run;
data want(drop=i);
array acl{5} $10 _temporary_ ('Blue','Red','Green','Yellow','Purple') ;
set have;
do i=1 to 5;
color=acl{i};
output;
end;
run;
You can try the following way:
data have;
length id 8 name $10 date 8;
input ID name date DDMMYY10.;
format date DDMMYY10.;
datalines;
1 Sam 1/1/2020
2 Tom 1/2/2020
run;
data colors;
input color $ @@;
datalines;
Blue Red Green Yellow Purple
run;
data want;
set have;
do i=1 to nobs;
set colors point=i nobs=nobs;
output;
end;
run;
This is one of those tasks where it's worth remembering the utility of using comma-separated "ranges" in a DO loop:
data have;
length id 8 name $10 date 8;
input ID name date DDMMYY10.;
format date DDMMYY10.;
datalines;
1 Sam 1/1/2020
2 Tom 1/2/2020
run;
data want;
set have ;
length color $10;
do color= 'Blue','Red','Green','Yellow','Purple';
output;
end;
run;
I said comma-separated ranges instead of comma-separated values, because constructs like the below are valid when using numeric do loop indexes:
do i=1,3,6,9 to 11,15 to 12 by -1,4,2;
...
end;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.