Starting data
data have;
input id $ quarter $ col1 $ col2 $ col3 $ col4 $;
datalines;
LF first scooter car bus foot
LF second bike bus car .
;
------------------------------------------------------------------------------
I want to concatenate all permutations of each id/quarter columns. For example, the final result should be:
data want;
length id $4 quarter $10 col1 $10 col2 $10 col3 $10 col4 $10 want1 $25 want2 $25 want3 $25 want4 $25 want5 $25 want6 $25 want7 $25 want8 $25 want9 $25 want10 $25 want11 $25;
input id $ quarter $ col1 $ col2 $ col3 $ col4 $ want1 $ want2 $ want3 $ want4 want5 $ want6 $ want7 $ want8 $ want9 $ want10 $ want11 $;
datalines;
LF first scooter car bus foot scooter/car scooter/bus scooter/foot scooter/car/bus scooter/car/foot scooter/bus/foot scooter/car/bus/foot car/bus car/foot car/bus/foot bus/foot
LF second bike bus car . bike/bus bike/car bike/bus/car bus/car . . . . . . .
;
I have enough grasp of SAS to be able to capture some permutations such as concatenating col1+col2; col1+col3... col2+col3...and finally col1-col4. But my grasp isn't enough to be able to capture all possible permutations (such as col1+col2+col4).
Any help is appreciated!
You want combinations, not permutations. Here is how to get them, while eliminating those with missing values:
data have;
input id $ quarter $ a $ b $ c $ d $;
datalines;
LF first scooter car bus foot
LF second bike bus car .
;
data want;
set have;
array _a a--d;
do i = 1 to dim(_a);
do j = 1 to comb(dim(_a), i);
call allcomb(j, i, of _a{*});
length comb $64;
call missing (comb);
do k = 1 to i;
if missing(_a{k}) then leave;
comb = catx("/", comb, _a{k});
end;
if k > i then output;
end;
end;
keep id quarter comb;
run;
proc print data=want noobs; run;
id quarter comb LF first scooter LF first foot LF first bus LF first car LF first car/bus LF first car/scooter LF first car/foot LF first bus/foot LF first bus/scooter LF first foot/scooter LF first foot/scooter/bus LF first foot/scooter/car LF first foot/bus/car LF first scooter/bus/car LF first scooter/bus/car/foot LF second bike LF second car LF second bus LF second bus/car LF second bus/bike LF second car/bike LF second bike/car/bus
Is this a programming exercise or is the output actually going to be used for something? If it is to be used it might be a good idea to describe what the intended use actually would be.
For entertainment you can look and the documentation for ALLPERM and CALL ALLPERM functions. And you may want to be careful of wanting Permutation or Combinations (ALLCOMB, CALL ALLCOMB and CALL ALLCOMBI). Permutations consider order as well, so if you have A and B you get A,B and B,A as results.
Or the the Lexcomb versions because you apparently will have missing values for some.
It's not an exercise. Once I know all the possible combinations of transportation a person could have in a quarter, I'll need to transform it into long-format and merge on another dataset. I'll definitely look into some of the functions you mentioned. Thank you.
@LFern wrote:
It's not an exercise. Once I know all the possible combinations of transportation a person could have in a quarter, I'll need to transform it into long-format and merge on another dataset. I'll definitely look into some of the functions you mentioned. Thank you.
At least in my mind, what @ballardw was talking about was what ANALYSIS will you be doing when you complete this creation of permutations. Saying you're going to merge the permutations with another data set really isn't what we were looking for. The reason I make this point is because the ANALYSIS you will be doing may (or may not) be easier with the data arranged in some other fashion.
You want combinations, not permutations. Here is how to get them, while eliminating those with missing values:
data have;
input id $ quarter $ a $ b $ c $ d $;
datalines;
LF first scooter car bus foot
LF second bike bus car .
;
data want;
set have;
array _a a--d;
do i = 1 to dim(_a);
do j = 1 to comb(dim(_a), i);
call allcomb(j, i, of _a{*});
length comb $64;
call missing (comb);
do k = 1 to i;
if missing(_a{k}) then leave;
comb = catx("/", comb, _a{k});
end;
if k > i then output;
end;
end;
keep id quarter comb;
run;
proc print data=want noobs; run;
id quarter comb LF first scooter LF first foot LF first bus LF first car LF first car/bus LF first car/scooter LF first car/foot LF first bus/foot LF first bus/scooter LF first foot/scooter LF first foot/scooter/bus LF first foot/scooter/car LF first foot/bus/car LF first scooter/bus/car LF first scooter/bus/car/foot LF second bike LF second car LF second bus LF second bus/car LF second bus/bike LF second car/bike LF second bike/car/bus
Hello, thank you so much! This perfectly executes what I wanted. However, I'm new to arrays and would like to understand your code more. Why did you select "99" here? I changed it to a couple of other numbers but didn't notice a change in the final output.
array _b{99} $8 _temporary_;
Thank you again.
I took out that statement. It defined a temporary array that wasn't needed in the end. You would use it if you wanted to sort the list elements before concatenating them.
About the statement
array _b{99} $8 _temporary_;
The value 99 is meant to be greater than dim(_a), because array _b was intended to hold a copy of array _a. Ideally I would like to write
array _b{dim(_a)} $8 _temporary_;
but unfortunately SAS only allows constant numbers and "*" as array dimensions.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.