BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
LFern
Obsidian | Level 7

Starting data

data have;
input id $ quarter $ col1 $ col2 $ col3 $ col4 $;
datalines;
LF first scooter car bus foot
LF second bike bus car .
;

 

------------------------------------------------------------------------------

I want to concatenate all permutations of each id/quarter columns. For example, the final result should be: 

 

data want;
length id $4 quarter $10 col1 $10 col2 $10 col3 $10 col4 $10 want1 $25 want2 $25 want3 $25 want4 $25 want5 $25 want6 $25 want7 $25 want8 $25 want9 $25 want10 $25 want11 $25;
input id $ quarter $ col1 $ col2 $ col3 $ col4 $ want1 $ want2 $ want3 $ want4 want5 $ want6 $ want7 $ want8 $ want9 $ want10 $ want11 $;
datalines;
LF first scooter car bus foot scooter/car scooter/bus scooter/foot scooter/car/bus scooter/car/foot scooter/bus/foot scooter/car/bus/foot car/bus car/foot car/bus/foot bus/foot
LF second bike bus car . bike/bus bike/car bike/bus/car bus/car . . . . . . .
; 

I have enough grasp of SAS to be able to capture some permutations such as concatenating col1+col2; col1+col3... col2+col3...and finally col1-col4. But my grasp isn't enough to be able to capture all possible permutations (such as col1+col2+col4).

 

Any help is appreciated!

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

You want combinations, not permutations. Here is how to get them, while eliminating those with missing values:

 

data have;
input id $ quarter $ a $ b $ c $ d $;
datalines;
LF first scooter car bus foot
LF second bike bus car .
;

data want;
set have;
array _a a--d;
do i = 1 to dim(_a);
    do j = 1 to comb(dim(_a), i);
        call allcomb(j, i, of _a{*});
        length comb $64;
        call missing (comb);
        do k = 1 to i;
            if missing(_a{k}) then leave;
            comb = catx("/", comb, _a{k});
            end;
        if k > i then output;
        end;
    end;
keep id quarter comb;
run;

proc print data=want noobs; run;
id 	quarter 	comb
LF 	first 	scooter
LF 	first 	foot
LF 	first 	bus
LF 	first 	car
LF 	first 	car/bus
LF 	first 	car/scooter
LF 	first 	car/foot
LF 	first 	bus/foot
LF 	first 	bus/scooter
LF 	first 	foot/scooter
LF 	first 	foot/scooter/bus
LF 	first 	foot/scooter/car
LF 	first 	foot/bus/car
LF 	first 	scooter/bus/car
LF 	first 	scooter/bus/car/foot
LF 	second 	bike
LF 	second 	car
LF 	second 	bus
LF 	second 	bus/car
LF 	second 	bus/bike
LF 	second 	car/bike
LF 	second 	bike/car/bus
PG

View solution in original post

6 REPLIES 6
ballardw
Super User

Is this a programming exercise or is the output actually going to be used for something? If it is to be used it might be a good idea to describe what the intended use actually would be.

 

For entertainment you can look and the documentation for ALLPERM and CALL ALLPERM functions. And you may want to be careful of wanting Permutation or Combinations (ALLCOMB, CALL ALLCOMB and CALL ALLCOMBI). Permutations consider order as well, so if you have A and B you get A,B and B,A as results.

Or the the Lexcomb versions because you apparently will have missing values for some.

LFern
Obsidian | Level 7

It's not an exercise. Once I know all the possible combinations of transportation a person could have in a quarter, I'll need to transform it into long-format and merge on another dataset. I'll definitely look into some of the functions you mentioned. Thank you.

PaigeMiller
Diamond | Level 26

@LFern wrote:

It's not an exercise. Once I know all the possible combinations of transportation a person could have in a quarter, I'll need to transform it into long-format and merge on another dataset. I'll definitely look into some of the functions you mentioned. Thank you.


At least in my mind, what @ballardw was talking about was what ANALYSIS will you be doing when you complete this creation of permutations. Saying you're going to merge the permutations with another data set really isn't what we were looking for. The reason I make this point is because the ANALYSIS you will be doing may (or may not) be easier with the data arranged in some other fashion.

--
Paige Miller
PGStats
Opal | Level 21

You want combinations, not permutations. Here is how to get them, while eliminating those with missing values:

 

data have;
input id $ quarter $ a $ b $ c $ d $;
datalines;
LF first scooter car bus foot
LF second bike bus car .
;

data want;
set have;
array _a a--d;
do i = 1 to dim(_a);
    do j = 1 to comb(dim(_a), i);
        call allcomb(j, i, of _a{*});
        length comb $64;
        call missing (comb);
        do k = 1 to i;
            if missing(_a{k}) then leave;
            comb = catx("/", comb, _a{k});
            end;
        if k > i then output;
        end;
    end;
keep id quarter comb;
run;

proc print data=want noobs; run;
id 	quarter 	comb
LF 	first 	scooter
LF 	first 	foot
LF 	first 	bus
LF 	first 	car
LF 	first 	car/bus
LF 	first 	car/scooter
LF 	first 	car/foot
LF 	first 	bus/foot
LF 	first 	bus/scooter
LF 	first 	foot/scooter
LF 	first 	foot/scooter/bus
LF 	first 	foot/scooter/car
LF 	first 	foot/bus/car
LF 	first 	scooter/bus/car
LF 	first 	scooter/bus/car/foot
LF 	second 	bike
LF 	second 	car
LF 	second 	bus
LF 	second 	bus/car
LF 	second 	bus/bike
LF 	second 	car/bike
LF 	second 	bike/car/bus
PG
LFern
Obsidian | Level 7

Hello, thank you so much! This perfectly executes what I wanted. However, I'm new to arrays and would like to understand your code more. Why did you select "99" here? I changed it to a couple of other numbers but didn't notice a change in the final output.

array _b{99} $8 _temporary_;

Thank you again.

PGStats
Opal | Level 21

I took out that statement. It defined a temporary array that wasn't needed in the end. You would use it if you wanted to sort the list elements before concatenating them.

 

About the statement

 

array _b{99} $8 _temporary_;

 

The value 99 is meant to be greater than dim(_a), because array _b was intended to hold a copy of array _a. Ideally I would like to write

 

array _b{dim(_a)} $8 _temporary_;

 

but unfortunately SAS only allows constant numbers and "*" as array dimensions.

PG

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 629 views
  • 2 likes
  • 4 in conversation