turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- All Possible Combinations Loop

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-06-2013 08:44 AM

Hi,

I am trying to build something that combines all values of all variables using SQL within SAS. For Example:

A B C D

A1 B1 C1 D1

A2 B2 C2

A3 C3

A4 C4

C5

A1

B1

C1

D1

A1 B1

A1 B2

A1 B1 C1 D1 and so on..

Basically making all possible filter combinations on variables. I have been a little bit succesful but the problem is with storage because the combinations can be so many.

I wanted your help in trying to make a SAS procedure to make all these filter combinations and also check performance of each combination on a criteria.

But more importantly I am trying to understand how to make the combinations with out killing my storage which is very limited. This is my code:

DECLARE @i INT

SET @i = 0

WHILE (@i < 15)

BEGIN

SET @i = @i + 1

EXEC ('CREATE TABLE VAR' + @i + '(VAR' + @i + ' VARCHAR(50))')

ENDINSERT INTO VAR1 (VAR1) SELECT DISTINCT VAR1 FROM [Training Data]

DECLARE @i INT

SET @i = 0

WHILE (@i < 15)

BEGIN

SET @i = @i + 1

EXEC('INSERT INTO VAR' + @i + ' (VAR' + @i + ')' + ' SELECT DISTINCT VAR' + @i + ' FROM [Training Data]')

END

CREATE TABLE ALL_COMB (VAR1 VARCHAR(50), VAR2 VARCHAR(50), VAR3 VARCHAR(50), VAR4 VARCHAR(50), VAR5 VARCHAR(50), VAR6 VARCHAR(50), VAR7 VARCHAR(50), VAR8 VARCHAR(50), VAR9 VARCHAR(50), VAR10 VARCHAR(50), VAR11 VARCHAR(50), VAR12 VARCHAR(50), VAR13 VARCHAR(50), VAR14 VARCHAR(50), VAR15 VARCHAR(50))

INSERT INTO ALL_COMB (VAR1, VAR2, VAR3, VAR4, VAR5, VAR6, VAR7, VAR8, VAR9, VAR10, VAR11, VAR12, VAR13, VAR14, VAR15) SELECT * FROM VAR1, VAR2, VAR3, VAR4, VAR5, VAR6, VAR7, VAR8, VAR9, VAR10, VAR11, VAR12, VAR13, VAR14, VAR15

ORDER BY VAR1, VAR2, VAR3, VAR4, VAR5, VAR6, VAR7, VAR8, VAR9, VAR10, VAR11, VAR12, VAR13, VAR14, VAR15;TRUNCATE TABLE VAR1

select sum(Outcome)/count(Outcome) as Perc, count(Outcome) as Lines from [Training Data] where var1 = 1

SELECT * FROM ALL_COMB

Accepted Solutions

Solution

02-07-2013
03:10 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-07-2013 03:10 PM

I'm still a bit confused. Given your latest example, you would only have 31 combinations. You have 5 values, therefore:

5!/(1!*(5-1)! + 5!/(2!*(5-2)! + 5!/(3!*(5-3)! + 5!/(42!*(5-4)! +1

There is code provided in a SAS TS Doc that you can use: TS-DOC: TS-498 - Generating Combinations and Permutations

Here is an example for obtaining the combinations based on your example. The code is also on that page for getting all permutations.

data test;

infile cards missover;

input (A B)(:$2.);

cards;

A1 B1

A2 B2

A3 .

;;;;

run;

proc transpose data=test

out=need (keep=col1);

by a b;

var a b;

run;

proc sql noprint;

select col1

into :vars separated by ","

from need

;

quit;

%macro combo(r)/parmbuff;

%let i=2;

%let things=;

%do %while (%Qscan(&syspbuff,&i,%STR(,%))) ne );

%let p&i="%Qscan(&syspbuff,&i,%STR(,%)))";

%if &i=2 %then %let things=&&p&i;

%else %let things=&things,&&p&i;

%let i=%eval(&i+1);

%end;

%let n=%eval(&i-2);

data combo;

keep v1-v&r;

array word $8 w1-w&n (&things);

array rr (*) r1-r&r;

array v $8 v1-v&r;

%do i=1 %to &r; /* create the DO LOOPs */

%if &i=1 %then %do;

do r&i=1 to &n-(&r-&i);

%end;

%else %do;

do r&i=r%eval(&i-1)+1 to &n-(&r-&i);

%end;

%end;

do k=1 to &r; /* select subscripted items */

v(k)=word (rr(k)); /* for a SAS data set */

put v(k) ' ' @; /* for log */

end;

put; /* writes to log */

output; /* writes to a SAS data set */

%do i=1 %to &r;

end; /* create ENDs for the DO LOOPs */

%end;

put;

run;

%mend combo;

%combo(5,&vars.)

data want;

set combo;

run;

%combo(4,&vars.)

proc append base=want data=combo;

run;

%combo(3,&vars.)

proc append base=want data=combo;

run;

%combo(2,&vars.)

proc append base=want data=combo;

run;

%combo(1,&vars.)

proc append base=want data=combo;

run;

Note: the link I provided to the combinations and permutations macros apparently is no longer on support.sas.com. However, you can still find them at: http://www.urz.uni-heidelberg.de/statistik/sas/doc/ts498-combperm.txt

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-06-2013 04:15 PM

akberali67,

You can thank data _null_; by marking their post as the answer. That's always appreciated!

Regards,

Huey

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-07-2013 02:32 AM

:smileyconfused: @Huey - You need a browser update.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-07-2013 12:57 PM

Actually, I figured what the code is doing and it is wrong. It simply retains the blank cells with the observations above which are non-blank but more importantly only makes combinations of same rows but does not combine multiple cells.

Any other solution would be helpful.

I am trying to build something that combines all values of all variables for filter combinations I can apply. For Example:

A B C D

A1 B1 C1 D1

A2 B2 C2

A3 C3

A4 C4

C5

A1

B1

C1

D1

A1 B1

A1 B2

........

A1 B1 C1 D1 and so on

Basically all permutations and combinations possible to make filters. I know the combinations will be in Zillions and am trying to find a way to limit them also. Thanks in advance

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-07-2013 01:20 PM

: I'd help but I'm not sure what exactly you are looking for. As for Huey's comment, you only marked data_null's suggestion as "helpful", not as being "correct".

If you are only concerned about the ordering of the records in the file resulting from data_null's suggestion, you can always sort it. e.g.:

data comb;

set comb;

count=count(_type_,'1');

run;

proc sort data=comb;

by count descending _type_ ;

run;

You said all combinations and permutations. Do you mean that you would want to have "A1 B1" and "B1 A1" as two separate records in your results (i.e., where order IS relevant)?.

As for how you can use the results, you can always use a datastep to create an include file that specifies which combinations/permutations to use for any particular analysis or set of analyses.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-07-2013 01:32 PM

Hi,

I did mark it as correct, unmarked it after I figured it was wrong a few minutes ago.

High level of what I am trying to do is find all filter combinations possible to filter a dataset.

Like an IF condition that substitutes a distinct value of each variable.

So:

A B C D

A1 B1 C1 D1

A2 B2 C2

A3 C3

A4 C4

C5

Trying to get 4! * 2! * 5! * 1! = 5,760 possible combinations but I am not trying to get the number, I am trying to get the unique combinations to use as filters, one by one. With the complexity of my dataset, it will run into zillions of combinations which I want to test one by one based on a condition before deciding to store it or delete it. I hope I have managed to explain what I am trying to do. Thanks again.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-07-2013 02:19 PM

: What I am trying to do is to filter a dataset on all possible combinations.

My code does create combinations in a separate dataset but I am unable to completely run it because of the size which will run up to 800 TB and I dont have that space.

Is there a way to slow SAS/SQL down and generate these combination one by one with a condition that checks performance of each combination against a criteria specified?

So, that I can only store a combination that peforms well against my criteria and delete the ones that dont.

Thanks in advance for the help!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-07-2013 02:46 PM

akberali67,

Sorry about the correct answer thing. I'm not sure what went on there.

In any case, how are you checking the performance of a given filter set? What are the criteria?

I ask because it will have to be a very quick check to be able to run against 800TB worth of filter sets.

Thanks,

Huey

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-07-2013 02:56 PM

I have more variables that are not part of the combination variables and its a summary that measures performance against a known standard but the point is to not create that 800 TB, I already know how to do that. I dont have 800 TB of space, so need something that will measure performance and decide whether to store it or delete it.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-07-2013 03:05 PM

If you could generate this type of loop, would it be helpful?

data want;

set have;

do a='A1', 'A2', 'A3', 'A4';

do b='B1', 'B2';

* Test against filtering conditions;

end;

end;

run;

Of course, you would have to expand it to include the other variables, and to automate the process. But lots of people here could help with that.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-07-2013 03:35 PM

Not really, what this will do again is that it wont try every possible combination. I am starting to think this loop is not possible without eating up all the storage.

The sample dataset I showed you can make 5,760 combinations, I just dont know how to make those combinations without killing the storage space.

Thanks for trying to help.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-07-2013 07:36 PM

akberali67,

OK, one more try then. Would this be a viable approach (again, it would need to be expanded to more variables and automated):

data want;

set have;

do _a1='A1', ' ';

do _a2='A2, ' ';

do _a3='A3', ' ';

do _a4='A4', ' ';

do _a5='A5', ' ';

do _b1='B1', ' ';

do _b2='B2', ' ';

* testing and filtering;

end;end;end;end;end;end;end;

run;

It gets all combinations (including all blanks) but not all permutations. Would that do the trick?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-07-2013 11:45 PM

Hi,

I am not sure what I am doing wrong but when I run this code:

data have;

infile cards missover;

input (A B)(:$2.);

cards;

A1 B1

A2 B2

A3 .

A4 .

A5 .

;;;;

run;

data want;

set have;

do _a1='A1', ' ';

do _a2='A2', ' ';

do _a3='A3', ' ';

do _a4='A4', ' ';

do _a5='A5', ' ';

do _b1='B1', ' ';

do _b2='B2', ' ';

* testing and filtering;

end;end;end;end;end;end;end;

run;

I only get the same data with additional blank variables _a1 to _b2.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-08-2013 09:55 AM

That's correct. You will end up with all blank variables using my code. However, you will have looped through every combination along the way, without having to store them. So the idea is to test/filter at the innermost portion of the loop since each time the innermost loop is reached that is a unique combination.

The issue of combinations vs. permutations still exists.

Good luck.

Solution

02-07-2013
03:10 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

02-07-2013 03:10 PM

I'm still a bit confused. Given your latest example, you would only have 31 combinations. You have 5 values, therefore:

5!/(1!*(5-1)! + 5!/(2!*(5-2)! + 5!/(3!*(5-3)! + 5!/(42!*(5-4)! +1

There is code provided in a SAS TS Doc that you can use: TS-DOC: TS-498 - Generating Combinations and Permutations

Here is an example for obtaining the combinations based on your example. The code is also on that page for getting all permutations.

data test;

infile cards missover;

input (A B)(:$2.);

cards;

A1 B1

A2 B2

A3 .

;;;;

run;

proc transpose data=test

out=need (keep=col1);

by a b;

var a b;

run;

proc sql noprint;

select col1

into :vars separated by ","

from need

;

quit;

%macro combo(r)/parmbuff;

%let i=2;

%let things=;

%do %while (%Qscan(&syspbuff,&i,%STR(,%))) ne );

%let p&i="%Qscan(&syspbuff,&i,%STR(,%)))";

%if &i=2 %then %let things=&&p&i;

%else %let things=&things,&&p&i;

%let i=%eval(&i+1);

%end;

%let n=%eval(&i-2);

data combo;

keep v1-v&r;

array word $8 w1-w&n (&things);

array rr (*) r1-r&r;

array v $8 v1-v&r;

%do i=1 %to &r; /* create the DO LOOPs */

%if &i=1 %then %do;

do r&i=1 to &n-(&r-&i);

%end;

%else %do;

do r&i=r%eval(&i-1)+1 to &n-(&r-&i);

%end;

%end;

do k=1 to &r; /* select subscripted items */

v(k)=word (rr(k)); /* for a SAS data set */

put v(k) ' ' @; /* for log */

end;

put; /* writes to log */

output; /* writes to a SAS data set */

%do i=1 %to &r;

end; /* create ENDs for the DO LOOPs */

%end;

put;

run;

%mend combo;

%combo(5,&vars.)

data want;

set combo;

run;

%combo(4,&vars.)

proc append base=want data=combo;

run;

%combo(3,&vars.)

proc append base=want data=combo;

run;

%combo(2,&vars.)

proc append base=want data=combo;

run;

%combo(1,&vars.)

proc append base=want data=combo;

run;

Note: the link I provided to the combinations and permutations macros apparently is no longer on support.sas.com. However, you can still find them at: http://www.urz.uni-heidelberg.de/statistik/sas/doc/ts498-combperm.txt