Ronein
Onyx | Level 15
Hello
Let's say I want to create a data set with all possible combinations of 9 categorical variables. Here are the possible values for each variable:
X1: 1, 2, 3
X2: 1, 2
X3: 1, 2
X4: 1, 2, 3
X5: 1, 2
X6: 1, 2, 3
X7: 1, 2, 3, 4
X8: 1, 2
X9: 1, 2, 3
So there are 5184 possible combinations.
(3×2×2×3×2×3×4×2×3=5184)
How can I create the desired data set with all possible combinations? (The data set will have 9 columns: x1-x9.)
Could you please show a solution using:
Way 1: Cartesian product
Way 2: DO loop
Or any other nice way
Cheers
Accepted Solutions
ballardw
Super User

If your actual values are sequential integers (you did not say; what you show might be character values, since "categorical" describes a usage, not a variable type), nested DO loops will do it:

data want;
   /* one DO loop per variable; x9 varies fastest */
   do x1 = 1 to 3;
   do x2 = 1 to 2;
   do x3 = 1 to 2;
   do x4 = 1 to 3;
   do x5 = 1 to 2;
   do x6 = 1 to 3;
   do x7 = 1 to 4;
   do x8 = 1 to 2;
   do x9 = 1 to 3;
      output;   /* one row per combination */
   end;
   end;
   end;
   end;
   end;
   end;
   end;
   end;
   end;
run;

If the values for a given variable are not actually sequential, use a list of values in the DO statement instead, for example do x = 2, 5, 27; or whatever your values are. The same works for character values.
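As a small sketch of the value-list form: the values 2, 5, 27 come from the example above, while the character levels 'low' and 'high' and the data set name want_list are made up for illustration. The LENGTH statement is included because, as far as I know, a character DO variable otherwise takes its length from the first value in the list, which would truncate 'high' to 'hig'.

```sas
data want_list;
   length x2 $4;                 /* long enough for the longest level */
   do x1 = 2, 5, 27;             /* non-sequential numeric values */
      do x2 = 'low', 'high';     /* character values (illustrative) */
         output;                 /* one row per (x1, x2) pair */
      end;
   end;
run;
```

The same pattern extends to all nine variables by swapping any "1 to n" range for a comma-separated list.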

Note I didn't bother to indent the code for each level of the DO because that many nested indents can get rendered pretty ugly with tabs set at anything more than 2 or 3 spaces.

With a different number of values for each variable, doing this with any of the combinatoric functions such as ALLCOMB would take extremely ugly code.

5 REPLIES
yabwon
Amethyst | Level 16

Or maybe you could try to write something yourself, and then, in case of problems, ask? 😉

 

Bart


coder1234
Obsidian | Level 7

Are you trying to analyze something, or do you just want to create data?

Knowing that will help determine the best route: a Cartesian join, a cross-tabulation, or a pivot table?

StatDave
SAS Super FREQ

Or more simply:

proc plan; 
factors x1=3 ordered x2=2 ordered x3=2 ordered x4=3 ordered x5=2 ordered 
  x6=3 ordered x7=4 ordered x8=2 ordered x9=3 ordered / noprint;
output out=plan;
run;

Remove the ORDERED keywords if you want the levels of each factor selected in random order.
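A quick sanity check, offered only as a sketch: counting the rows and the distinct combinations in the PLAN data set created above should give 5184 for both if every combination appears exactly once. The column aliases n_rows and n_combos are illustrative names.

```sas
proc sql;
   /* total number of rows written by PROC PLAN */
   select count(*) as n_rows from plan;

   /* number of distinct x1-x9 combinations among those rows */
   select count(*) as n_combos
   from (select distinct x1, x2, x3, x4, x5, x6, x7, x8, x9 from plan);
quit;
```

The same check can be pointed at any of the other result data sets in this thread by changing the table name.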

Ksharp
Super User

Your PROC PLAN is just creating a Cartesian product, just like the following PROC SQL.

/* HAVE is assumed to already exist and to contain x1-x9 */
proc sql;
create table want as
select * from
have(keep=x1),
have(keep=x2),
have(keep=x3),
have(keep=x4),
have(keep=x5),
have(keep=x6),
have(keep=x7),
have(keep=x8),
have(keep=x9);
quit;
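For readers without an existing HAVE data set, here is a minimal self-contained sketch of the same Cartesian-product idea, cut down to the first three variables to keep it short. The level tables lev1-lev3 and the output name want3 are hypothetical names, not part of the original post.

```sas
/* Hypothetical level tables: one small table per variable */
data lev1; do x1 = 1 to 3; output; end; run;
data lev2; do x2 = 1 to 2; output; end; run;
data lev3; do x3 = 1 to 2; output; end; run;

proc sql;
   create table want3 as
   select *
   from lev1, lev2, lev3      /* no WHERE clause: every row pairs with every row */
   order by x1, x2, x3;
quit;
```

This gives 3 x 2 x 2 = 12 rows, and the log should carry a note that the query involves a Cartesian product, which is expected here. If the levels have to come from an existing data set with repeated values, an in-line view such as (select distinct x1 from have) in place of lev1 is one option.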

