- Generate combinations of categorical variables wit...

04-23-2015 03:19 PM

I have a dataset with five categorical variables.

V1 has 3 levels: CO, KY, NY
V2 has 2 levels: A, F
V3 has 2 levels: H, P
V4 has 2 levels: C, P
V5 has 4 levels: A, B, C, D

The data below show all the combinations we observed. Based on this dataset, how can I find all valid combinations of different levels of these variables?

What I am looking for is combinations like [CO]..


How should I write my program to list all valid combinations? Any suggestions are highly appreciated. Thanks!

V1 V2 V3 V4 V5
CO A  P  C  D
CO F  H  C  D
CO F  P  C  D
KY A  P  C  D
KY F  H  C  D
KY F  P  C  D
NY A  P  C  D
NY A  P  P  A
NY A  P  P  B
NY A  P  P  C
NY A  P  P  D
NY F  H  C  D
NY F  H  P  A
NY F  H  P  B
NY F  H  P  C
NY F  H  P  D
NY F  P  C  D
NY F  P  P  A
NY F  P  P  B
NY F  P  P  C
NY F  P  P  D


Posted in reply to HuiZ

04-23-2015 04:54 PM

I would start with something like:

proc freq data=have;
  tables v1*v2*v3*v4*v5 / list nocum nopercent;
run;

If you need an output data set, add OUT=want to the options on the TABLES statement.
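As a rough sketch of that suggestion, the step below reads the combinations listed in the original post into a data set and adds OUT= to the TABLES options. The data set names have and want follow the names used above; the lowercase variable names v1-v5 are an assumption about how the columns are named in the real data.

```sas
/* Observed combinations, copied from the original post */
data have;
  input v1 $ v2 $ v3 $ v4 $ v5 $;
  datalines;
CO A P C D
CO F H C D
CO F P C D
KY A P C D
KY F H C D
KY F P C D
NY A P C D
NY A P P A
NY A P P B
NY A P P C
NY A P P D
NY F H C D
NY F H P A
NY F H P B
NY F H P C
NY F H P D
NY F P C D
NY F P P A
NY F P P B
NY F P P C
NY F P P D
;
run;

/* LIST prints one line per observed combination;       */
/* OUT=want also writes those combinations, with their  */
/* frequency in the variable COUNT, to a data set.      */
proc freq data=have;
  tables v1*v2*v3*v4*v5 / list nocum nopercent out=want;
run;
```

Because every row listed in the post is distinct, want should contain one observation per listed row.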


Posted in reply to ballardw

04-23-2015 05:06 PM

Thanks for your reply! Actually, the dataset I listed is itself the output of PROC FREQ on a larger dataset. But that only gives me the combinations with one level from each variable. I'm looking for a way to take multiple levels from each variable. For example, V2 can take the form A, F, or (A, F). V1 can take CO, KY, NY, (CO, KY), (CO, NY), (KY, NY), or (CO, KY, NY). I have actually figured out a way to list all POSSIBLE combinations like this, but the part where I got stuck is how to identify the valid combinations out of all possible combinations. As I mentioned, [KY]..
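The listing of possible level groups described above (each variable taking any non-empty subset of its levels) could be sketched with a bit mask in a DATA step. This is only an illustration for V1, not the poster's actual program: the data set name v1_subsets and the variables mask and subset are made up for the example, and it does not attempt to decide which combinations are valid.

```sas
/* Enumerate every non-empty subset of the V1 levels.           */
/* Bit i of MASK (counting from 1) says whether level i is in.  */
data v1_subsets;
  array lev[3] $2 _temporary_ ('CO' 'KY' 'NY');
  length subset $20;
  do mask = 1 to 2**dim(lev) - 1;      /* 1..7 for three levels */
    subset = '';
    do i = 1 to dim(lev);
      /* BAND is the bitwise AND function */
      if band(mask, 2**(i-1)) then subset = catx(',', subset, lev[i]);
    end;
    output;
  end;
  keep mask subset;
run;

proc print data=v1_subsets noobs;
run;
```

For three levels this produces seven rows, matching the seven forms of V1 listed above. The same pattern applies to the other variables by changing the array contents and size.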



Posted in reply to HuiZ

04-23-2015 06:18 PM

It looks like you have to explain what [CO.KY]..


Maybe you should provide a small example dataset with 3 variables and then show what result you are expecting.