- Summary of statistics for ALL Combination from a s...

01-28-2014 09:29 PM

Hi Everyone,

I have a dataset with a variable named target (taking the value 0 or 1) and n independent variables.

I want to create a summary file that reports the number of observations with target=0 and with target=1 for each combination of independent-variable values.

data have;
   input target a1 a2 a3 a4 a5 a6;
   datalines;
0 1 9 1 0 8 1
1 1 0 1 1 5 0
1 1 9 2 3 1 1
1 2 3 0 2 0 6
0 2 9 1 7 0 0
0 3 3 0 9 0 3
1 2 1 1 2 1 2
0 1 2 0 3 0 4
;
run;

Basically, the summary should cover all possible combinations of the 6 independent variables, from 1 factor up to 6 factors, such as:

If a1=1 how many observations have target=0 and how many have target=1.

If a1=1 and a2=9, how many observations ....

If a1=1 and a2=9 and a3=1 how many observations ....

...

If a1=1 and a2=9 and a3=1 and a4=0 and a5=8 and a6=1 ....

If a1=2 how many observations ...

...

...

I don't know how to do it, so any help is very much appreciated.

I am thinking about creating a condition file that summarizes the distinct values of each independent variable and then running some kind of DO loop to create the file I want.

The code for the distinct values is below, in case you need it.

Thank you for your help, as always.

HHC

proc summary data=have missing chartype;
   class a:;
   ways 1;
   output out=distinct(drop=_type_ _freq_) / levels;
run;

proc print;
run;

proc sort data=distinct;
   by _level_;
run;

data condition;
   update distinct(obs=0) distinct;
   by _level_;
run;


Posted in reply to hhchenfx

01-28-2014 10:19 PM

Is it this?

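/* The DATA, PROC and RUN statements are missing from the post as captured;
   they are restored here on the assumption that the two BY statements
   belonged to a PROC SORT and a PROC SUMMARY. */
data have;
   input target a1-a6;
   datalines;
0 1 9 1 0 8 1
1 1 0 1 1 5 0
1 1 9 2 3 1 1
1 2 3 0 2 0 6
0 2 9 1 7 0 0
0 3 3 0 9 0 3
1 2 1 1 2 1 2
0 1 2 0 3 0 4
;;;;
run;

proc sort data=have;
   by target;
run;

/* With no WAYS or NWAY, every combination of the CLASS variables is output;
   _FREQ_ holds the observation count for each combination within target. */
proc summary data=have;
   by target;
   class a:;
   output out=combo;
run;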
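
If both counts are wanted on the same row, one possible variant of the PROC SUMMARY approach is to treat target as an analysis variable instead of a BY variable. The sketch below is not part of the original reply. It assumes the HAVE data set from the first post, that target is coded strictly 0/1 with no missing values, and that a1-a6 have no missing values. The names combo_wide, n_total, n_target1 and n_target0 are made up for illustration.

```sas
/* Assumption: HAVE exists as in the first post; target is strictly 0/1. */
proc summary data=have chartype;
   class a1-a6;
   var target;
   /* SUM of a 0/1 variable is the number of observations with target=1. */
   output out=combo_wide(rename=(_freq_=n_total)) sum=n_target1;
run;

data combo_wide;
   set combo_wide;
   /* With CHARTYPE, _TYPE_='000000' is the overall row (no condition). */
   if _type_ ne '000000';
   n_target0 = n_total - n_target1;
run;

proc print data=combo_wide;
run;
```

Each remaining row is one AND combination: the class variables that take part in it carry a value, the others are missing, and n_target0 and n_target1 give the two counts.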


Posted in reply to data_null__

01-29-2014 09:07 AM

Yes, it is, Data_null.

My intention is this:

Get a subset of the data using an IF condition, work on that subset, export the result, and then go back and repeat with another condition.

So your code enables me to (1) get the frequencies and (2) get the list of all combinations, so that I can iterate over them for deeper analysis of each subsample.

Thank you.

HHC
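
The iteration described above needs each combination as a condition that can be applied to the data. A minimal sketch of one way to build those condition strings follows. It is not from the thread. It assumes HAVE from the first post with no missing values in a1-a6, and it reruns PROC SUMMARY with the CHARTYPE option so that _TYPE_ shows which variables take part in each combination. The names have_sorted, combo_ct, conditions and condition are hypothetical.

```sas
/* Assumption: HAVE exists as in the first post, no missing a1-a6. */
proc sort data=have out=have_sorted;
   by target;
run;

/* CHARTYPE makes _TYPE_ a 6-character string of 0s and 1s; position i is
   '1' when the i-th CLASS variable takes part in the combination. */
proc summary data=have_sorted chartype;
   by target;
   class a1-a6;
   output out=combo_ct;
run;

data conditions;
   set combo_ct;
   array a{6} a1-a6;
   length condition $200;
   do i = 1 to dim(a);
      /* Append "name=value" for each variable used in this combination. */
      if substr(_type_, i, 1) = '1' then
         condition = catx(' and ', condition, catx('=', vname(a{i}), a{i}));
   end;
   if not missing(condition);   /* drop the overall (no-condition) row */
   keep target _freq_ condition;
run;
```

Each row of CONDITIONS then holds text such as a1=1 and a2=9, which could be passed to a WHERE statement, for example through a macro variable, to pull out the matching subsample.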


Posted in reply to data_null__

01-29-2014 11:24 AM

Hi Data_null,

Is there a quick change to your code if I want to switch from "AND" to "OR" in the conditions below?

Thank you,

HHC

If a1=1 how many observations have target=0 and how many have target=1.

If a1=1 OR a2=9, how many observations ....

If a1=1 OR a2=9 OR a3=1 how many observations ....

...

If a1=1 OR a2=9 OR a3=1 OR a4=0 OR a5=8 OR a6=1 ....

If a1=2 how many observations ...
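
For a single OR condition, the counts can be obtained directly with a WHERE clause. PROC SUMMARY class combinations correspond to AND conditions, so this is a different technique from the one in the reply above. It is only a sketch for one condition, assuming the HAVE data set from the first post. The column name n_obs is made up for illustration.

```sas
/* Assumption: HAVE exists as in the first post. */
proc sql;
   select target,
          count(*) as n_obs
   from have
   where a1=1 or a2=9 or a3=1
   group by target;
quit;
```

On the sample data, six of the eight observations satisfy this condition: three with target=0 and three with target=1. Covering every OR combination would mean repeating the query once per condition.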