hhchenfx
Barite | Level 11

Hi Everyone,

I have a dataset with a variable named target (taking the value 0 or 1) and n independent variables.
I want to create a summary file that reports the number of observations with target=0 and the number with target=1 for each combination of independent variable values.

data have;
  input target a1 a2 a3 a4 a5 a6;
  datalines;
0 1 9 1 0 8 1
1 1 0 1 1 5 0
1 1 9 2 3 1 1
1 2 3 0 2 0 6
0 2 9 1 7 0 0
0 3 3 0 9 0 3
1 2 1 1 2 1 2
0 1 2 0 3 0 4
;
run;

Basically, the summary will include all possible combinations (from 1 factor up to 6 factors, drawn from the 6 independent variables), such as:

If a1=1          how many observations have target=0 and how many have target=1.
If a1=1 and a2=9,       how many observations ....
If a1=1 and a2=9 and a3=1      how many observations ....
...
If a1=1 and a2=9 and a3=1 and a4=0 and a5=8 and a6=1      ....

If a1=2 how many observations ...
...
...

I don't know how to do it, so any help is very much appreciated.

I am thinking about creating a condition file that summarizes the distinct values of each independent variable, and then running some kind of DO loop to create the file I want.
The code for the distinct values is below, in case you need it.

Thank you for your help as always.

HHC


proc summary data=have missing chartype;
   class a:;
   ways 1;
   output out=distinct(drop=_type_  _freq_) / levels;
   run;
proc print;
   run;

proc sort data=distinct;
   by _level_;
   run;
data condition;
   update distinct(obs=0) distinct;
   by _level_;
   run;

data_null__
Jade | Level 19

Is it this?

data have;
  input target a1-a6;
  datalines;
0 1 9 1 0 8 1
1 1 0 1 1 5 0
1 1 9 2 3 1 1
1 2 3 0 2 0 6
0 2 9 1 7 0 0
0 3 3 0 9 0 3
1 2 1 1 2 1 2
0 1 2 0 3 0 4
;;;;
run;
proc sort;
  by target;
  run;
proc print;
  run;
proc summary data=have chartype missing;
  by target;
  class a:;
  output out=combo;
  run;
proc print;
  run;
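
To pull one particular combination out of the result, it may help to filter on _TYPE_. This is only a sketch, and it assumes the COMBO data set created by the step above: with CHARTYPE and six CLASS variables, _TYPE_ should be a six-character string of 0s and 1s, where the leftmost position corresponds to a1. Under that assumption, '110000' selects the a1*a2 crossing.

```sas
/* Sketch: assumes COMBO was created by the PROC SUMMARY step above. */
/* '110000' is assumed to mean "a1 and a2 in the crossing, a3-a6 ignored". */
proc print data=combo;
  where _type_ = '110000';
  var target a1 a2 _freq_;   /* _FREQ_ is the count within each target BY group */
run;
```

Changing the string to '100000' would, under the same assumption, list the one-factor counts for a1 alone.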
hhchenfx
Barite | Level 11

Yes, it is, Data_null.

My intention is this:

Get a subset of the data using an IF condition, work on that subset, export the result, then go back and repeat with another condition.

So your code gives me (1) the frequencies and (2) the list of all combinations, so that I can iterate over them for deeper analysis of each subsample.

Thank you.

HHC

hhchenfx
Barite | Level 11

Hi Data_null,

Is there a quick change to your code if I want to switch from "AND" to "OR" in the conditions below?

Thank you,

HHC

If a1=1          how many observations have target=0 and how many have target=1.
If a1=1 OR a2=9,       how many observations ....
If a1=1 OR a2=9 OR a3=1      how many observations ....
...
If a1=1 OR a2=9 OR a3=1 OR a4=0 OR a5=8 OR a6=1      ....

If a1=2 how many observations ...
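
One possible way to get such a count, as a minimal sketch rather than a change to the PROC SUMMARY code: CLASS crossings describe AND combinations, so an OR condition is written here as an explicit WHERE filter and counted per target value. The data set HAVE is the one from this thread; the column alias n_obs and the particular three-variable condition are only illustrative.

```sas
/* Sketch: count observations per target value for one OR condition. */
/* n_obs is an illustrative alias; edit the WHERE clause for other conditions. */
proc sql;
  select target,
         count(*) as n_obs
  from have
  where a1 = 1 or a2 = 9 or a3 = 1
  group by target;
quit;
```

Each OR condition needs its own WHERE clause, so covering many conditions this way would likely mean generating the clauses in a macro loop or similar.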
