## Create summary for each combination of a set of variable

Solved
Super Contributor
Posts: 506

# Create summary for each combination of a set of variable

Hello Everyone,

I have the following problem and would like to seek your help.

My file has variable Target (taking value 0 or 1) and 2 sets of condition (each take value of 0 or 1): A1, A2 ... An AND B1, B2 ... Bn.

I want to create a summary of number of taget=0 and target=1 for each combination of the 2 set of condition

say A1B1 (00, 01, 10, 11), A1B2...

I can write the simple macro to deal with them 1 by 1 as below. But if the number of condition getting large, just this macro is not helpful.

So I guess what I look for is a kind of "do for A=A1 to An" and another "do for B=B1 to Bn".

Any help is very much appreciated.

Thank you,

HHC

data have;

input target a1 a2 a3 b1 b2 b3;

datalines;

0 1 1 1 1 1 1

1 0 0 1 1 0 0

1 1 0 0 0 1 1

1 0 0 0 1 0 1

0 1 1 1 1 0 0

0 0 0 0 1 0 0

1 0 1 1 0 1 0

0 1 0 0 0 0 0

;run;

proc sort data=have;

by a1 b1 target ;run;

proc means noprint;

by a1 b1 target;

output out=summary_final (drop = _type_ _freq_)

n=n_a1b1; run;

data summary_final; set summary_final;

n=_n_; run; *this variable will be used to merge with the macro output below;

%macro analyze (X=, Y=);

proc sort data=have;

by &X &Y target ;run;

proc means noprint;

by &X &Y target;

output out=summary_macro (drop = _type_ _freq_)

n=n_&X&Y; run;

data summary_macro; set summary_macro;

n=_n_; run; *this variable will be used to merge with the macro output below;

proc sql;

create table summary_final as select a.*, b.n_&X&Y

from summary_final as a left join summary_macro as b

on a.n=b.n;quit;

%mend;

%analyze (X=a1,Y=b2);

%analyze (X=a1,Y=b3);

.....

Accepted Solutions
Solution
‎06-25-2013 11:45 PM
Super User
Posts: 10,778

## Re: Create summary for each combination of a set of variable

```data have;
input target a1-a3 b1-b3;
datalines;
0 1 1 1 1 1 1
1 0 0 1 1 0 0
1 1 0 0 0 1 1
1 0 0 0 1 0 1
0 1 1 1 1 0 0
0 0 0 0 1 0 0
1 0 1 1 0 1 0
0 1 0 0 0 0 0
;;;;
run;
data temp(keep=_a_: _b_: target);
set have;
length _a_name _b_name \$ 20;
array _a{*} a:;
array _b{*} b:;
do i=1 to dim(_a);
do j=1 to dim(_b);
_a_name=vname(_a{i}); _b_name=vname(_b{j});
_a_value=_a{i}; _b_value=_b{j};
output;
end;
end;
run;
proc sql;
create table want as
select _a_name,_b_name,_a_value,_b_value,sum(target=1)/sum(target=0) as per
from temp
group by  _a_name,_b_name,_a_value,_b_value;
quit;

```

Ksharp

All Replies
Posts: 3,852

## Re: Create summary for each combination of a set of variable

CLASS and TYPES.

data have;
input target a1-a3 b1-b3;
datalines;
0 1 1 1 1 1 1
1 0 0 1 1 0 0
1 1 0 0 0 1 1
1 0 0 0 1 0 1
0 1 1 1 1 0 0
0 0 0 0 1 0 0
1 0 1 1 0 1 0
0 1 0 0 0 0 0
;;;;
run;
proc summary data=have chartype;

class a: b: target;
types (a*(b*target;
output  out=sumtest;
run;
Super Contributor
Posts: 506

## Re: Create summary for each combination of a set of variable

Wow!!! thank you, Data_null_.

I am really amazed to see your code.

I still want to do a number of other step for each combination and for illustration purpose, I skip some of them (if I had known of the Class and Type, I would have include those steps).

So I wonder if you guys could have other solution so that I can have more flexibility in term of data manipulating.

Thank you.

HHC

Super User
Posts: 13,523

## Re: Create summary for each combination of a set of variable

The proc summary output is pretty flexible for selecting the combination types you are looking for with by using the _TYPE_ automatic variable.

Super User
Posts: 23,700

## Re: Create summary for each combination of a set of variable

You can also look into the classdata= option, where you can specify the specific combinations that you're looking for, rather than all.

Super Contributor
Posts: 506

## Re: Create summary for each combination of a set of variable

Hi Data_null_,

If the list of condition is only A1 A2 A3... An and I want to create the summary of Target for each pair say A1A2; A1A3 ....

This code below can be used but it (a b c target) the same.

Is there any better option to do it?

Thank you,

HHC

proc summary data=have2 chartype;

class a b c: target;

ways 3;

output  out=sumtest5;

run;

Super User
Posts: 10,778

## Re: Create summary for each combination of a set of variable

What output do you like ?

```data have;
input target a1-a3 b1-b3;
datalines;
0 1 1 1 1 1 1
1 0 0 1 1 0 0
1 1 0 0 0 1 1
1 0 0 0 1 0 1
0 1 1 1 1 0 0
0 0 0 0 1 0 0
1 0 1 1 0 1 0
0 1 0 0 0 0 0
;;;;
run;
data temp(keep=_a_: _b_: target);
set have;
length _a_name _b_name \$ 20;
array _a{*} a:;
array _b{*} b:;
do i=1 to dim(_a);
do j=1 to dim(_b);
_a_name=vname(_a{i}); _b_name=vname(_b{j});
_a_value=_a{i}; _b_value=_b{j};
output;
end;
end;
run;

proc freq data=temp noprint;
table _a_name*_b_name*_a_value*_b_value*target/list out=want;
run;

```

Ksharp

Message was edited by: xia keshan

Super Contributor
Posts: 506

## Re: Create summary for each combination of a set of variable

Thank you for your different approaches.

At first, I was looking for the code by Ksharp. Now since reading Data_null_ and other, I find it more convenient to use Proc Summary.

The ultimate goal of course is to calculate the ratio of target=1/target=0 .  I use the below lag code to calculate this ratio with the condition of the same type.

data sumtest; set sumtest;

drop lf lt ltype;

lf=lag(freq);

lt=lag(target);

ltype=lag(type);

if target=1 and lt=0 and type=ltype then ratio=freq/lf; run;

Because there is no guarantee that all combinations of condition has a value, ONE potential problem is that: I might take the (number of target=1) for 1 condition and divide by (number of target=0) for another condition.

it is the situation (with the below data) for a=0, c=0  AND a=0,c=1.

So I wonder if there is any solution to that.

HHC

data
have;

input
target a b c d e f ;

datalines;

0 1 1 1 1 1 1

1 0 0 1 1 0 0

1 1 0 0 0 1 1

0 1 1 1 1 0 0

0 0 0 0 1 0 0

1 0 1 1 0 1 0

0 1 0 0 0 0 0

1 1 1 1 1 1 1

;;;;

run;

proc summary data=have chartype;

class a b c target;

ways 3;

output  out=sumtest;

run;

Super User
Posts: 10,778

## Re: Create summary for each combination of a set of variable

Post the final output you need and explain your logic .

Super Contributor
Posts: 506

## Re: Create summary for each combination of a set of variable

 a b c target _TYPE_ _FREQ_ ratio 0 0 0 0111 2 0 0 1 0111 1 0.5 0 1 1 0111 1 1 1 0 0111 2 1 1 1 0111 2 1 0 0 0 1011 1 0 1 1 1011 2 2 1 0 0 1011 1 1 0 1 1011 1 1 1 1 0 1011 2 1 1 1 1011 1 0.5 0 0 0 1101 1 0 0 1 1101 1 1 0 1 1 1101 1 1 0 0 1101 1 1 0 1 1101 1 1 1 1 0 1101 2 1 1 1 1101 1 0.5 0 0 0 1110 1 0 0 1 1110 1 0 1 1 1110 1 1 0 0 1110 2 1 1 1 1110 3

The output I need is the table above BUT the ratio highlighted should NOT be there since it compares the _FREQ_ of Target for 2 different combinations  (a=0, c=0) and (a=0, c=1). Ideally, we should have 1 line in between those highlighted for a=0, c=0 and Target=1.

(not sure why table inflated to that big)

Thank you,

HHC

Solution
‎06-25-2013 11:45 PM
Super User
Posts: 10,778

## Re: Create summary for each combination of a set of variable

```data have;
input target a1-a3 b1-b3;
datalines;
0 1 1 1 1 1 1
1 0 0 1 1 0 0
1 1 0 0 0 1 1
1 0 0 0 1 0 1
0 1 1 1 1 0 0
0 0 0 0 1 0 0
1 0 1 1 0 1 0
0 1 0 0 0 0 0
;;;;
run;
data temp(keep=_a_: _b_: target);
set have;
length _a_name _b_name \$ 20;
array _a{*} a:;
array _b{*} b:;
do i=1 to dim(_a);
do j=1 to dim(_b);
_a_name=vname(_a{i}); _b_name=vname(_b{j});
_a_value=_a{i}; _b_value=_b{j};
output;
end;
end;
run;
proc sql;
create table want as
select _a_name,_b_name,_a_value,_b_value,sum(target=1)/sum(target=0) as per
from temp
group by  _a_name,_b_name,_a_value,_b_value;
quit;

```

Ksharp

Super Contributor
Posts: 506

## Re: Create summary for each combination of a set of variable

Thank you, Ksharp.

First, I find that the temp file creates 17 duplicate observation. I am not sure what cause it.

sort data=temp out=temp1 nodupkey;

target _a_name _b_name _a_value _b_value; run;

Second, if there is only 1 list of condition (a b c d e f) instead of a1-a3 and b1-b3, and I want to do the combination of this list with the following code, there are still a lot of duplicate observation.

definitely, the problem is kind of AB and then BA.

I don't know how to eliminate this. I wonder if you could help me.

Thank you,

HHC

data temp1(keep=_a_: _b_: target);
set have;

array _a{*}  a b c;
do i=1 to dim(_a);
do j=i+1 to dim(_a);

if _a{i}^=_b{j} then do;

_a_name=vname(_a{i}); _b_name=vname(_b{j});
_a_value=_a{i}; _b_value=_b{j};
output;
end;
end;
end;
run;

Super Contributor
Posts: 506

## Re: Create summary for each combination of a set of variable

Sorry Ksharp.

I think I am wrong somewhere. Your code is perfect.

Might be it is too much for me today.

Thank you,

HHC

Super Contributor
Posts: 506

## Re: Create summary for each combination of a set of variable

Learned it now Ksharp!

Many thanks for a new method (to me).

HHC

🔒 This topic is solved and locked.