Help using Base SAS procedures

Create frequency count crosstab matrix and for each combination calculate percentages of target var

Accepted Solution Solved
Reply
Contributor
Posts: 30
Accepted Solution

Create frequency count crosstab matrix and for each combination calculate percentages of target var

I have a following dataset

 

ID   attribute1   attribute2   attribute3     BAD (Y-> yes)
1        1               0               0                 Y
2        1               1               0                 N
3        1               0               1                 Y
4        0               1               1                 N
5        1               1               0                 Y
6        1               1               0                 N
7        1               0               1                 Y
8        1               0               1                 N

 

I want to create frequency count matrix, for each combination of attribute tripped (atrribute = 1 means tripped) like following:

 

                          attribute1    attribute2    attribute3
attribute1                  _                3                  3
attribute2                 3                 _                 1
attribute3                 3                 1                  _

 

It represents counts of attributes tripped together.

Once i have the counts I need to calcualte %BAD in each combination like below.

 


                      attribute1     attribute2     attribute3
attribute1             _               12.50%         25%
attribute2        12.50%             _                   0%
attribute3            25%              0%                _

 

it represents out of 3 ID's for the combination atribute1 and atribute2 tripped, one ID went BAD -> so 12.5% BAD

And out of 3 ID's for the combination attribute1 and attribute1 tripped, two went BAD -> so 25% BAD

And the 1 ID for the combination attribute2 and attribute3 tripped, didnt went BAD -> so 0% BAD

 

Thanks


Accepted Solutions
Solution
‎09-30-2016 08:41 AM
Super User
Posts: 10,048

Re: Create frequency count crosstab matrix and for each combination calculate percentages of target

It is easy for IML.


data have;
input ID   attribute1   attribute2   attribute3     BAD $;
datalines;
1        1               0               0                 Y
2        1               1               0                 N
3        1               0               1                 Y
4        0               1               1                 N
5        1               1               0                 Y
6        1               1               0                 N
7        1               0               1                 Y
8        1               0               1                 N
;

proc iml;
use have nobs nobs;
read all var{attribute1 attribute2 attribute3} into x[c=vnames];
read all var{bad};
close;

xpx=x`*x;
diag=do(1,ncol(xpx)##2,ncol(xpx)+1);
xpx[diag]=.;
print xpx[r=vnames c=vnames l=''];

idx=loc(bad='Y');
temp=x[idx,];
per=(temp`*temp)/nobs;
per[diag]=.;
print per[r=vnames c=vnames l='' f=percent8.2];

quit;


View solution in original post


All Replies
Super User
Posts: 19,878

Re: Create frequency count crosstab matrix and for each combination calculate percentages of target

For part 1 you can use the output from Proc Corr - though you'll have to set the diagonal to missing manually.

 

data have;
input ID   var1 var2 var3   BAD $;
cards;
1        1               0               0                 Y
2        1               1               0                 N
3        1               0               1                 Y
4        0               1               1                 N
5        1               1               0                 Y
6        1               1               0                 N
7        1               0               1                 Y
8        1               0               1                 N
;
run;

ods output sscp=distance;
proc corr data=have sscp;
var var1-var3;
run;
proc print data=distance;
run;

https://gist.github.com/statgeek/a5184a4e1678d81e2643

Contributor
Posts: 30

Re: Create frequency count crosstab matrix and for each combination calculate percentages of target

Thank you for the solution. Any ideas on the second part?

Super User
Posts: 19,878

Re: Create frequency count crosstab matrix and for each combination calculate percentages of target

Nothing simple. 

Perhaps proc freq?

 

Here's a simple example but would need modification for the two way table.

https://gist.github.com/statgeek/e0903d269d4a71316a4e

 

Contributor
Posts: 30

Re: Create frequency count crosstab matrix and for each combination calculate percentages of target

thanks. will try that.

 

tried this:

 

PROC TABULATE

DATA=WORK.TEST

;

CLASS attribute1 / ORDER=UNFORMATTED MISSING;

CLASS attribute2 / ORDER=UNFORMATTED MISSING;

CLASS attribute3 / ORDER=UNFORMATTED MISSING;

TABLE /* Row Dimension */

attribute3

attribute1

attribute2,

/* Column Dimension */

attribute3

attribute1

attribute2;

;

FREQ BAD;

RUN;

 

but this gives. just need a way to supress proc tabulate output to just 1's.

 

                  attribute3   attribute1  attribute2 
                         0 1       1             0 1

 

attribute3               2 .       2             1 1
0     
1                        . 2       2             2 .
attribute1               2 2       4             3 1
1      
attribute2               1 2       3             3 .
0     
1                        1 .       1             . 1

 

     
     
     
     
     
Respected Advisor
Posts: 4,937

Re: Create frequency count crosstab matrix and for each combination calculate percentages of target

SQL and transpose can do this:

 

data have;
input ID   attribute1   attribute2   attribute3     BAD $;
datalines;
1        1               0               0                 Y
2        1               1               0                 N
3        1               0               1                 Y
4        0               1               1                 N
5        1               1               0                 Y
6        1               1               0                 N
7        1               0               1                 Y
8        1               0               1                 N
;

proc transpose data=have out=havet name=att;
by id bad;
var attribute:;
run;

proc sql;
create table wantt as
select 
    a.att, 
    b.att as col, 
    case when a.att=b.att then . 
         else sum(a.col1 and b.col1) end as n,
    case when a.att=b.att then . 
         else sum(a.col1 and b.col1 and a.BAD="Y") / 
            (select count(*) from have) end as p format=percent7.2
from 
    havet as a inner join 
    havet as b on a.id=b.id
group by a.att, b.att;
quit;

proc transpose data=wantt out=want1(drop=_name_);
by att;
id col;
var n;
run;

proc transpose data=wantt out=want2(drop=_name_);
by att;
id col;
var p;
run;

PG
Contributor
Posts: 30

Re: Create frequency count crosstab matrix and for each combination calculate percentages of target

Thank you @PGStats

Solution
‎09-30-2016 08:41 AM
Super User
Posts: 10,048

Re: Create frequency count crosstab matrix and for each combination calculate percentages of target

It is easy for IML.


data have;
input ID   attribute1   attribute2   attribute3     BAD $;
datalines;
1        1               0               0                 Y
2        1               1               0                 N
3        1               0               1                 Y
4        0               1               1                 N
5        1               1               0                 Y
6        1               1               0                 N
7        1               0               1                 Y
8        1               0               1                 N
;

proc iml;
use have nobs nobs;
read all var{attribute1 attribute2 attribute3} into x[c=vnames];
read all var{bad};
close;

xpx=x`*x;
diag=do(1,ncol(xpx)##2,ncol(xpx)+1);
xpx[diag]=.;
print xpx[r=vnames c=vnames l=''];

idx=loc(bad='Y');
temp=x[idx,];
per=(temp`*temp)/nobs;
per[diag]=.;
print per[r=vnames c=vnames l='' f=percent8.2];

quit;


Contributor
Posts: 30

Re: Create frequency count crosstab matrix and for each combination calculate percentages of target

@Ksharp this is awesome! Thanks for your help on this.

Valued Guide
Posts: 505

Re: Create frequency count crosstab matrix and for each combination calculate percentages of target

Survival Probabilities for combinations of risk factors

for an R solution based on the SAS Forum IML solution see
https://listserv.uga.edu/cgi-bin/wa?A2=SAS-L;f541a941.1610a

Slightly reformated

HAVE I have a following dataset

ID   Male Smoker Over60  Survived
TED   1       0       0        1
JIM   1       1       0        0
TEX   1       0       1        1
LIZ   0       1       1        0
JOE   1       1       0        1 Out of all Male Smokers only Joe survived 1/8 12.5% survival rate
TIM   1       1       0        0
TAD   1       0       1        1
TOM   1       0       1        0


WANT

For Male Smokers Jim, Joe and Tim. Only Joe survived 1/8 = .125
For Males  over 60 Tex, Tad and Tom, Tex and Tad Survived = 2/8 = .25
For Smokers over 60. NO one survived  0.0


           MALE  SMOKER    OVER60
MALE         _   12.50%       25%  (TAD + TOM survived)/8 records
SMOKER  12.50%       _         0%
OVER60     25%       0%         _

SAS SOLUTION
============

* after normalizing you can generalize with many sas procedures
  beter solution then this one?;

data _null_;
  retain  survive1 survive2 survive3 0;
  format survive: percent7.1;
  infile datalines eof=dne;
  input ID $3.  Male Smoker Over60  Survived;
  grp1='Male_Smoker_Survive   ';survive1=survive1+male*smoker*survived;
  grp2='Male_Over60_Survive   ';survive2=survive2+male*over60*survived;
  grp3='Smoker_Over60_Survive ';survive3=survive3+smoker*over60*survived;
  return;
  dne:
     _n_=_n_-1;
     survive1=survive1/_n_;
     survive2=survive2/_n_;
     survive3=survive3/_n_;
     put              @10 "Male "    @20 "Smoker"  @30 "Over60"   ;
     put @1 "Male"    @10 "-"        @20 Survive1  @30 Survive2 ;
     put @1 "Smoker"  @10 Survive1   @20 "-"       @30 Survive3 ;
     put @1 "Over60"  @10 Survive2   @20 Survive3  @30 "-"        ;
datalines;
TED 1 0 0 1
JIM 1 1 0 0
TEX 1 0 1 1
LIZ 0 1 1 0
JOE 1 1 0 1
TIM 1 1 0 0
TAD 1 0 1 1
TOM 1 0 1 0
;;;;
run;quit;

         Male      Smoker    Over60
Male     -         12.5%     25.0%
Smoker   12.5%     -         0.0%
Over60   25.0%     0.0%      -
Contributor
Posts: 30

Re: Create frequency count crosstab matrix and for each combination calculate percentages of target

@Ksharp

 

idx=loc(bad='Y');
temp=x[idx,];

 

Hi, If the dataset doesnt have any Bad='Y'. I am getting the below error.

 

MPRINT(STIP): temp=x[idx,];

ERROR: (execution) Matrix has not been set to a value.

 

 

Is there any work around to this?

 

Thanks

Super User
Posts: 10,048

Re: Create frequency count crosstab matrix and for each combination calculate percentages of target


Oh. Sure. Try this one .


data have;
input ID   attribute1   attribute2   attribute3     BAD $;
datalines;
1        1               0               0                 Y
2        1               1               0                 N
3        1               0               1                 Y
4        0               1               1                 N
5        1               1               0                 Y
6        1               1               0                 N
7        1               0               1                 Y
8        1               0               1                 N
;

proc iml;
use have nobs nobs;
read all var{attribute1 attribute2 attribute3} into x[c=vnames];
read all var{bad};
close;

xpx=x`*x;
diag=do(1,ncol(xpx)##2,ncol(xpx)+1);
xpx[diag]=.;
print xpx[r=vnames c=vnames l=''];

idx=loc(bad='Y');
if isempty(idx) then print "Bad don't have a Y";
else do;
temp=x[idx,];
per=(temp`*temp)/nobs;
per[diag]=.;
print per[r=vnames c=vnames l='' f=percent8.2];
end;
quit;



☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 765 views
  • 5 likes
  • 5 in conversation