Help using Base SAS procedures

Create freq crosstab matrix and for each combination calculate percentages of target var

Accepted Solution Solved
Reply
Contributor
Posts: 30
Accepted Solution

Create freq crosstab matrix and for each combination calculate percentages of target var

Hi , I was working with this before and this has been answered in the post

https://communities.sas.com/t5/SAS-Procedures/Create-frequency-count-crosstab-matrix-and-for-each-co...

 

 

I have a following dataset

 

ID   attribute1   attribute2   attribute3     BAD (Y-> yes)
1        1               0               0                 Y
2        1               1               0                 N
3        1               0               1                 Y
4        0               1               1                 N
5        1               1               0                 Y
6        1               1               0                 N
7        1               0               1                 Y
8        1               0               1                 N

 

I want to create frequency count matrix, for each combination of attribute tripped (atrribute = 1 means tripped) like following:

 

                          attribute1    attribute2    attribute3
attribute1                  _                3                  3
attribute2                 3                 _                 1
attribute3                 3                 1                  _

 

It represents counts of attributes tripped together.

Once i have the counts I need to calcualte %BAD in each combination like below.

 


                      attribute1     attribute2     attribute3
attribute1             _               12.50%         25%
attribute2        12.50%             _                   0%
attribute3            25%              0%                _

 

it represents out of 3 ID's for the combination atribute1 and atribute2 tripped, one ID went BAD -> so 12.5% BAD

And out of 3 ID's for the combination attribute1 and attribute1 tripped, two went BAD -> so 25% BAD

And the 1 ID for the combination attribute2 and attribute3 tripped, didnt went BAD -> so 0% BAD

 

 

 

I need to expand this analysis t0 3 variable combinations to get more meaningful results.

 

My dataset look like:

 

ID  

gradetypescoreattribute1 attribute2attribute3attribute4attribute5attribute6BAD (Y-yes/N- No)
1       A1100-1101              0              0                1              0              0                Y
2       A1120-1301              1              0                1              1              0                N
3       B270-801              0              1                1              0              1                Y
4       C330-400              1              1                0              1              1                N
5       A340-501              1               0                1              1               0                Y
6       B480-901              1              0                1              1              0                 N
7       C240-501              0              1                1              0              1                 Y
8       C430-401              0              1                1              0              1                 N

 

 

Im trying to get to this analysis: I am showing values only for grade A as example:

 

    frequency Attributes/Attribute combinations    
   Attribute1Attribute2Attribute4Attr1&Attr4Attr1&Attr2Attr1&Attr2&Attr4
BADgradescore      
yesA100-1101 11  
  110-120      
  120-130      
 B90-100      
  80-90      
  70-80      
 C40-50      
  30-40      
         
NoA100-110212211
  110-120      
  120-130      
 B90-100      
  80-90      
  70-80      
 C40-50      
  30-40      

 

I was looking at http://www.datavis.ca/papers/sugi/mdarray/mdarray.pdf on mutildimensional arrays and mosaic displays. Wasn't quite sure how to get to such kind of analysis with this type of data. Any suggestions?

 

 

Thanks


Accepted Solutions
Solution
‎02-07-2017 04:02 PM
Regular Contributor
Posts: 217

Re: Create freq crosstab matrix and for each combination calculate percentages of target var

Try this

Run the TypeFormat macro declared in http://www2.sas.com/proceedings/sugi29/045-29.pdf, and then run this code

 

proc sort data=have out=srtd_have;
   by id grade type score_range bad;
run;

%TypeFormat(formatname=testtyp,var=a1 a2 a3 a4 a5);
proc summary data=WORK.srtd_have;
   by id grade type score_range bad;
   class a1-a5;
   ways 1 2 3 4 5;
   format _type_ testtyp.;
   output out=work.sum;
run;

 

Once the work.sum table is created, then you can slice&dice it as you want.

 

Hope this helps,

Ahmed

View solution in original post


All Replies
Valued Guide
Posts: 505

Re: Create freq crosstab matrix and for each combination calculate percentages of target var

SAS: Intersection counts for all combinations of column logicals (0/1)

Just the first part.
I recoded the 0s to missing so that proc corr would cout the non-missing intersecitons proc summary may work and be much faster? HAVE ==== Up to 40 obs WORK.HAVE total obs=8 Obs A1 A2 A3 1 1 . . 2 1 1 . 3 1 . 1 4 . 1 1 5 1 1 . 6 1 1 . 7 1 . 1 8 1 . 1 WANT === Up to 40 obs from mat total obs=3 Obs VARIABLE NA1 NA2 NA3 1 A1 7 3 3 2 A2 3 4 1 --> A2=1 and A3=1 se below 3 A3 3 1 4 DETAILS A2 A3 . . 1 . . 1 1 1 Only this one 1 . 1 . . 1 . 1 WORKING CODE ============ proc corr data=wrk.have ; var _numeric_; with _numeric_; FULL SOLUTION ============= options validvarname=upcase; data have; input a1 a2 a3; cards4; 1 . . 1 1 . 1 . 1 . 1 1 1 1 . 1 1 . 1 . 1 1 . 1 ;;;; run;quit; %utl_submit_wps64(' libname wrk "%sysfunc(pathname(work))"; ods output pearsoncorr=wrk.want(keep=variable n:) ; proc corr data=wrk.have ; var _numeric_; with _numeric_; run;quit; '); proc print data=want width=min; run;quit; Obs VARIABLE NA1 NA2 NA3 1 A1 7 3 3 2 A2 3 4 1 3 A3 3 1 4
Contributor
Posts: 30

Re: Create freq crosstab matrix and for each combination calculate percentages of target var

Posted in reply to rogerjdeangelis

@rogerjdeangelis thanks for the solution for variable intersections. I also wanted to at the correlation at the combination of these variables.

 

ex:

two combinations:

A1, A2

A1, A3

A1, A4

A1, A5

A2, A3

A2, A4

A2, A5

A3, A4

A3, A5

A4, A5

Three occuring together:

A1, A2, A3

A1, A2, A4

A1, A2, A5

A1, A3, A4

A1, A3, A5,

A1, A4, A5

Four occuring together:

A1, A2, A3, A4

A1, A2, A4, A5

A1, A3, A4, A5

All five :

A1, A2, A3, A4, A5

 

 

 

 

Contributor
Posts: 30

Re: Create freq crosstab matrix and for each combination calculate percentages of target var

@rogerjdeangelis

So if we look at it like a report I need to build something like this. Counts for various combination of these occurances.

 

   A1, A2A1, A3A1, A4A1, A5A2, A3A2, A4A2, A5A3, A4A3, A5A4, A5A1, A2, A3A1, A2, A4A1, A2, A5A1, A3, A4A1, A3, A5,A1, A4, A5A1, A2, A3, A4A1, A2, A4, A5A1, A3, A4, A5A1, A2, A3, A4, A5
BADgradescore                    
yesA100-110                    
  110-120                    
  120-130                    
 B90-100                    
  80-90                    
  70-80                    
 C40-50                    
  30-40                    
                       
NoA100-110                    
  110-120                    
  120-130                    
 B90-100                    
  80-90                    
  70-80                    
 C40-50                    
  30-40                    

 

 

Thanks

Valued Guide
Posts: 505

Re: Create freq crosstab matrix and for each combination calculate percentages of target var

 

SAS: Intersection counts for all combinations of column logicals (0/1)

Just the first part.
I recoded the 0s to missing so that proc corr would cout the non-missing intersecitons proc summary may work and be much faster? HAVE ==== Up to 40 obs WORK.HAVE total obs=8 Obs A1 A2 A3 1 1 . . 2 1 1 . 3 1 . 1 4 . 1 1 5 1 1 . 6 1 1 . 7 1 . 1 8 1 . 1 WANT === Up to 40 obs from mat total obs=3 Obs VARIABLE NA1 NA2 NA3 1 A1 7 3 3 2 A2 3 4 1 --> A2=1 and A3=1 se below 3 A3 3 1 4 DETAILS A2 A3 . . 1 . . 1 1 1 Only this one 1 . 1 . . 1 . 1 WORKING CODE ============ proc corr data=wrk.have ; var _numeric_; with _numeric_; FULL SOLUTION ============= options validvarname=upcase; data have; input a1 a2 a3; cards4; 1 . . 1 1 . 1 . 1 . 1 1 1 1 . 1 1 . 1 . 1 1 . 1 ;;;; run;quit; %utl_submit_wps64(' libname wrk "%sysfunc(pathname(work))"; ods output pearsoncorr=wrk.want(keep=variable n:) ; proc corr data=wrk.have ; var _numeric_; with _numeric_; run;quit; '); proc print data=want width=min; run;quit; Obs VARIABLE NA1 NA2 NA3 1 A1 7 3 3 2 A2 3 4 1 3 A3 3 1 4

 

Contributor
Posts: 30

Re: Create freq crosstab matrix and for each combination calculate percentages of target var

Posted in reply to rogerjdeangelis

data have;

input id grade$ type score_range$ a1$ a2$ a3$ a4$ a5$ bad$;

datalines;

1 a 1 140-150 1 0 0 0 0 Y

2 a 1 130-140 1 1 0 0 0 N

3 a 2 130-140 1 1 0 1 0 N

4 a 3 140-150 1 0 0 1 1 Y

5 b 2 120-130 1 1 1 1 0 N

6 b 1 110-120 0 0 1 0 0 N

7 b 1 120-130 0 0 1 1 0 N

8 c 1 90-100 1 1 0 0 0 Y

9 a 1 130-140 1 0 0 0 0 Y

10 a 1 130-140 1 0 0 0 0 Y

;

run;

 

data have2;

set have;

format a $20.;

if a1=0 then a1='';

if a2=0 then a2='';

if a3=0 then a3='';

if a4=0 then a4='';

if a5=0 then a5='';

if a1=1 then a1='a1';

if a2=1 then a2='a2';

if a3=1 then a3='a3';

if a4=1 then a4='a4';

if a5=1 then a5='a5';

a = catx('*',a1,a2,a3,a4,a5);

run;

proc freq data=have2;

tables grade*type*score_range* a *bad / missing list;

run;

 

 

I am able to get till here. results of proc freq:

 

grade

typescore_rangeabadFrequency
a1130-140a1Y2
a1130-140a1*a2N1
a1140-150a1Y1
a2130-140a1*a2*a4N1
a3140-150a1*a4*a5Y1
b1110-120a3N1
b1120-130a3*a4N1
b2120-130a1*a2*a3*a4N1
c190-100a1*a2Y1

 

I need to lay this out in a proc report format so i can get to :

BADGradeTypescorerangea1a2a3a4a5a1*a2a1*a3a1*a4a1*a5
YA1140-150         
   130-140         
  2130-140         
  3140-150         
 B1110-120         
   120-130         
  2120-130         
 C190-100         
NA1140-150         
   130-140         
  2130-140         
  3140-150         
 B1110-120         
   120-130         
  2120-130         
 C190-100         

 

Thanks

Solution
‎02-07-2017 04:02 PM
Regular Contributor
Posts: 217

Re: Create freq crosstab matrix and for each combination calculate percentages of target var

Try this

Run the TypeFormat macro declared in http://www2.sas.com/proceedings/sugi29/045-29.pdf, and then run this code

 

proc sort data=have out=srtd_have;
   by id grade type score_range bad;
run;

%TypeFormat(formatname=testtyp,var=a1 a2 a3 a4 a5);
proc summary data=WORK.srtd_have;
   by id grade type score_range bad;
   class a1-a5;
   ways 1 2 3 4 5;
   format _type_ testtyp.;
   output out=work.sum;
run;

 

Once the work.sum table is created, then you can slice&dice it as you want.

 

Hope this helps,

Ahmed

Contributor
Posts: 30

Re: Create freq crosstab matrix and for each combination calculate percentages of target var

Posted in reply to AhmedAl_Attar

@AhmedAl_Attar thanks

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 276 views
  • 2 likes
  • 3 in conversation