Help using Base SAS procedures

Proc tabulate challenge?

Accepted Solution Solved
Reply
Contributor
Posts: 39
Accepted Solution

Proc tabulate challenge?

/*Hi SAS Forum,

 

I have two datasets. One set provides the income of k number of houses in Jun.

Other set provides the income of p number of houses in May.

My question is further down.

*/

 

/*Jun**/

data Have_Jun;

input Income;

cards;

0

100

105

100

200

300

300

400

500

1000

;

run;

 

 

/*May*/

data Have_May;

input Income;

cards;

0

.

105

100

199

301

302

400

500

1000

125

305

1200

;

run;

 

/*Q: I want to generate a proc tabulate table like below (in fact, I have to produce 75 proc tabulate tables for huge datasets).

 

 IncomeMissing0>0-100>100-200>200-300>300-400>400-500>500-1000>1000sum
JunCOUNT 1222111 10
 PERCENT0.0%10.0%20.0%20.0%20.0%10.0%10.0%10.0%  
MayCOUNT1113 411113
 PERCENT7.69%7.69%7.69%23.08%0.00%30.77%7.69%7.69%7.69% 
Diff -7.69%2.31%12.31%-3.08%20.00%-20.77%2.31%2.31%-7.69% 

 

This table simply presents counts of subjects in each income category for June and May separately.

And it presnets total sums separately.

And it presents corresponding perecentages separately.

And it presents difference in percentages between Jun and May in each income category.

 

/*I have done up to this but I cannot handle the proc trabulate part. Would appreciate any help?*/

/*created a format*/

proc format ;

value val

.= 'Missing'

low-<0 = '<0'

0 = '0'

0<-100 = '>0-100'

100<-200 = '>100-200'

200<-300 = '>200-300'

300<-400 = '>300-400'

400<-500 = '>400-500'

500<-1000 = '>500-1000'

1000<-high = '>1000';

run;

 

/*applied the format for Jun*/

data need_jun;

set have_jun;

income_category = put (income, val.);

Flag='Jun';

run;

 

/*applied the format for May*/

data need_may;

set have_may;

income_category = put (income, val.);

Flag = 'May';

run;

 

/*concatenate 2 tables together*/

data want;

set need_may need_jun;

run;

 

/*Apply a proc tabulate*/

PROC TABULATE DATA=want;

VAR income;

class income_category flag;

TABLE flag, income_category, income*N ;

RUN;


Accepted Solutions
Solution
‎08-21-2017 01:48 PM
Super User
Posts: 10,787

Re: Proc tabulate challenge?

If I was right. PROC TABULATE can't do this unstructure summary table.

 

/*Jun**/
data Have_Jun;
input Income;
cards;
0
100
105
100
200
300
300
400
500
1000
;
run;
 
 
/*May*/
data Have_May;
input Income;
cards;
0
.
105
100
199
301
302
400
500
1000
125
305
1200
;
run;
proc format ;
value val
.= 'Missing'
low-<0 = '<0'
0 = '0'
0<-100 = '>0-100'
100<-200 = '>100-200'
200<-300 = '>200-300'
300<-400 = '>300-400'
400<-500 = '>400-500'
500<-1000 = '>500-1000'
1000<-high = '>1000';

value fmt
 -1<-<1=[percentn7.2];
run;
proc freq data=Have_Jun noprint;
table income/out=jun missing;
format income val.;
run;
proc freq data=Have_May noprint;
table income/out=may missing;
format income val.;
run;

data jun;
 set jun;
 retain month 'Jun    ';
 c_income=put(income,val.);
 percent=percent*0.01;
 format percent percent7.2;
 drop income;
run;
data may;
 set may;
 retain month 'May    ';
 c_income=put(income,val.);
 percent=percent*0.01;
 format percent percent7.2;
 drop income;
run;


options validvarname=any missing=0;
proc transpose data=jun out=t_jun;
by month;
var count percent;
id c_income;
idlabel c_income;
run;
proc transpose data=may out=t_may;
by month;
var count percent;
id c_income;
idlabel c_income;
run;

data dif;
 set  t_may(where=(_NAME_='PERCENT')) t_jun(where=(_NAME_='PERCENT')) end=last;
 retain month 'Dif     ';
 call missing(_name_);
 array x{*} _numeric_;
 do i=1 to dim(x);
  x{i}=sum(x{i},-lag(x{i}));
 end;
 if last;
 drop i;
run;

data want;
retain month
'Missing'n
'<0'n
'0'n
'>0-100'n
'>100-200'n
'>200-300'n
'>300-400'n
'>400-500'n
'>500-1000'n
'>1000'n;
 set t_jun t_may dif;
 format _numeric_ fmt.;
 drop _label_;
run;

View solution in original post


All Replies
Solution
‎08-21-2017 01:48 PM
Super User
Posts: 10,787

Re: Proc tabulate challenge?

If I was right. PROC TABULATE can't do this unstructure summary table.

 

/*Jun**/
data Have_Jun;
input Income;
cards;
0
100
105
100
200
300
300
400
500
1000
;
run;
 
 
/*May*/
data Have_May;
input Income;
cards;
0
.
105
100
199
301
302
400
500
1000
125
305
1200
;
run;
proc format ;
value val
.= 'Missing'
low-<0 = '<0'
0 = '0'
0<-100 = '>0-100'
100<-200 = '>100-200'
200<-300 = '>200-300'
300<-400 = '>300-400'
400<-500 = '>400-500'
500<-1000 = '>500-1000'
1000<-high = '>1000';

value fmt
 -1<-<1=[percentn7.2];
run;
proc freq data=Have_Jun noprint;
table income/out=jun missing;
format income val.;
run;
proc freq data=Have_May noprint;
table income/out=may missing;
format income val.;
run;

data jun;
 set jun;
 retain month 'Jun    ';
 c_income=put(income,val.);
 percent=percent*0.01;
 format percent percent7.2;
 drop income;
run;
data may;
 set may;
 retain month 'May    ';
 c_income=put(income,val.);
 percent=percent*0.01;
 format percent percent7.2;
 drop income;
run;


options validvarname=any missing=0;
proc transpose data=jun out=t_jun;
by month;
var count percent;
id c_income;
idlabel c_income;
run;
proc transpose data=may out=t_may;
by month;
var count percent;
id c_income;
idlabel c_income;
run;

data dif;
 set  t_may(where=(_NAME_='PERCENT')) t_jun(where=(_NAME_='PERCENT')) end=last;
 retain month 'Dif     ';
 call missing(_name_);
 array x{*} _numeric_;
 do i=1 to dim(x);
  x{i}=sum(x{i},-lag(x{i}));
 end;
 if last;
 drop i;
run;

data want;
retain month
'Missing'n
'<0'n
'0'n
'>0-100'n
'>100-200'n
'>200-300'n
'>300-400'n
'>400-500'n
'>500-1000'n
'>1000'n;
 set t_jun t_may dif;
 format _numeric_ fmt.;
 drop _label_;
run;
Super User
Posts: 13,583

Re: Proc tabulate challenge?

Tabulate will not calculate the difference as shown. Tabulate does no "comparison" of any statistic

You would also want to clarify what your multiple tables may need to look like. If your time frame goes past two months what type of difference is to be shown? Every two months? First and last across the entire period?

 

You should also consider order of appearance of the months. You will run into issues with character values for months sorting and appearing in the correct order: Apr is before May and March in character order. And will your data ever cross calendar years? You may be better off with an actual date value to get those to work and align things.

 

I see this as requiring some summarization and possibly transposition to get all of the elements.

 

 

Contributor
Posts: 39

Re: Proc tabulate challenge?

Hi Ksharp and ballardw,

 

Thanks very much.

 

Ksharp's code is generating the results I expected but in terms of a SAS data set like below.

 

monthMissing<00>0-100>100-200>200-300>300-400>400-500>500-1000>1000_NAME_
Jun0012221110COUNT
Jun0010.00%20.00%20.00%20.00%10.00%10.00%10.00%0PERCENT
May10113 4111COUNT
May7.69%07.69%7.69%23.10% 30.80%7.69%7.69%7.69%PERCENT
Jun-7.69%02.31%12.30%-3.08%20.00%-20.80%2.31%2.31%-7.69% 

 

 

To present these resutls I have to export to Excel which is difficult given I have to produce and export so many two months comparison tables.

 

Would there be any way that I could display the same above SAS dataset resutls as a display (just like proc tabulate generate display tables) without having to export to Excel?

Thanks for your expertise.

Super User
Posts: 10,787

Re: Proc tabulate challenge?

one more thing ,change the order of variables to look the same as your report.

retain month
_NAME_
'Missing'n
'<0'n
'0'n
'>0-100'n
'>100-200'n
'>200-300'n
'>300-400'n
'>400-500'n
'>500-1000'n
'>1000'n;

After that PROC EXPORT can export sas table into Excel file.


About your question, make a macro and CALL EXECUTE() it .
Like:


data x;
input mon $;
cards;
jan
feb
mar
apr
....
;


data _null_;
  set x;
lag_mon=lag(mon);
if _n_ ne 1 then call execute('your-macro-here');
run;




☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 307 views
  • 2 likes
  • 3 in conversation