BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dunga
Obsidian | Level 7

/*Hi SAS Forum,

 

I have two datasets. One set provides the income of k number of houses in Jun.

Other set provides the income of p number of houses in May.

My question is further down.

*/

 

/*Jun**/

data Have_Jun;

input Income;

cards;

0

100

105

100

200

300

300

400

500

1000

;

run;

 

 

/*May*/

data Have_May;

input Income;

cards;

0

.

105

100

199

301

302

400

500

1000

125

305

1200

;

run;

 

/*Q: I want to generate a proc tabulate table like below (in fact, I have to produce 75 proc tabulate tables for huge datasets).

 

 IncomeMissing0>0-100>100-200>200-300>300-400>400-500>500-1000>1000sum
JunCOUNT 1222111 10
 PERCENT0.0%10.0%20.0%20.0%20.0%10.0%10.0%10.0%  
MayCOUNT1113 411113
 PERCENT7.69%7.69%7.69%23.08%0.00%30.77%7.69%7.69%7.69% 
Diff -7.69%2.31%12.31%-3.08%20.00%-20.77%2.31%2.31%-7.69% 

 

This table simply presents counts of subjects in each income category for June and May separately.

And it presnets total sums separately.

And it presents corresponding perecentages separately.

And it presents difference in percentages between Jun and May in each income category.

 

/*I have done up to this but I cannot handle the proc trabulate part. Would appreciate any help?*/

/*created a format*/

proc format ;

value val

.= 'Missing'

low-<0 = '<0'

0 = '0'

0<-100 = '>0-100'

100<-200 = '>100-200'

200<-300 = '>200-300'

300<-400 = '>300-400'

400<-500 = '>400-500'

500<-1000 = '>500-1000'

1000<-high = '>1000';

run;

 

/*applied the format for Jun*/

data need_jun;

set have_jun;

income_category = put (income, val.);

Flag='Jun';

run;

 

/*applied the format for May*/

data need_may;

set have_may;

income_category = put (income, val.);

Flag = 'May';

run;

 

/*concatenate 2 tables together*/

data want;

set need_may need_jun;

run;

 

/*Apply a proc tabulate*/

PROC TABULATE DATA=want;

VAR income;

class income_category flag;

TABLE flag, income_category, income*N ;

RUN;

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

If I was right. PROC TABULATE can't do this unstructure summary table.

 

/*Jun**/
data Have_Jun;
input Income;
cards;
0
100
105
100
200
300
300
400
500
1000
;
run;
 
 
/*May*/
data Have_May;
input Income;
cards;
0
.
105
100
199
301
302
400
500
1000
125
305
1200
;
run;
proc format ;
value val
.= 'Missing'
low-<0 = '<0'
0 = '0'
0<-100 = '>0-100'
100<-200 = '>100-200'
200<-300 = '>200-300'
300<-400 = '>300-400'
400<-500 = '>400-500'
500<-1000 = '>500-1000'
1000<-high = '>1000';

value fmt
 -1<-<1=[percentn7.2];
run;
proc freq data=Have_Jun noprint;
table income/out=jun missing;
format income val.;
run;
proc freq data=Have_May noprint;
table income/out=may missing;
format income val.;
run;

data jun;
 set jun;
 retain month 'Jun    ';
 c_income=put(income,val.);
 percent=percent*0.01;
 format percent percent7.2;
 drop income;
run;
data may;
 set may;
 retain month 'May    ';
 c_income=put(income,val.);
 percent=percent*0.01;
 format percent percent7.2;
 drop income;
run;


options validvarname=any missing=0;
proc transpose data=jun out=t_jun;
by month;
var count percent;
id c_income;
idlabel c_income;
run;
proc transpose data=may out=t_may;
by month;
var count percent;
id c_income;
idlabel c_income;
run;

data dif;
 set  t_may(where=(_NAME_='PERCENT')) t_jun(where=(_NAME_='PERCENT')) end=last;
 retain month 'Dif     ';
 call missing(_name_);
 array x{*} _numeric_;
 do i=1 to dim(x);
  x{i}=sum(x{i},-lag(x{i}));
 end;
 if last;
 drop i;
run;

data want;
retain month
'Missing'n
'<0'n
'0'n
'>0-100'n
'>100-200'n
'>200-300'n
'>300-400'n
'>400-500'n
'>500-1000'n
'>1000'n;
 set t_jun t_may dif;
 format _numeric_ fmt.;
 drop _label_;
run;

View solution in original post

4 REPLIES 4
Ksharp
Super User

If I was right. PROC TABULATE can't do this unstructure summary table.

 

/*Jun**/
data Have_Jun;
input Income;
cards;
0
100
105
100
200
300
300
400
500
1000
;
run;
 
 
/*May*/
data Have_May;
input Income;
cards;
0
.
105
100
199
301
302
400
500
1000
125
305
1200
;
run;
proc format ;
value val
.= 'Missing'
low-<0 = '<0'
0 = '0'
0<-100 = '>0-100'
100<-200 = '>100-200'
200<-300 = '>200-300'
300<-400 = '>300-400'
400<-500 = '>400-500'
500<-1000 = '>500-1000'
1000<-high = '>1000';

value fmt
 -1<-<1=[percentn7.2];
run;
proc freq data=Have_Jun noprint;
table income/out=jun missing;
format income val.;
run;
proc freq data=Have_May noprint;
table income/out=may missing;
format income val.;
run;

data jun;
 set jun;
 retain month 'Jun    ';
 c_income=put(income,val.);
 percent=percent*0.01;
 format percent percent7.2;
 drop income;
run;
data may;
 set may;
 retain month 'May    ';
 c_income=put(income,val.);
 percent=percent*0.01;
 format percent percent7.2;
 drop income;
run;


options validvarname=any missing=0;
proc transpose data=jun out=t_jun;
by month;
var count percent;
id c_income;
idlabel c_income;
run;
proc transpose data=may out=t_may;
by month;
var count percent;
id c_income;
idlabel c_income;
run;

data dif;
 set  t_may(where=(_NAME_='PERCENT')) t_jun(where=(_NAME_='PERCENT')) end=last;
 retain month 'Dif     ';
 call missing(_name_);
 array x{*} _numeric_;
 do i=1 to dim(x);
  x{i}=sum(x{i},-lag(x{i}));
 end;
 if last;
 drop i;
run;

data want;
retain month
'Missing'n
'<0'n
'0'n
'>0-100'n
'>100-200'n
'>200-300'n
'>300-400'n
'>400-500'n
'>500-1000'n
'>1000'n;
 set t_jun t_may dif;
 format _numeric_ fmt.;
 drop _label_;
run;
ballardw
Super User

Tabulate will not calculate the difference as shown. Tabulate does no "comparison" of any statistic

You would also want to clarify what your multiple tables may need to look like. If your time frame goes past two months what type of difference is to be shown? Every two months? First and last across the entire period?

 

You should also consider order of appearance of the months. You will run into issues with character values for months sorting and appearing in the correct order: Apr is before May and March in character order. And will your data ever cross calendar years? You may be better off with an actual date value to get those to work and align things.

 

I see this as requiring some summarization and possibly transposition to get all of the elements.

 

 

dunga
Obsidian | Level 7

Hi Ksharp and ballardw,

 

Thanks very much.

 

Ksharp's code is generating the results I expected but in terms of a SAS data set like below.

 

monthMissing<00>0-100>100-200>200-300>300-400>400-500>500-1000>1000_NAME_
Jun0012221110COUNT
Jun0010.00%20.00%20.00%20.00%10.00%10.00%10.00%0PERCENT
May10113 4111COUNT
May7.69%07.69%7.69%23.10% 30.80%7.69%7.69%7.69%PERCENT
Jun-7.69%02.31%12.30%-3.08%20.00%-20.80%2.31%2.31%-7.69% 

 

 

To present these resutls I have to export to Excel which is difficult given I have to produce and export so many two months comparison tables.

 

Would there be any way that I could display the same above SAS dataset resutls as a display (just like proc tabulate generate display tables) without having to export to Excel?

Thanks for your expertise.

Ksharp
Super User
one more thing ,change the order of variables to look the same as your report.

retain month
_NAME_
'Missing'n
'<0'n
'0'n
'>0-100'n
'>100-200'n
'>200-300'n
'>300-400'n
'>400-500'n
'>500-1000'n
'>1000'n;

After that PROC EXPORT can export sas table into Excel file.


About your question, make a macro and CALL EXECUTE() it .
Like:


data x;
input mon $;
cards;
jan
feb
mar
apr
....
;


data _null_;
  set x;
lag_mon=lag(mon);
if _n_ ne 1 then call execute('your-macro-here');
run;




sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1106 views
  • 2 likes
  • 3 in conversation