BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hello,

I am trying to create a dataset that forms variables that are delinquency rates over time. I have created the result I want in the dataset delq_final, using SQL and Proc Transpose. However, I know that I will be doing several different aggregations over different variables, and I'm pretty sure that Proc Report is a more powerful tool for my needs.

I have posted example code below and my question is as follows:

How can I change my Proc Report code to get the results in the dataset delq_final. My Proc Report code runs, but the data is missing. The setup is basically correct, I believe. Thank you for your time and consideration.


data example;
input prod $ type $ dt state $ group1 $ group2 $ num_rec ;
cards;
A XX 1 AA ZZZ VVV 10
A XX 1 AA WWW VVV 20
A XX 1 AA AAA VVV 30
A XX 1 AA ZZZ LLL 40
A XX 1 BB ZZZ VVV 1
A XX 1 BB WWW VVV 2
A XX 1 BB JJJ VVV 3
A XX 1 BB ZZZ LLL 4
A XX 1 CC ZZZ VVV 100
A XX 1 CC WWW XXX 200
A XX 1 CC AAA VVV 300
A XX 1 CC ZZZ LLL 400
A YY 1 AA ZZZ VVV 1000
A YY 1 AA WWW VVV 200
A YY 1 AA AAA VVV 30
A YY 1 AA ZZZ LLL 400
A YY 1 BB ZZZ VVV 1
A YY 1 BB WWW VVV 200
A YY 1 BB JJJ VVV 3
A YY 1 BB ZZZ LLL 400
A YY 1 CC ZZZ VVV 1
A YY 1 CC WWW XXX 2
A YY 1 CC AAA VVV 300
A YY 1 CC ZZZ LLL 40
A XX 2 AA ZZZ VVV 10
A XX 2 AA WWW VVV 20
A XX 2 AA AAA VVV 30
A XX 2 AA ZZZ LLL 40
A XX 2 BB ZZZ VVV 1
A XX 2 BB WWW VVV 2
A XX 2 BB JJJ VVV 3
A XX 2 BB ZZZ LLL 4
A XX 2 CC ZZZ VVV 100
A XX 2 CC WWW XXX 200
A XX 2 CC AAA VVV 300
A XX 2 CC ZZZ LLL 400
A YY 2 AA ZZZ VVV 1000
A YY 2 AA WWW VVV 200
A YY 2 AA AAA VVV 30
A YY 2 AA ZZZ LLL 400
A YY 2 BB ZZZ VVV 1
A YY 2 BB WWW VVV 200
A YY 2 BB JJJ VVV 3
A YY 2 BB ZZZ LLL 400
A YY 2 CC ZZZ VVV 1
A YY 2 CC WWW XXX 2
A YY 2 CC AAA VVV 300
A YY 2 CC ZZZ LLL 40
B XX 1 AA ZZZ VVV 310
B XX 1 AA WWW VVV 20
B XX 1 AA AAA VVV 1230
B XX 1 AA ZZZ LLL 40
B XX 1 BB ZZZ VVV 1
B XX 1 BB WWW VVV 42
B XX 1 BB JJJ VVV 3
B XX 1 BB ZZZ LLL 4
B XX 1 CC ZZZ VVV 100
B XX 1 CC WWW XXX 22200
B XX 1 CC AAA VVV 300
B XX 1 CC ZZZ LLL 400
B YY 1 AA ZZZ VVV 61000
B YY 1 AA WWW VVV 2
B YY 1 AA AAA VVV 30
B YY 1 AA ZZZ LLL 400
B YY 1 BB ZZZ VVV 100
B YY 1 BB WWW VVV 200
B YY 1 BB JJJ VVV 3
B YY 1 BB ZZZ LLL 4
B YY 1 CC ZZZ VVV 1
B YY 1 CC WWW XXX 2
B YY 1 CC AAA VVV 30
B YY 1 CC ZZZ LLL 40
B XX 2 AA ZZZ VVV 199
B XX 2 AA WWW VVV 20
B XX 2 AA AAA VVV 388
B XX 2 AA ZZZ LLL 40
B XX 2 BB ZZZ VVV 1
B XX 2 BB WWW VVV 2
B XX 2 BB JJJ VVV 3
B XX 2 BB ZZZ LLL 8
B XX 2 CC ZZZ VVV 100
B XX 2 CC WWW XXX 6
B XX 2 CC AAA VVV 200
B XX 2 CC ZZZ LLL 400
B YY 2 AA ZZZ VVV 10
B YY 2 AA WWW VVV 200
B YY 2 AA AAA VVV 30
B YY 2 AA ZZZ LLL 40
B YY 2 BB ZZZ VVV 1
B YY 2 BB WWW VVV 22
B YY 2 BB JJJ VVV 3
B YY 2 BB ZZZ LLL 44
B YY 2 CC ZZZ VVV 1
B YY 2 CC WWW XXX 2
B YY 2 CC AAA VVV 32
B YY 2 CC ZZZ LLL 40
;
PROC SQL;
CREATE TABLE DELQ AS
SELECT PROD, TYPE, DT, STATE, SUM(NUM_REC) AS NUM_DELQ
FROM
EXAMPLE
GROUP BY PROD, TYPE, DT, STATE;
CREATE TABLE ALL AS
SELECT PROD, TYPE, DT, SUM(NUM_REC) AS NUM_ALL
FROM
EXAMPLE
GROUP BY PROD, TYPE, DT ;
QUIT;
PROC PRINT DATA=DELQ;
PROC PRINT DATA=ALL;
PROC SORT DATA=DELQ; BY PROD TYPE DT;
PROC SORT DATA=ALL;BY PROD TYPE DT;
DATA DELQ_RESULTS;
MERGE DELQ ALL;BY PROD TYPE DT;
DELQ_RT=NUM_DELQ/NUM_ALL;




proc sort data=delq_results;by prod type dt state;
PROC TRANSPOSE DATA=DELQ_RESULTS(drop=num_delq num_all)
out=delq_results_tran_delq_rt(drop=_NAME_) prefix=delq_rt;
var delq_rt;by prod type dt;id state;

PROC TRANSPOSE DATA=DELQ_RESULTS(drop=delq_rt num_all)
out=delq_results_tran_num_delq(drop=_NAME_) prefix=num_delq;
var num_delq;by prod type dt;id state;

PROC TRANSPOSE DATA=DELQ_RESULTS(drop=delq_rt num_delq)
out=delq_results_tran_num_all(drop=_NAME_) prefix=num_all;
var num_all;by prod type dt;id state;

data delq_final;
merge delq_results_tran_delq_rt delq_results_tran_num_delq delq_results_tran_num_all;
by prod type dt;
run;

*This proc print shows exactly what I would like;

proc print data=delq_results;
proc print data=delq_results_tran_delq_rt;
proc print data=delq_results_tran_num_delq;
proc print data=delq_results_tran_num_all;
proc print data=delq_final;
run;


proc report data=example
nowd out=temp;
column prod type dt state,(num_rec num_all delq_rt);
define prod / group ;
define type / group ;
define dt / group ;
define state/across;
define num_rec / analysis sum 'num_delq';
define num_all/computed 'num_all';
define delq_rt/computed format=percent8.2 'Delq_rt';
compute before;
_c5_=sum(_c4_, _c7_, _c10_);
_c8_=sum(_c4_, _c7_, _c10_);
_c11_=sum(_c4_, _c7_, _c10_);
endcomp;
compute delq_rt;
_c6_=_c4_/_c5_;
_c9_=_c7_/_c8_;
_c12_=_c10_/_c11_;
endcomp;
rbreak after /ol summarize;
run;
1 REPLY 1
Cynthia_sas
SAS Super FREQ
Hi:
Some general comments:

1) COMPUTE BEFORE:
Why do you have the COMPUTE for _C5_, _C8_ and _C11_ in a COMPUTE BEFORE????? The only way that a number will appear on every row for NUM_ALL for every STATE is if you do this:
[pre]
compute num_all;
_c5_=sum(_c4_, _c7_, _c10_);
_c8_=sum(_c4_, _c7_, _c10_);
_c11_=sum(_c4_, _c7_, _c10_);
endcomp;
[/pre]

A COMPUTE BEFORE block only executes 1 time and only 1 time -- at the conceptual "top" of the report. If you need a number on every row for those 3 columns under STATE (and since you're using them for a DIVIDE, it appears that you need them, then the number needs to be available on every REPORT row).

2) Do NOT need COMPUTE for NUM_ALL
Since you need the total number of NUM_REC values, without regard to STATE for the divide, you do not need to COMPUTE NUM_ALL under each STATE. There is a simpler way -- use NUM_REC twice on the report -- one time to get the overall total for the denominator and a second time (with an alias) to get the STATE total for the numerator:
[pre]
proc report data=example nowd out=temp1b;
column prod type dt num_rec state,(num_rec=st_rec delq_rt);
. . . more code . . .
define num_rec / analysis sum 'num_delq';
define st_rec / analysis sum 'state num_rec';
define delq_rt/computed format=percent8.2 'Delq_rt';
compute delq_rt;
_c6_=_c5_/num_rec.sum;
_c8_=_c7_/num_rec.sum;
_c10_=_c9_/num_rec.sum;
endcomp;
[/pre]

Notice the use of NUM_REC alone on the COMPUTE statement, which will give you the total sum of NUM_REC without regard to STATE and then the use of NUM_REC a second time with an ALIAS of ST_REC for usage under each STATE value.

3) You can use NOPRINT in PROC REPORT to "disappear" any columns not needed for display, but which ARE needed for computations. However, when you create an OUTPUT data set from PROC REPORT using OUT=, PROC REPORT writes ALL columns to the output dataset and adds an extra column, _BREAK_, which shows you which report rows it added to the dataset from performing sub-total summarizing or grand total summarizing. If you do not want some of the variables created by PROC REPORT, you will have to DROP them. You might also want to RENAME variables because with ACROSS variables, you will see the _Cn_ absolute column names in the output dataset.

Adjusted code (with correct step boundaries) is shown at the end of the post without the middle SQL/TRANSPOSE section. Several different methods for creating the desired result are shown, with and without NOPRINT. There is a PROC PRINT after each step so you can look at the output dataset created by PROC REPORT.

As an aside -- your code will be easier to read if you do not cram statements together and you use a bit of white space between steps. Using the correct step boundary (like QUIT; for PROC SQL) for other SAS procedures is a good habit to get into because when you start using ODS, only the output that runs BEFORE the ODS destination is closed gets sent to the destination.

In order to run this code, you will have to cut and paste from the forum into a text editor or word processor that respects carriage control/line feed and then cut and paste from the word processor into SAS. (Or else you will see code without any line feeds and the code will be hard to read.)

cynthia
[pre]
data example;
input prod $ type $ dt state $ group1 $ group2 $ num_rec ;
cards;
A XX 1 AA ZZZ VVV 10
A XX 1 AA WWW VVV 20
A XX 1 AA AAA VVV 30
A XX 1 AA ZZZ LLL 40
A XX 1 BB ZZZ VVV 1
A XX 1 BB WWW VVV 2
A XX 1 BB JJJ VVV 3
A XX 1 BB ZZZ LLL 4
A XX 1 CC ZZZ VVV 100
A XX 1 CC WWW XXX 200
A XX 1 CC AAA VVV 300
A XX 1 CC ZZZ LLL 400
A YY 1 AA ZZZ VVV 1000
A YY 1 AA WWW VVV 200
A YY 1 AA AAA VVV 30
A YY 1 AA ZZZ LLL 400
A YY 1 BB ZZZ VVV 1
A YY 1 BB WWW VVV 200
A YY 1 BB JJJ VVV 3
A YY 1 BB ZZZ LLL 400
A YY 1 CC ZZZ VVV 1
A YY 1 CC WWW XXX 2
A YY 1 CC AAA VVV 300
A YY 1 CC ZZZ LLL 40
A XX 2 AA ZZZ VVV 10
A XX 2 AA WWW VVV 20
A XX 2 AA AAA VVV 30
A XX 2 AA ZZZ LLL 40
A XX 2 BB ZZZ VVV 1
A XX 2 BB WWW VVV 2
A XX 2 BB JJJ VVV 3
A XX 2 BB ZZZ LLL 4
A XX 2 CC ZZZ VVV 100
A XX 2 CC WWW XXX 200
A XX 2 CC AAA VVV 300
A XX 2 CC ZZZ LLL 400
A YY 2 AA ZZZ VVV 1000
A YY 2 AA WWW VVV 200
A YY 2 AA AAA VVV 30
A YY 2 AA ZZZ LLL 400
A YY 2 BB ZZZ VVV 1
A YY 2 BB WWW VVV 200
A YY 2 BB JJJ VVV 3
A YY 2 BB ZZZ LLL 400
A YY 2 CC ZZZ VVV 1
A YY 2 CC WWW XXX 2
A YY 2 CC AAA VVV 300
A YY 2 CC ZZZ LLL 40
B XX 1 AA ZZZ VVV 310
B XX 1 AA WWW VVV 20
B XX 1 AA AAA VVV 1230
B XX 1 AA ZZZ LLL 40
B XX 1 BB ZZZ VVV 1
B XX 1 BB WWW VVV 42
B XX 1 BB JJJ VVV 3
B XX 1 BB ZZZ LLL 4
B XX 1 CC ZZZ VVV 100
B XX 1 CC WWW XXX 22200
B XX 1 CC AAA VVV 300
B XX 1 CC ZZZ LLL 400
B YY 1 AA ZZZ VVV 61000
B YY 1 AA WWW VVV 2
B YY 1 AA AAA VVV 30
B YY 1 AA ZZZ LLL 400
B YY 1 BB ZZZ VVV 100
B YY 1 BB WWW VVV 200
B YY 1 BB JJJ VVV 3
B YY 1 BB ZZZ LLL 4
B YY 1 CC ZZZ VVV 1
B YY 1 CC WWW XXX 2
B YY 1 CC AAA VVV 30
B YY 1 CC ZZZ LLL 40
B XX 2 AA ZZZ VVV 199
B XX 2 AA WWW VVV 20
B XX 2 AA AAA VVV 388
B XX 2 AA ZZZ LLL 40
B XX 2 BB ZZZ VVV 1
B XX 2 BB WWW VVV 2
B XX 2 BB JJJ VVV 3
B XX 2 BB ZZZ LLL 8
B XX 2 CC ZZZ VVV 100
B XX 2 CC WWW XXX 6
B XX 2 CC AAA VVV 200
B XX 2 CC ZZZ LLL 400
B YY 2 AA ZZZ VVV 10
B YY 2 AA WWW VVV 200
B YY 2 AA AAA VVV 30
B YY 2 AA ZZZ LLL 40
B YY 2 BB ZZZ VVV 1
B YY 2 BB WWW VVV 22
B YY 2 BB JJJ VVV 3
B YY 2 BB ZZZ LLL 44
B YY 2 CC ZZZ VVV 1
B YY 2 CC WWW XXX 2
B YY 2 CC AAA VVV 32
B YY 2 CC ZZZ LLL 40
;
run; /* DATA step boundary is RUN; */

proc contents data=work.example;
title 'What variables are in work.example';
run; /* PROC CONTENTS has step boundary of RUN; */

proc freq data=work.example;
table state;
run; /* FREQ step boundary is RUN; */

proc report data=example nowd out=temp1a;
column prod type dt state,(num_rec num_all delq_rt);
title '1a) PROC REPORT with correct COMPUTE';
define prod / group ;
define type / group ;
define dt / group ;
define state/across;
define num_rec / analysis sum 'num_delq';
define num_all/computed 'num_all';
define delq_rt/computed format=percent8.2 'Delq_rt';
compute num_all;
_c5_=sum(_c4_, _c7_, _c10_);
_c8_=sum(_c4_, _c7_, _c10_);
_c11_=sum(_c4_, _c7_, _c10_);
endcomp;
compute delq_rt;
_c6_=_c4_/_c5_;
_c9_=_c7_/_c8_;
_c12_=_c10_/_c11_;
endcomp;
rbreak after /ol summarize;
run; /* step boundary for PROC REPORT is RUN; */

proc print data=temp1a;
title 'data set created by PROC REPORT WORK.TEMP1a';
run; /* step boundary for PROC PRINT is RUN: */

proc report data=example nowd out=temp1b;
column prod type dt num_rec state,(num_rec=st_rec delq_rt);
title '1b) Alternate PROC REPORT with alternate COMPUTE';
define prod / group ;
define type / group ;
define dt / group ;
define state/across;
define num_rec / analysis sum 'num_delq';
define st_rec / analysis sum 'state num_rec';
define delq_rt/computed format=percent8.2 'Delq_rt';
compute delq_rt;
_c6_=_c5_/num_rec.sum;
_c8_=_c7_/num_rec.sum;
_c10_=_c9_/num_rec.sum;
endcomp;
rbreak after /ol summarize;
run; /* step boundary for PROC REPORT is RUN; */


proc print data=temp1b;
title 'data set created by PROC REPORT WORK.TEMP1b';
run; /* step boundary for PROC PRINT is RUN: */

proc report data=example nowd out=temp2a;
column prod type dt state,(num_rec num_all delq_rt) ;
title '2a) PROC REPORT with NOPRINT on some columns';
define prod / group ;
define type / group ;
define dt / group ;
define state/across;
define num_rec / analysis sum 'num_delq' noprint;
define num_all/computed 'num_all' noprint;
define delq_rt/computed format=percent8.2 'Delq_rt';
compute num_all;
_c5_=sum(_c4_, _c7_, _c10_);
_c8_=sum(_c4_, _c7_, _c10_);
_c11_=sum(_c4_, _c7_, _c10_);
endcomp;
compute delq_rt;
_c6_=_c4_/_c5_;
_c9_=_c7_/_c8_;
_c12_=_c10_/_c11_;
endcomp;
rbreak after /ol summarize;
run; /* step boundary for PROC REPORT is RUN; */


proc print data=temp2a;
title 'data set created by PROC REPORT WORK.TEMP2a';
run; /* step boundary for PROC PRINT is RUN: */

proc report data=example nowd out=temp2b;
column prod type dt num_rec state,(num_rec=st_rec delq_rt);
title '2b) Alternate PROC REPORT with alternate COMPUTE';
define prod / group ;
define type / group ;
define dt / group ;
define state/across;
define num_rec / analysis sum 'num_delq' noprint;
define st_rec / analysis sum 'state num_rec' noprint;
define delq_rt/computed format=percent8.2 'Delq_rt';
compute delq_rt;
_c6_=_c5_/num_rec.sum;
_c8_=_c7_/num_rec.sum;
_c10_=_c9_/num_rec.sum;
endcomp;
rbreak after /ol summarize;
run; /* step boundary for PROC REPORT is RUN; */

proc print data=temp2b;
title 'data set created by PROC REPORT WORK.TEMP2b';
run; /* step boundary for PROC PRINT is RUN: */
[/pre]

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 1631 views
  • 0 likes
  • 2 in conversation