- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Please see the below code that generates a cross tab dataset in the end.
I need the cross output as shown in the final dataset.
Is there a better way to produce the dataset as you can see the column
names have spaces and some of the status lengths can be more.
I've created an example to replicate my situation.
I need a dataset output as below.
Col1 Col2 (Status 1) Col3 (Status 2) Col4 (Status 3)
Status 1 Count Count Count
Status 2 Count Count Count
Status 3 Count Count Count
/* BASE table used, so that all statuses can be captured and reported*/
data all_status;
length status $80;
status="uptodate";output;
status="1 month in arrears";output;
status="2 month in arrears";output;
status="3 month in arrears";output;
status="Fraud";output;
run;
data last_month;
length account_no 8 status $80;
account_no=1001; status="uptodate"; output;
account_no=1002; status="1 month in arrears";output;
account_no=1003; status="2 month in arrears";output;
account_no=1004; status="3 month in arrears";output;
account_no=1005; status="uptodate"; output;
account_no=1006; status="1 month in arrears";output;
account_no=1007; status="1 month in arrears";output;
account_no=1008; status="2 month in arrears";output;
run;
data last_month;
set last_month(rename=(status=status_last));
run;
proc sort data=last_month; by account_no;run;
data this_month;
length account_no 8 status $80 ;
account_no=1001; status="1 month in arrears"; output;
account_no=1002; status="2 month in arrears";output;
account_no=1003; status="3 month in arrears";output;
account_no=1004; status="2 month in arrears";output;
account_no=1005; status="uptodate"; output;
account_no=1006; status="1 month in arrears";output;
account_no=1007; status="2 month in arrears";output;
account_no=1008; status="Fraud";output;
run;
data this_month;
set this_month(rename=(status=status_this));
run;
proc sort data=this_month; by account_no;run;
/* JOINED TO GET ALL COMBINATIONS TO BE REPORTED */
proc sql;
create table all_comb as
select
a.status as status_last
,b.status as status_this
from all_status a
cross join all_status b
order by status_last, status_this
;
quit;
data merged;
merge last_month (in=a)
this_month (in=b)
;
by account_no;
if a and b;
run;
proc sql;
create table counts as
select
status_last, status_this, count(status_this) as count
from merged
group by status_last, status_this
order by status_last, status_this
;
quit;
data counts_all;
merge all_comb (in=a)
counts (in=b)
;
by status_last status_this;
if a or b;
if count = . then count=0;
run;
proc transpose data=counts_all out=final(drop=_Name_) label=status_this ;
by status_last;
id status_this;
var count;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Sanjay_M wrote:
Is there a better way to produce the dataset as you can see the column
names have spaces and some of the status lengths can be more.
Could you please clarify your question? Are you saying you want column names to have spaces, or are you saying you don't want column names to have spaces?
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The input current is only saying "1 month in arrears" etc. which is short. But in reality there could be names that could be longer so would PROC TRANSPOSE be able to handle lengthy names to create the dataset that has these as column names?
Is this the right way to have column names transposed?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Sanjay_M wrote:
The input current is only saying "1 month in arrears" etc. which is short. But in reality there could be names that could be longer so would PROC TRANSPOSE be able to handle lengthy names to create the dataset that has these as column names?
Column names in SAS can be a maximum of 32 characters and cannot have spaces (unless you use the VALIDVARNAME=ANY option in which case it can have spaces).
Can you provide example data of these possibilities of longer text strings?
Is this the right way to have column names transposed?
It seems to me that you are avoiding better solutions, such as PROC PRINT, PROC TABULATE and PROC REPORT where you have more control over the output appearance. It's also not clear to me why you aren't using PROC FREQ or PROC TRANSPOSE or PROC REPORT to do the calculations, but are instead going through this complicated sequence of steps.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have added couple of comments to the original post for clarity.
The input dataset has the structure as I would be getting.
The descriptions currently are only of statuses but it would be other category descriptions like "Payments missed in the month etc". Do't have visibility yet of all the descriptions.
I am looking to get a CROSSTAB output as dataset, can you please suggest how PROC TABULATE or FREQ could be used to get the output as a dataset. Don't want to use PROC REPORT as it will produce report.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
PROC REPORT can provide data set outputs.
But I still remain confused as to the actual output you want, as I asked in my first reply. Could you please be specific?
Can you also please create a mock-up of some realistic data that illustrates the problem of potentially longer strings of text?
In other words, I don't really need your code. I need input data, and desired output.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Not sure I follow what you are trying to do but why not just use the current value as the LABEL on the variable and create some new variable names that you generate. That way the names can be short and unique and follow SAS rules for valid names.