Hello everyone,
As you know, PROC TABULATE does not bring the values, if the range doesn't include values . For example, in my following sample. there is no value 0.0-0.2 range for this sample.However, it could have values for 0.0-0.2 range. I mean that for this sample, it doesn't include value for the range 0.0-0.2 but for another data set it can includes values.
So I want to show the range without considering whether the range includes values or not.
Here is my sample code
Data Have;
Length Target 8 Predicted 8 Bucket $ 30;
Infile Datalines Missover;
Input Target Predicted Bucket;
Datalines;
0 0.1 0.2-0.4
0 0.3 0.2-0.4
0 0.7 0.4-0.6
0 0.5 0.6-0.8
0 0.9 0.8-1.0
1 0.1 0.2-0.4
1 0.3 0.2-0.4
1 0.7 0.4-0.6
1 0.5 0.6-0.8
1 0.9 0.8-1.0
0 0.1 0.2-0.4
0 0.3 0.2-0.4
0 0.7 0.4-0.6
0 0.5 0.6-0.8
0 0.9 0.8-1.0
1 0.1 0.2-0.4
1 0.3 0.2-0.4
1 0.7 0.4-0.6
1 0.5 0.6-0.8
1 0.9 0.8-1.0
;
Run;
Proc Tabulate Data=Have;
Var Predicted;
Class Bucket;
Class Target;
Classlev Target;
Classlev Bucket;
Table Bucket , Predicted*Mean Target*(N ColPctN) N ColPctN;
Run;
Here is the output of my sample
Here is my desired output
How can I build this structure? Can somebody help me, please?
Thank you
You need MISSING option.
Data Have;
Length Target 8 Predicted 8 Bucket $ 30;
Infile Datalines Missover;
Input Target Predicted Bucket;
Datalines;
0 0.1 0.0-0.2
0 0.3 0.0-0.2
0 0.7 0.4-0.6
0 0.5 0.0-0.2
0 0.9 0.8-1.0
1 0.1 0.0-0.2
1 0.3 0.0-0.2
1 0.7 0.4-0.6
1 0.5 0.0-0.2
1 0.9 0.8-1.0
0 0.1 0.0-0.2
0 0.3 0.0-0.2
0 0.7 0.4-0.6
0 0.5 0.0-0.2
0 0.9 0.8-1.0
1 0.1 0.0-0.2
1 0.3 0.0-0.2
1 0.7 0.4-0.6
1 0.5 0.0-0.2
1 0.9 0.8-1.0
;
Run;
PROC FORMAT;
Value $Bucket 0.0-0.2="0.0-0.2"
0.2-0.4="0.2-0.4"
0.4-0.6="0.4-0.6"
0.6-0.8="0.6-0.8"
0.8-1.0="0.8-1.0";
Run;
Proc Tabulate Data=Have;
Var Predicted;
Class Bucket / PreLoadFmt missing;
Class Target;
ClassLev Target;
ClassLev Bucket;
Format Bucket $Bucket.;
Table Bucket , Predicted*Mean Target*(N ColPctN) N ColPctN /Printmiss MissText='"-"';
Run;
Use CLASSDATA=
Data Have;
Length Target 8 Predicted 8 Bucket $ 30;
Infile Datalines Missover;
Input Target Predicted Bucket;
Datalines;
0 0.1 0.2-0.4
0 0.3 0.2-0.4
0 0.7 0.4-0.6
0 0.5 0.6-0.8
0 0.9 0.8-1.0
1 0.1 0.2-0.4
1 0.3 0.2-0.4
1 0.7 0.4-0.6
1 0.5 0.6-0.8
1 0.9 0.8-1.0
0 0.1 0.2-0.4
0 0.3 0.2-0.4
0 0.7 0.4-0.6
0 0.5 0.6-0.8
0 0.9 0.8-1.0
1 0.1 0.2-0.4
1 0.3 0.2-0.4
1 0.7 0.4-0.6
1 0.5 0.6-0.8
1 0.9 0.8-1.0
;
Run;
proc sql;
create table classdata as
select * from
(select distinct Bucket from have union select '0.0-0.2' from have(obs=1)),
(select distinct Predicted from have),
(select distinct Target from have);
quit;
Proc Tabulate Data=Have classdata=classdata;
Var Predicted;
Class Bucket;
Class Target;
Classlev Target;
Classlev Bucket;
Table Bucket , Predicted*Mean Target*(N ColPctN) N ColPctN/misstext='"-"';
Run;
Thank you very much for trying to help me.
But instead of "0.0-0.2" what the range of "0.2-0.4" was empty, For example, If I change my sample as below, so I need to change your code as below too.
Instead of this method, Can PreLoadFmt format is more efficent than this?
Data Have;
Length Target 8 Predicted 8 Bucket $ 30;
Infile Datalines Missover;
Input Target Predicted Bucket;
Datalines;
0 0.1 0.0-0.2
0 0.3 0.0-0.2
0 0.7 0.4-0.6
0 0.5 0.0-0.2
0 0.9 0.8-1.0
1 0.1 0.0-0.2
1 0.3 0.0-0.2
1 0.7 0.4-0.6
1 0.5 0.0-0.2
1 0.9 0.8-1.0
0 0.1 0.0-0.2
0 0.3 0.0-0.2
0 0.7 0.4-0.6
0 0.5 0.0-0.2
0 0.9 0.8-1.0
1 0.1 0.0-0.2
1 0.3 0.0-0.2
1 0.7 0.4-0.6
1 0.5 0.0-0.2
1 0.9 0.8-1.0
;
Run;
Proc sql;
create table classdata as
select * from
(select distinct Bucket from have
union select '0.0-0.2' from have(obs=1)
union select '0.2-0.4' from have(obs=1)
union select '0.4-0.6' from have(obs=1)
union select '0.6-0.8' from have(obs=1)
union select '0.8-1.0' from have(obs=1)),
(select distinct Predicted from have),
(select distinct Target from have);
Quit;
Proc Tabulate Data=Have classdata=classdata;
Var Predicted;
Class Bucket;
Class Target;
Classlev Target;
Classlev Bucket;
Table Bucket , Predicted*Mean Target*(N ColPctN) N ColPctN / misstext='"-"';
Run;
Create a format and use PRELOADFMT option.
If you search on here or lexjansen.com you'll find many examples of the implementation.
Thank you very much for trying to help me.
I tried to benefit from following link, but I couldn't succeed, it seems little bit complicated, can you help me to resolve this issue, please?
http://support.sas.com/resources/papers/proceedings11/239-2011.pdf
PROC FORMAT;
Value Bucket 0.0-0.2="0.0-0.2"
0.2-0.4="0.2-0.4"
0.4-0.6="0.4-0.6"
0.6-0.8="0.6-0.8"
0.8-1.0="0.8-1.0";
Run;
Proc Tabulate Data=Have;
Var Predicted;
Class Bucket / PreLoadFmt;
Class Target;
ClassLev Target;
ClassLev Bucket;
/* Format Bucket.;*/
Table Bucket , Predicted*Mean Target*(N ColPctN) N ColPctN /Printmiss;
Run;
Thank you
You need the format applied to bucket variable. It looks like you started but didn't finish that line.
This is also assumes bucket is a continuous variable.
@Reeza,Sometimes like this?
PROC FORMAT;
Value $Bucket 0.0-0.2="0.0-0.2"
0.2-0.4="0.2-0.4"
0.4-0.6="0.4-0.6"
0.6-0.8="0.6-0.8"
0.8-1.0="0.8-1.0";
Run;
Proc Tabulate Data=Have;
Var Predicted;
Class Bucket / PreLoadFmt;
Class Target;
ClassLev Target;
ClassLev Bucket;
Format Bucket $Bucket.;
Table Bucket , Predicted*Mean Target*(N ColPctN) N ColPctN /Printmiss MissText='"-"';
Run;
1. Are there any errors in your log? Your format doesn't look correct
2. Does it do what you want?
1-There aren't any error in my log. How can I fix my format?
2-Yes, it does.
Can you tell me that how can I write my format correct?
Bucket is a character variable. The values you've defined for the format are numeric.
You need MISSING option.
Data Have;
Length Target 8 Predicted 8 Bucket $ 30;
Infile Datalines Missover;
Input Target Predicted Bucket;
Datalines;
0 0.1 0.0-0.2
0 0.3 0.0-0.2
0 0.7 0.4-0.6
0 0.5 0.0-0.2
0 0.9 0.8-1.0
1 0.1 0.0-0.2
1 0.3 0.0-0.2
1 0.7 0.4-0.6
1 0.5 0.0-0.2
1 0.9 0.8-1.0
0 0.1 0.0-0.2
0 0.3 0.0-0.2
0 0.7 0.4-0.6
0 0.5 0.0-0.2
0 0.9 0.8-1.0
1 0.1 0.0-0.2
1 0.3 0.0-0.2
1 0.7 0.4-0.6
1 0.5 0.0-0.2
1 0.9 0.8-1.0
;
Run;
PROC FORMAT;
Value $Bucket 0.0-0.2="0.0-0.2"
0.2-0.4="0.2-0.4"
0.4-0.6="0.4-0.6"
0.6-0.8="0.6-0.8"
0.8-1.0="0.8-1.0";
Run;
Proc Tabulate Data=Have;
Var Predicted;
Class Bucket / PreLoadFmt missing;
Class Target;
ClassLev Target;
ClassLev Bucket;
Format Bucket $Bucket.;
Table Bucket , Predicted*Mean Target*(N ColPctN) N ColPctN /Printmiss MissText='"-"';
Run;
Okay, I will consider it Xia, thank you 🙂
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.