Hello everyone,
As you know, PROC TABULATE does not bring the values, if the range doesn't include values . For example, in my following sample. there is no value 0.0-0.2 range for this sample.However, it could have values for 0.0-0.2 range. I mean that for this sample, it doesn't include value for the range 0.0-0.2 but for another data set it can includes values.
So I want to show the range without considering whether the range includes values or not.
Here is my sample code
Data Have;
Length Target 8 Predicted 8 Bucket $ 30;
Infile Datalines Missover;
Input Target Predicted Bucket;
Datalines;
0 0.1 0.2-0.4
0 0.3 0.2-0.4
0 0.7 0.4-0.6
0 0.5 0.6-0.8
0 0.9 0.8-1.0
1 0.1 0.2-0.4
1 0.3 0.2-0.4
1 0.7 0.4-0.6
1 0.5 0.6-0.8
1 0.9 0.8-1.0
0 0.1 0.2-0.4
0 0.3 0.2-0.4
0 0.7 0.4-0.6
0 0.5 0.6-0.8
0 0.9 0.8-1.0
1 0.1 0.2-0.4
1 0.3 0.2-0.4
1 0.7 0.4-0.6
1 0.5 0.6-0.8
1 0.9 0.8-1.0
;
Run;
Proc Tabulate Data=Have;
Var Predicted;
Class Bucket;
Class Target;
Classlev Target;
Classlev Bucket;
Table Bucket , Predicted*Mean Target*(N ColPctN) N ColPctN;
Run;
Here is the output of my sample
Here is my desired output
How can I build this structure? Can somebody help me, please?
Thank you
You need MISSING option.
Data Have;
Length Target 8 Predicted 8 Bucket $ 30;
Infile Datalines Missover;
Input Target Predicted Bucket;
Datalines;
0 0.1 0.0-0.2
0 0.3 0.0-0.2
0 0.7 0.4-0.6
0 0.5 0.0-0.2
0 0.9 0.8-1.0
1 0.1 0.0-0.2
1 0.3 0.0-0.2
1 0.7 0.4-0.6
1 0.5 0.0-0.2
1 0.9 0.8-1.0
0 0.1 0.0-0.2
0 0.3 0.0-0.2
0 0.7 0.4-0.6
0 0.5 0.0-0.2
0 0.9 0.8-1.0
1 0.1 0.0-0.2
1 0.3 0.0-0.2
1 0.7 0.4-0.6
1 0.5 0.0-0.2
1 0.9 0.8-1.0
;
Run;
PROC FORMAT;
Value $Bucket 0.0-0.2="0.0-0.2"
0.2-0.4="0.2-0.4"
0.4-0.6="0.4-0.6"
0.6-0.8="0.6-0.8"
0.8-1.0="0.8-1.0";
Run;
Proc Tabulate Data=Have;
Var Predicted;
Class Bucket / PreLoadFmt missing;
Class Target;
ClassLev Target;
ClassLev Bucket;
Format Bucket $Bucket.;
Table Bucket , Predicted*Mean Target*(N ColPctN) N ColPctN /Printmiss MissText='"-"';
Run;
Use CLASSDATA=
Data Have;
Length Target 8 Predicted 8 Bucket $ 30;
Infile Datalines Missover;
Input Target Predicted Bucket;
Datalines;
0 0.1 0.2-0.4
0 0.3 0.2-0.4
0 0.7 0.4-0.6
0 0.5 0.6-0.8
0 0.9 0.8-1.0
1 0.1 0.2-0.4
1 0.3 0.2-0.4
1 0.7 0.4-0.6
1 0.5 0.6-0.8
1 0.9 0.8-1.0
0 0.1 0.2-0.4
0 0.3 0.2-0.4
0 0.7 0.4-0.6
0 0.5 0.6-0.8
0 0.9 0.8-1.0
1 0.1 0.2-0.4
1 0.3 0.2-0.4
1 0.7 0.4-0.6
1 0.5 0.6-0.8
1 0.9 0.8-1.0
;
Run;
proc sql;
create table classdata as
select * from
(select distinct Bucket from have union select '0.0-0.2' from have(obs=1)),
(select distinct Predicted from have),
(select distinct Target from have);
quit;
Proc Tabulate Data=Have classdata=classdata;
Var Predicted;
Class Bucket;
Class Target;
Classlev Target;
Classlev Bucket;
Table Bucket , Predicted*Mean Target*(N ColPctN) N ColPctN/misstext='"-"';
Run;
Thank you very much for trying to help me.
But instead of "0.0-0.2" what the range of "0.2-0.4" was empty, For example, If I change my sample as below, so I need to change your code as below too.
Instead of this method, Can PreLoadFmt format is more efficent than this?
Data Have;
Length Target 8 Predicted 8 Bucket $ 30;
Infile Datalines Missover;
Input Target Predicted Bucket;
Datalines;
0 0.1 0.0-0.2
0 0.3 0.0-0.2
0 0.7 0.4-0.6
0 0.5 0.0-0.2
0 0.9 0.8-1.0
1 0.1 0.0-0.2
1 0.3 0.0-0.2
1 0.7 0.4-0.6
1 0.5 0.0-0.2
1 0.9 0.8-1.0
0 0.1 0.0-0.2
0 0.3 0.0-0.2
0 0.7 0.4-0.6
0 0.5 0.0-0.2
0 0.9 0.8-1.0
1 0.1 0.0-0.2
1 0.3 0.0-0.2
1 0.7 0.4-0.6
1 0.5 0.0-0.2
1 0.9 0.8-1.0
;
Run;
Proc sql;
create table classdata as
select * from
(select distinct Bucket from have
union select '0.0-0.2' from have(obs=1)
union select '0.2-0.4' from have(obs=1)
union select '0.4-0.6' from have(obs=1)
union select '0.6-0.8' from have(obs=1)
union select '0.8-1.0' from have(obs=1)),
(select distinct Predicted from have),
(select distinct Target from have);
Quit;
Proc Tabulate Data=Have classdata=classdata;
Var Predicted;
Class Bucket;
Class Target;
Classlev Target;
Classlev Bucket;
Table Bucket , Predicted*Mean Target*(N ColPctN) N ColPctN / misstext='"-"';
Run;
Create a format and use PRELOADFMT option.
If you search on here or lexjansen.com you'll find many examples of the implementation.
Thank you very much for trying to help me.
I tried to benefit from following link, but I couldn't succeed, it seems little bit complicated, can you help me to resolve this issue, please?
http://support.sas.com/resources/papers/proceedings11/239-2011.pdf
PROC FORMAT;
Value Bucket 0.0-0.2="0.0-0.2"
0.2-0.4="0.2-0.4"
0.4-0.6="0.4-0.6"
0.6-0.8="0.6-0.8"
0.8-1.0="0.8-1.0";
Run;
Proc Tabulate Data=Have;
Var Predicted;
Class Bucket / PreLoadFmt;
Class Target;
ClassLev Target;
ClassLev Bucket;
/* Format Bucket.;*/
Table Bucket , Predicted*Mean Target*(N ColPctN) N ColPctN /Printmiss;
Run;
Thank you
You need the format applied to bucket variable. It looks like you started but didn't finish that line.
This is also assumes bucket is a continuous variable.
@Reeza,Sometimes like this?
PROC FORMAT;
Value $Bucket 0.0-0.2="0.0-0.2"
0.2-0.4="0.2-0.4"
0.4-0.6="0.4-0.6"
0.6-0.8="0.6-0.8"
0.8-1.0="0.8-1.0";
Run;
Proc Tabulate Data=Have;
Var Predicted;
Class Bucket / PreLoadFmt;
Class Target;
ClassLev Target;
ClassLev Bucket;
Format Bucket $Bucket.;
Table Bucket , Predicted*Mean Target*(N ColPctN) N ColPctN /Printmiss MissText='"-"';
Run;
1. Are there any errors in your log? Your format doesn't look correct
2. Does it do what you want?
1-There aren't any error in my log. How can I fix my format?
2-Yes, it does.
Can you tell me that how can I write my format correct?
Bucket is a character variable. The values you've defined for the format are numeric.
You need MISSING option.
Data Have;
Length Target 8 Predicted 8 Bucket $ 30;
Infile Datalines Missover;
Input Target Predicted Bucket;
Datalines;
0 0.1 0.0-0.2
0 0.3 0.0-0.2
0 0.7 0.4-0.6
0 0.5 0.0-0.2
0 0.9 0.8-1.0
1 0.1 0.0-0.2
1 0.3 0.0-0.2
1 0.7 0.4-0.6
1 0.5 0.0-0.2
1 0.9 0.8-1.0
0 0.1 0.0-0.2
0 0.3 0.0-0.2
0 0.7 0.4-0.6
0 0.5 0.0-0.2
0 0.9 0.8-1.0
1 0.1 0.0-0.2
1 0.3 0.0-0.2
1 0.7 0.4-0.6
1 0.5 0.0-0.2
1 0.9 0.8-1.0
;
Run;
PROC FORMAT;
Value $Bucket 0.0-0.2="0.0-0.2"
0.2-0.4="0.2-0.4"
0.4-0.6="0.4-0.6"
0.6-0.8="0.6-0.8"
0.8-1.0="0.8-1.0";
Run;
Proc Tabulate Data=Have;
Var Predicted;
Class Bucket / PreLoadFmt missing;
Class Target;
ClassLev Target;
ClassLev Bucket;
Format Bucket $Bucket.;
Table Bucket , Predicted*Mean Target*(N ColPctN) N ColPctN /Printmiss MissText='"-"';
Run;
Okay, I will consider it Xia, thank you 🙂
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.