Hello everyone,
I would like to ask my question to you. Again, it includes to get the data set output of some procedures. I know, SAS doesn’t let us to get the data set which has alredy had same structure with Report output but I need to create my desired output with some other methods.
I have a sample code as below(My real data is too big). Proc Tabulate almost can provide my desired output but I need some additional columns on my desired output.
Sample Data Set;
Data Have;
Length Variable1 8 RankVariable 8;
Infile Datalines Missover;
Input Variable1 RankVariable;
Datalines;
0 0
1 2
0 0
1 0
0 2
1 0
1 2
1 2
0 2
;
Run;
Here is my desired output;
My first problem is to show values as zero even they don’t exist in the data set. For Example, in my sample data set there is no “1” values for "RankVariable" so “PROC TABULATE” doesn’t bring them in its report output, I also need this values as “0”.
Secondly, I need to add cumulative column but this column needs to show the values being percentage values.
Current Proc Tabulate Statement;
Proc Tabulate Data=Have Out=Have2;
Class RankVariable / Order=Unformatted Missing;
Class Variable1 / Order=Unformatted Missing;
Table RankVariable , Variable1;
Run;
I also need this report output as data set. It shouldn’t be same structure, I need the following data set.
Data Set Desired;
Actually, this following steps could help me to create the data set but it is too manual, I think there should be more simple method to do the foreigoing demands. At this point, I would like to get your helps.
Proc Sql;
Create table ZeroCount AS
Select
Count(Case When Variable1=0 And RankVariable=0 Then 0 Else . End) AS N0
,Count(Case When Variable1=0 And RankVariable=1 Then 1 Else . End) AS N1
,Count(Case When Variable1=0 And RankVariable=2 Then 2 Else . End) AS N2
From Have;
Create Table OneCount AS
Select
Count(Case When Variable1=1 And RankVariable=0 Then 0 Else . End) AS N0
,Count(Case When Variable1=1 And RankVariable=1 Then 1 Else . End) AS N1
,Count(Case When Variable1=1 And RankVariable=2 Then 2 Else . End) AS N2
From Have;
Quit;
PROC SQL;
CREATE TABLE CountTable AS
SELECT * FROM ZEROCOUNT
OUTER UNION CORR
SELECT * FROM ONECOUNT;
Quit;
PROC TRANSPOSE DATA=CountTable
OUT=CountTable2
PREFIX=Col
NAME=Count;
VAR N0 N1 N2;
RUN;
QUIT;
Data Cumulative;
Set CountTable2;
CumulativeColumn + Col2;
Run;
Thank you
Here is : Data Have; Length Variable1 8 RankVariable 8; Infile Datalines Missover; Input Variable1 RankVariable; Datalines; 0 0 1 2 0 0 1 0 0 2 1 0 1 2 1 2 0 2 ; Run; Data Level; RankVariable =0;Output; RankVariable =1;Output; RankVariable =2;Output; Run; Proc Sql ; Create Table Want As Select Data1.*, Sum(Data1.Variable1_1) As Total From ( select b.RankVariable,coalesce(a.Variable1_0,0) as Variable1_0 ,coalesce(a.Variable1_1,0) as Variable1_1 from (Select RankVariable ,Sum(Variable1=0) As Variable1_0 Label='0' ,Sum(Variable1=1) As Variable1_1 Label='1' From Have Group By RankVariable) as a right join level as b on a.RankVariable =b.RankVariable ) AS Data1; Quit; Data Want2; Set Want; Lag_Variable1_0=Lag(Variable1_0); Lag_Variable1_1=Lag(Variable1_1); If Lag_Variable1_0=. Then Lag_Variable1_0=0; If Lag_Variable1_1=. Then Lag_Variable1_1=0; Cumulative+Variable1_1; CumulativePercent=(Cumulative/Total); Run;
Hello Cynthia,
After you mentioned the PRELOADFMT in your previous post , I remembered and checked my previous question which you shared with me as above. Now, I'm checking that post to reach my aim.
Actually, because I used Reeza's code, I didn't investigate PRELOADFMT sample deeply. I'm looking your sample PRELOADFMT code then I will try to create my desired output report and data set by using PRELOADFMT.
Thank you,
It is more suitable for PROC REPORT.
PROC TABULATE don't have statistical like CUMPCT.
Data Have;
Length Variable1 8 RankVariable 8;
Infile Datalines Missover;
Input Variable1 RankVariable;
Datalines;
0 0
1 2
0 0
1 0
0 2
1 0
1 2
1 2
0 2
;
Run;
proc format;
value fmt
0='0'
1='1'
2='2';
run;
proc report data=have nowd completerows ;
columns RankVariable Variable1 CumulativePercent ;
define RankVariable/group format=fmt. preloadfmt;
define Variable1 /across;
define CumulativePercent /computed format=percent8.2;
compute before;
cumpct=0;
sum=sum(_c2_,_c3_);
endcomp;
compute CumulativePercent ;
cumpct+sum(_c2_,_c3_)/sum;
CumulativePercent=cumpct;
endcomp;
run;
Xia,
Thank you very much for providing foregoing code 🙂
But is %44.44 right? shouldn't be it %40.00 instead of %44.44? I'm not sure.
Thank you,
Xia,
For this sample data set, yes. But in my real data set, I have 80.000 obs and it seems to me it takes to much time, I'm still waiting.
Thank you,
This could get you a little faster .
Data Have;
Length Variable1 8 RankVariable 8;
Infile Datalines Missover;
Input Variable1 RankVariable;
Datalines;
0 0
1 2
0 0
1 0
0 2
1 0
1 2
1 2
0 2
;
Run;
proc format;
value fmt
0='0'
1='1'
2='2';
run;
%let dsid=%sysfunc(open(have));
%let nobs=%sysfunc(attrn(&dsid,nlobs));
%let dsid=%sysfunc(close(&dsid));
proc report data=have nowd completerows ;
columns RankVariable Variable1 CumulativePercent ;
define RankVariable/group format=fmt. preloadfmt;
define Variable1 /across;
define CumulativePercent /computed format=percent8.2;
compute CumulativePercent ;
cumpct+sum(_c2_,_c3_)/&nobs;
CumulativePercent=cumpct;
endcomp;
run;
Xia,
Thank you very much for your attention and providing this code.
But it seems that it takes too much time for 80.000 obs. I think for 80.000 obs, it can be better to do the calculation outside of PROC REPORT. What do you think?
Thank you
Cynthia,
Am I writing syntax incorrect?
proc format;
value fmt
0="0"
1="1"
2="2";
run;
Proc Tabulate Data=Have Missing Out=Have2;
Class RankVariable Variable1 / Preloadfmt ;
Table RankVariable , Variable1 / Printmiss;
Format RankVariable Variable1 $fmt.;
Run;
Thank you
Sure.
Data Have;
Length Variable1 8 RankVariable 8;
Infile Datalines Missover;
Input Variable1 RankVariable;
Datalines;
0 0
1 2
0 0
1 0
0 2
1 0
1 2
1 2
0 2
;
Run;
data level;
RankVariable=0;output;
RankVariable=1;output;
RankVariable=2;output;
run;
proc sql noprint;
select count(*) into : nobs from have;
create table temp as
select b.RankVariable,a.Variable1
from have as a full join level as b on
a.RankVariable=b.RankVariable;
create table want as
select RankVariable,sum(Variable1=0) as Variable1_0 label='0',
sum(Variable1=1) as Variable1_1 label='1'
from temp
group by RankVariable;
quit;
data want;
set want ;
CumulativePercent+(Variable1_0+Variable1_1)/&nobs;
format CumulativePercent percent8.2;
run;
proc report data=want nowd;
columns RankVariable ('Variable1' Variable1_0 Variable1_1) CumulativePercent ;
define RankVariable /display;
run;
Thank you very much Xia,
Your sample code helped me so much, however I still didn't reach my desired output.
Let's say that my RankVariable has 10 values, to create my desired output, I modified your sample data as below . It is almost done. In the percentage column, I need to add one additional thing to reach my goal.
Data Level;
RankVariable =0;Output;
RankVariable =1;Output;
RankVariable =2;Output;
RankVariable =3;Output;
RankVariable =4;Output;
RankVariable =5;Output;
RankVariable =6;Output;
RankVariable =7;Output;
RankVariable =8;Output;
RankVariable =9;Output;
RankVariable =10;Output;
Run;
Proc Sql Noprint;
Select Count(*) Into : Nobs From Have;
Create Table Want As
Select B.RankVariable
,Sum(Variable1=0) As Variable1_0 Label='0'
,Sum(Variable1=1) As Variable1_1 Label='1'
From Have As A
Full Join Level As B
On A.RankVariable =B.RankVariable
Group By B.RankVariable ;
Quit;
Data Want2;
Set Want;
Variable1_0=Lag(Variable1_0);
Variable1_1=Lag(Variable1_1);
If Variable1_0=. Then Variable1_0=0;
If Variable1_1=. Then Variable1_1=0;
Run;
Data Want3;
Set Want2 ;
CumulativePercent+Variable1_1/&Nobs;
Format CumulativePercent Percent8.2;
Run;
Proc Report Data=Want2 Nowd;
Columns RankVariable ('Variable1_1' Variable1_0 Variable1_1) CumulativePercent ;
Define RankVariable / Display;
Run;
My current report output,
Desired Report;
PS: I need to get cumulative percentage values just for NPL_F_1, it is "Variable1_1" in my sample data set.
Can somebody still help me?
Thank you
Thank you Cynthia,
So I have two choices,
I can do it in Data step or I can do it in Proc Report.
Actually, Xia prepared codes for both Data Step and Proc Report statements.
Because I have 80.000 obs in my real Data Set, it takes to much time when we use Proc Report statement. So I decided to create in Data Set.
Xia's code seems correct, but there are some diferences between Excel and SAS output. I think the following step needs to change. I multiplied with 100, but it didn't seem correct. I need to make some changes on Cumulative percentage step.
Data Want3;
Set Want2 ;
CumulativePercent+Variable1_1/&Nobs;
Format CumulativePercent Percent8.2;
Run;
I also tried to use PRELOADFMT. in my one of the previos post, but I got an error.
I'm sorry, I'm little bit confused 😞 I'm going to make some search more detailed. Let me check again.
Thank you
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.