turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- How to - Get Desired Data Set Output & Report Outp...

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-05-2016 09:44 AM - edited 06-05-2016 09:53 AM

Hello everyone,

I would like to ask my question to you. Again, it includes to get the data set output of some procedures. I know, SAS doesn’t let us to get the data set which has alredy had same structure with Report output but I need to create my desired output with some other methods.

I have a sample code as below(My real data is too big). Proc Tabulate almost can provide my desired output but I need some additional columns on my desired output.

**Sample Data Set;**

```
Data Have;
Length Variable1 8 RankVariable 8;
Infile Datalines Missover;
Input Variable1 RankVariable;
Datalines;
0 0
1 2
0 0
1 0
0 2
1 0
1 2
1 2
0 2
;
Run;
```

**Here is my desired output;**

My first problem is to show values as zero even they don’t exist in the data set. For Example, in my sample data set there is no “1” values for "RankVariable" so “PROC TABULATE” doesn’t bring them in its report output, I also need this values as “0”.

Secondly, I need to add cumulative column but this column needs to show the values being percentage values.

**Current Proc Tabulate Statement;**

```
Proc Tabulate Data=Have Out=Have2;
Class RankVariable / Order=Unformatted Missing;
Class Variable1 / Order=Unformatted Missing;
Table RankVariable , Variable1;
Run;
```

I also need this report output as data set. It shouldn’t be same structure, I need the following data set.

**Data Set Desired;**

Actually, this following steps could help me to create the data set but it is too manual, I think there should be more simple method to do the foreigoing demands. At this point, I would like to get your helps.

```
Proc Sql;
Create table ZeroCount AS
Select
Count(Case When Variable1=0 And RankVariable=0 Then 0 Else . End) AS N0
,Count(Case When Variable1=0 And RankVariable=1 Then 1 Else . End) AS N1
,Count(Case When Variable1=0 And RankVariable=2 Then 2 Else . End) AS N2
From Have;
Create Table OneCount AS
Select
Count(Case When Variable1=1 And RankVariable=0 Then 0 Else . End) AS N0
,Count(Case When Variable1=1 And RankVariable=1 Then 1 Else . End) AS N1
,Count(Case When Variable1=1 And RankVariable=2 Then 2 Else . End) AS N2
From Have;
Quit;
PROC SQL;
CREATE TABLE CountTable AS
SELECT * FROM ZEROCOUNT
OUTER UNION CORR
SELECT * FROM ONECOUNT;
Quit;
PROC TRANSPOSE DATA=CountTable
OUT=CountTable2
PREFIX=Col
NAME=Count;
VAR N0 N1 N2;
RUN;
QUIT;
Data Cumulative;
Set CountTable2;
CumulativeColumn + Col2;
Run;
```

Thank you

Accepted Solutions

Solution

06-09-2016
04:08 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-08-2016 10:16 PM

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-05-2016 11:49 AM

Hi:

Have you looked into the use of PRELOADFMT with PROC TABULATE? You will need a user-defined format, but then you can show all possible categories even when the category is not present in the data. There are examples in the doc for how to use PRELOADFMT.

cynthia

Have you looked into the use of PRELOADFMT with PROC TABULATE? You will need a user-defined format, but then you can show all possible categories even when the category is not present in the data. There are examples in the doc for how to use PRELOADFMT.

cynthia

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-05-2016 11:58 AM

Hi:

It seems like this question is related to this previous post: https://communities.sas.com/t5/Base-SAS-Programming/Get-Number-of-Zero-Values-and-Get-Percentage-of-...

was there something about that post that has changed since you marked it as solved?

cynthia

It seems like this question is related to this previous post: https://communities.sas.com/t5/Base-SAS-Programming/Get-Number-of-Zero-Values-and-Get-Percentage-of-...

was there something about that post that has changed since you marked it as solved?

cynthia

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-05-2016 12:08 PM

Hello Cynthia,

After you mentioned the PRELOADFMT in your previous post , I remembered and checked my previous question which you shared with me as above. Now, I'm checking that post to reach my aim.

Actually, because I used Reeza's code, I didn't investigate PRELOADFMT sample deeply. I'm looking your sample PRELOADFMT code then I will try to create my desired output report and data set by using PRELOADFMT.

Thank you,

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-05-2016 10:43 PM

It is more suitable for PROC REPORT.

PROC TABULATE don't have statistical like CUMPCT.

```
Data Have;
Length Variable1 8 RankVariable 8;
Infile Datalines Missover;
Input Variable1 RankVariable;
Datalines;
0 0
1 2
0 0
1 0
0 2
1 0
1 2
1 2
0 2
;
Run;
proc format;
value fmt
0='0'
1='1'
2='2';
run;
proc report data=have nowd completerows ;
columns RankVariable Variable1 CumulativePercent ;
define RankVariable/group format=fmt. preloadfmt;
define Variable1 /across;
define CumulativePercent /computed format=percent8.2;
compute before;
cumpct=0;
sum=sum(_c2_,_c3_);
endcomp;
compute CumulativePercent ;
cumpct+sum(_c2_,_c3_)/sum;
CumulativePercent=cumpct;
endcomp;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-06-2016 03:26 AM - edited 06-06-2016 03:26 AM

Xia,

Thank you very much for providing foregoing code

But is %44.44 right? shouldn't be it %40.00 instead of %44.44? I'm not sure.

Thank you,

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-06-2016 03:48 AM

But you only have 9 obs , right ?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-06-2016 04:02 AM

Xia,

For this sample data set, yes. But in my real data set, I have 80.000 obs and it seems to me it takes to much time, I'm still waiting.

Thank you,

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-06-2016 04:11 AM

This could get you a little faster .

```
Data Have;
Length Variable1 8 RankVariable 8;
Infile Datalines Missover;
Input Variable1 RankVariable;
Datalines;
0 0
1 2
0 0
1 0
0 2
1 0
1 2
1 2
0 2
;
Run;
proc format;
value fmt
0='0'
1='1'
2='2';
run;
%let dsid=%sysfunc(open(have));
%let nobs=%sysfunc(attrn(&dsid,nlobs));
%let dsid=%sysfunc(close(&dsid));
proc report data=have nowd completerows ;
columns RankVariable Variable1 CumulativePercent ;
define RankVariable/group format=fmt. preloadfmt;
define Variable1 /across;
define CumulativePercent /computed format=percent8.2;
compute CumulativePercent ;
cumpct+sum(_c2_,_c3_)/&nobs;
CumulativePercent=cumpct;
endcomp;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-06-2016 05:24 AM

Xia,

Thank you very much for your attention and providing this code.

But it seems that it takes too much time for 80.000 obs. I think for 80.000 obs, it can be better to do the calculation outside of PROC REPORT. What do you think?

Thank you

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-06-2016 07:15 PM

Cynthia,

Am I writing syntax incorrect?

```
proc format;
value fmt
0="0"
1="1"
2="2";
run;
Proc Tabulate Data=Have Missing Out=Have2;
Class RankVariable Variable1 / Preloadfmt ;
Table RankVariable , Variable1 / Printmiss;
Format RankVariable Variable1 $fmt.;
Run;
```

Thank you

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-06-2016 09:04 PM

Sure.

```
Data Have;
Length Variable1 8 RankVariable 8;
Infile Datalines Missover;
Input Variable1 RankVariable;
Datalines;
0 0
1 2
0 0
1 0
0 2
1 0
1 2
1 2
0 2
;
Run;
data level;
RankVariable=0;output;
RankVariable=1;output;
RankVariable=2;output;
run;
proc sql noprint;
select count(*) into : nobs from have;
create table temp as
select b.RankVariable,a.Variable1
from have as a full join level as b on
a.RankVariable=b.RankVariable;
create table want as
select RankVariable,sum(Variable1=0) as Variable1_0 label='0',
sum(Variable1=1) as Variable1_1 label='1'
from temp
group by RankVariable;
quit;
data want;
set want ;
CumulativePercent+(Variable1_0+Variable1_1)/&nobs;
format CumulativePercent percent8.2;
run;
proc report data=want nowd;
columns RankVariable ('Variable1' Variable1_0 Variable1_1) CumulativePercent ;
define RankVariable /display;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-07-2016 07:30 AM - edited 06-07-2016 07:32 AM

Thank you very much Xia,

Your sample code helped me so much, however I still didn't reach my desired output.

Let's say that my RankVariable has 10 values, to create my desired output, I modified your sample data as below . It is almost done. In the percentage column, I need to add one additional thing to reach my goal.

```
Data Level;
RankVariable =0;Output;
RankVariable =1;Output;
RankVariable =2;Output;
RankVariable =3;Output;
RankVariable =4;Output;
RankVariable =5;Output;
RankVariable =6;Output;
RankVariable =7;Output;
RankVariable =8;Output;
RankVariable =9;Output;
RankVariable =10;Output;
Run;
Proc Sql Noprint;
Select Count(*) Into : Nobs From Have;
Create Table Want As
Select B.RankVariable
,Sum(Variable1=0) As Variable1_0 Label='0'
,Sum(Variable1=1) As Variable1_1 Label='1'
From Have As A
Full Join Level As B
On A.RankVariable =B.RankVariable
Group By B.RankVariable ;
Quit;
Data Want2;
Set Want;
Variable1_0=Lag(Variable1_0);
Variable1_1=Lag(Variable1_1);
If Variable1_0=. Then Variable1_0=0;
If Variable1_1=. Then Variable1_1=0;
Run;
Data Want3;
Set Want2 ;
CumulativePercent+Variable1_1/&Nobs;
Format CumulativePercent Percent8.2;
Run;
Proc Report Data=Want2 Nowd;
Columns RankVariable ('Variable1_1' Variable1_0 Variable1_1) CumulativePercent ;
Define RankVariable / Display;
Run;
```

My current report output,

Desired Report;

PS: I need to get cumulative percentage values just for NPL_F_1, it is "Variable1_1" in my sample data set.

Can somebody still help me?

Thank you

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-07-2016 09:30 AM

Hi:

Since you are creating the Cumulative Percent yourself, you have 2 choices: multiply by 100 in the DATA step where you calculate Cumulative Percent OR do it in PROC REPORT. What is your DEFINE statement for the CumulativePercentage variable. The SAS PERCENT format does an automatic multiply by 100, why not try that? Or, just multiply by 100 in the DATA step if you don't want a Percent sign.

cynthia

Since you are creating the Cumulative Percent yourself, you have 2 choices: multiply by 100 in the DATA step where you calculate Cumulative Percent OR do it in PROC REPORT. What is your DEFINE statement for the CumulativePercentage variable. The SAS PERCENT format does an automatic multiply by 100, why not try that? Or, just multiply by 100 in the DATA step if you don't want a Percent sign.

cynthia

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

06-07-2016 09:58 AM

Thank you Cynthia,

So I have two choices,

I can do it in Data step or I can do it in Proc Report.

Actually, Xia prepared codes for both Data Step and Proc Report statements.

Because I have 80.000 obs in my real Data Set, it takes to much time when we use Proc Report statement. So I decided to create in Data Set.

Xia's code seems correct, but there are some diferences between Excel and SAS output. I think the following step needs to change. I multiplied with 100, but it didn't seem correct. I need to make some changes on Cumulative percentage step.

```
Data Want3;
Set Want2 ;
CumulativePercent+Variable1_1/&Nobs;
Format CumulativePercent Percent8.2;
Run;
```

I also tried to use PRELOADFMT. in my one of the previos post, but I got an error.

I'm sorry, I'm little bit confused I'm going to make some search more detailed. Let me check again.

Thank you