BookmarkSubscribeRSS Feed
Ronein
Meteorite | Level 14

Hello

Let's say that I have a data  set that includes totals for each category and in last row there is grand total.

The task is to calculate percentage of total in each category from the grand total.

(So we need to add a new column called: "PCT" )

 

Data tbl;
input category $  total;
cards;
a  20
b  50
c  80
d  40
e  10
Total 200
;
run;

 

 

4 REPLIES 4
karolis_b
Fluorite | Level 6

Hi!

 

To achieve a desired result there are many different approaches. One of them could be using macro variable:

data _null_;
   set tbl end=end;
   if end then call symput('total',total);
   /* or */
   /*if category = 'Total' then call symput('total',total);*/
run;

data tbl_pct;
   set tbl;
   format pct percent6.2;
   pct = total / &total.;
run;

Good luck!

 

- Karolis

Kurt_Bremser
Super User

You can retrieve the value of the last observation by using the nobs= and point= option:

Data have;
input category $ total;
cards;
a  20
b  50
c  80
d  40
e  10
Total 200
;
run;

data want;
if _n_ = 1 then set have (keep=total rename=(total=grand)) nobs=nobs point=nobs;
set have;
pct = total / grand;
format pct percent8.2;
drop grand;
run;

proc print data=want noobs;
run;

Result:

category    total         pct

 a            20      10.00% 
 b            50      25.00% 
 c            80      40.00% 
 d            40      20.00% 
 e            10       5.00% 
 Total       200      100.0% 

 

Ronein
Meteorite | Level 14

Can you please explain this code:

data want;
if _n_ = 1 then set have (keep=x rename=(x=grand)) nobs=nobs point=nobs;
set have;
run;

 

_n_=1  means that it is going to first observation??

But grand value is in last observation.

I see that in this code there are 2 set statement.

It works perfect but I just dont understand the code

 

 

 

Kurt_Bremser
Super User

_n_ is an automatic variable that holds the count of data step iterations. Here, I use it to execute the first set statement only in iteration one (before anything else is done).

The nobs= and point= options are explained in the documentation of the Set Statement.

Used like this, they do a direct (random access) read of the last observation of the dataset. This prevents a full scan of the dataset, which will improve performance if the dataset is large.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1151 views
  • 1 like
  • 3 in conversation