BookmarkSubscribeRSS Feed
Ronein
Meteorite | Level 14

Hello

Let's say that I have a data  set that includes totals for each category and in last row there is grand total.

The task is to calculate percentage of total in each category from the grand total.

(So we need to add a new column called: "PCT" )

 

Data tbl;
input category $  total;
cards;
a  20
b  50
c  80
d  40
e  10
Total 200
;
run;

 

 

4 REPLIES 4
karolis_b
Fluorite | Level 6

Hi!

 

To achieve a desired result there are many different approaches. One of them could be using macro variable:

data _null_;
   set tbl end=end;
   if end then call symput('total',total);
   /* or */
   /*if category = 'Total' then call symput('total',total);*/
run;

data tbl_pct;
   set tbl;
   format pct percent6.2;
   pct = total / &total.;
run;

Good luck!

 

- Karolis

Kurt_Bremser
Super User

You can retrieve the value of the last observation by using the nobs= and point= option:

Data have;
input category $ total;
cards;
a  20
b  50
c  80
d  40
e  10
Total 200
;
run;

data want;
if _n_ = 1 then set have (keep=total rename=(total=grand)) nobs=nobs point=nobs;
set have;
pct = total / grand;
format pct percent8.2;
drop grand;
run;

proc print data=want noobs;
run;

Result:

category    total         pct

 a            20      10.00% 
 b            50      25.00% 
 c            80      40.00% 
 d            40      20.00% 
 e            10       5.00% 
 Total       200      100.0% 

 

Ronein
Meteorite | Level 14

Can you please explain this code:

data want;
if _n_ = 1 then set have (keep=x rename=(x=grand)) nobs=nobs point=nobs;
set have;
run;

 

_n_=1  means that it is going to first observation??

But grand value is in last observation.

I see that in this code there are 2 set statement.

It works perfect but I just dont understand the code

 

 

 

Kurt_Bremser
Super User

_n_ is an automatic variable that holds the count of data step iterations. Here, I use it to execute the first set statement only in iteration one (before anything else is done).

The nobs= and point= options are explained in the documentation of the Set Statement.

Used like this, they do a direct (random access) read of the last observation of the dataset. This prevents a full scan of the dataset, which will improve performance if the dataset is large.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 942 views
  • 1 like
  • 3 in conversation