BookmarkSubscribeRSS Feed
aya_h
Calcite | Level 5

New to SAS -- I know this is probably silly to most of you but it's taken me hours. I have a dataset that looks like this:

 

COMMODITYCOUNTRYQUANTITY1QUANTITY2QUANTITY3
010ARGENTINA283
010ANDORRA3102
010TOTAL FOR ALL COUNTRIES5185
020BELGIUM10630
020RUSSIA15420
020TOTAL FOR ALL COUNTRIES251050

 

There are many commodity codes, countries, and quantities, so doing it manually would be a pain. I am trying to use Quantity 1 values as a base quantity (so the others should be untouched) to generate a percentage report to see which country generates what % of world total quantity for a given commodity code.

 

Ideally that would be represented in a different column like so:

 

COMMODITYCOUNTRYQUANTITY1% of totalQUANTITY2QUANTITY3
010ARGENTINA240%83
010ANDORRA360%102
010TOTAL FOR ALL COUNTRIES5100%185
020BELGIUM1550%630
020RUSSIA1550%420
020TOTAL FOR ALL COUNTRIES30100%1050

 

Later in my program, the rows that do not exceed 3% will be omitted from the final dataset, so I need this column to be permanent.

 

Thank you so much!

2 REPLIES 2
SASJedi
SAS Super FREQ

How about something like this?

data have;
infile datalines dsd dlm='|';
input COMMODITY	COUNTRY:$25. QUANTITY1	QUANTITY2	QUANTITY3;
datalines;
010|ARGENTINA|2|8|3
010|ANDORRA|3|10|2
010|TOTAL FOR ALL COUNTRIES|5|18|5
020|BELGIUM|10|6|30
020|RUSSIA|15|4|20
020|TOTAL FOR ALL COUNTRIES|25|10|50
;

proc sql;
create table want as
select have.*, quantity1/total as Pct '% of Total' format=percent7.
	from have
		inner join
	(select commodity, quantity1 as Total from have where Country = 'TOTAL FOR ALL COUNTRIES') as tot
	on have.commodity=tot.commodity
;
select * from want;
quit;

 Result:

COMMODITY COUNTRY QUANTITY1 QUANTITY2 QUANTITY3 % of Total
10 ARGENTINA 2 8 3 40%
10 ANDORRA 3 10 2 60%
10 TOTAL FOR ALL COUNTRIES 5 18 5 100%
20 BELGIUM 10 6 30 40%
20 RUSSIA 15 4 20 60%
20 TOTAL FOR ALL COUNTRIES 25 10 50 100%
Check out my Jedi SAS Tricks for SAS Users
ballardw
Super User

First, please provide example data in the form of data step code pasted into a text or code box  opened by clicking on the </> or "running man" icon above the message box. Example:

data have;
   input commodity $ Country :$10. quantity1 quantity2 quantity3;
datalines;
010	ARGENTINA	2	8	3
010	ANDORRA	3	10	2
020	BELGIUM	10	6	30
020	RUSSIA	15	4	20
;

Second, including intermediate summary rows like your Total for commodity actually makes it harder in many respects because that row of data needs to be treated differently. So I have excluded those.

 

If I understand what you want this may be one way (using the above data set)

proc tabulate data=have out=example;
   class commodity country;
   var quantity: ;
   tables commodity,
         (country All='All countries'),
          quantity: *(sum='Number'*f=best5. pagepctsum ='% of total')
   ;
run;

The use of Quantity: above is a list generator and will use all the variables whose names start with quantity.

The above code creates a table for each commodity in the results window. The Out= option on the proc statement does create a data set with the summary values though it has a bit of extra info.

The Country value will be blank where where the _type_ variable is 10, which means that it is the "All countries" row.

Proc Tabulate requires a statistic for your quantity variable and Sum is the choice so the All countries row has a total. The output data set will have these in the Quantity1_sum, Quantity2_sum (continue the pattern).

The percents will be in variables named Quantity1_pctsum_10_quantity1 (the 10 is the same as the _type_ to show which total was used as the denominator).

 

Since you didn't provide any example of data where the % do not exceed 3 or show what you expect to create when that occurs I can't help past that.

It may be because of "delete row" rule that this might make an easier to use data set:

proc tabulate data=have out=example;
   class commodity country;
   var quantity: ;
   tables commodity,
         (country All='All countries'),
          quantity1 *(sum='Number'*f=best5. pagepctsum ='% of total')
   ;
   tables commodity,
         (country All='All countries'),
          quantity2 *(sum='Number'*f=best5. pagepctsum ='% of total')
   ;
   tables commodity,
         (country All='All countries'),
          quantity3 *(sum='Number'*f=best5. pagepctsum ='% of total')
   ;
run;

As each quantity variable will have a separate block of values (rows) to work with.

 

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 344 views
  • 2 likes
  • 3 in conversation