BookmarkSubscribeRSS Feed
Bal23
Lapis Lazuli | Level 10

I use proc freq to generate numbers. Then, in steading of print out the dataset and calculate later by myself, I want SAS to do extra calculation, to calculate the sum of number of each column, and generate another table, the percentage of each number divided by the sum*100. That is, to have two variables, sum and rate.

My sample data is attached

Would you please recommend a sas code?

Thanks

12 REPLIES 12
ballardw
Super User

Please show the desired output for the example data you show in the attachment.

Also, does the output have to be a data set or are you looking for a report table.

Bal23
Lapis Lazuli | Level 10

I have regenerated sample data and output in an excel file attached

i am interested in both generating a dataset and a report table.

 

Bal23
Lapis Lazuli | Level 10

if you cannot open it, here is the pdf file for the output, but the original one is the left one

ballardw
Super User

Also, if your example data is the result from proc freq it would probably be better to provide the raw data as the the types of sums and rates your are looking for are possible from various report procedures.

Reeza
Super User

Proc freq generates percents, rows or columns for your data as well. The output should contain on of the numbers you're looking for, and you might be better off trying to manipulate that output rather than calculate information from scratch. 

 

RTM for options on the TABLES statement.

Steelers_In_DC
Barite | Level 11

Not very elegant but here is a solution:

 

data have;
infile cards dsd dlm=' ';
input obs condition$ year2010 year2011 year2012;
cards;
1 a 12 23 45
2 b 23 45 57
3 c 45 57 89
4 d 57 89 20
5 e 89 20 29
6 f 20 29 59
;run;

proc transpose data=have out=tran_have(rename=(_NAME_=Year));by obs condition;var year:;

proc sql;
create table sum as
select *,sum(col1) as sum
from tran_have
group by year
order by condition;

proc transpose data=sum out=tran_sum;by condition;id year;var sum;

proc sql;
create table want as
select a.year2010/b.year2010*100 as year2010,
       a.year2011/b.year2011*100 as year2011,
       a.year2012/b.year2012*100 as year2012
from have a left join
     tran_sum b on
a.condition=b.condition;

ballardw
Super User

And another using data in the first post for input:

/* first get some data to manipulate*/
data junk;
   input obs condition $  Yr2010 Yr2011  Yr201Yr2 Yr2013 Yr2014 Yr2015 Yr2016;
   array y  Yr2010 Yr2011  Yr201Yr2 Yr2013 Yr2014 Yr2015 Yr2016;
   /* this reshapes the data into something that proc tabulate 
   likes better*/
   do i= 1 to dim(y);
      year= 2009+i;
      value = y[i];
      output;
   end;
   keep obs condition year value;
datalines;
1 a                  31     7      98             41     68    41     68
2 b                  93     28     99             32     53    32     53
3 c                  66     2      33             68     44    68     44
4 d                  93     80     58             50     41    50     41
5 e                  74     11     93             73     39    73     39
6 f                  13     71     45             35     39    35     39
7 g                  72     63     61             45     42    45     42
8 h                  57     53     64             53     16    53     16
9 i                   8     14     81             21     92    21     92
;
run;

proc tabulate data=junk;
   class obs condition year;
   var value;
   table obs*condition all='Sum', 
         year='' * value=''*sum*f=best8. 
         year='' * value=''*colpctsum='Rate'*f=f16.8;
run;

If the original data used for proc freq consisted of Condition Year and a variable being counted the above might be sufficient changing SUM to N and colpctsum to colpctn and the Value could be left out. Maybe.

 

Bal23
Lapis Lazuli | Level 10

, thank you.

I got the error message

 


ERROR: There are multiple analysis variables associated with a single table cell in the following
nesting : category * value * year * value * sum * f.

 

actually the real data is not year from 2010 to 2016, it is like female2010, female2011, male2010, male2011, trans2010, trans2011, unknow2010, unknow2011

 

Would you please provide an adjusted code. And there is no need for including ID. Thanks. It is ordered by frequency. that is already done prevoiusly

just keep the order

Bal23
Lapis Lazuli | Level 10

Thank you. What is there are some missing values in some blanks. Any adjustment for that?

is "DSD" for this purpose?

ballardw
Super User

"Missing values in some blanks" implies data related issues but we need more infomation to see what you mean. Such the input data and the code applied.

 

DSD is only relavent when reading data from or writing to an external file. When reading a file that is delimited, when the delimiter occurs encolsed by quotation marks the delimiter is ignored and 2 consecutive delimiters indicate a missing value.

 

 

 

 

Bal23
Lapis Lazuli | Level 10

Thanks. I have several errors when i run this code.

1. at first, i did not set a variable name as "year"

your code is year2011 2012 etc

so the error was

ERROR: Variable YEAR not found.
2. Then I set my variable name as "year",  although it is not perfect, because it is really not simple year. the first row is more about different catogies with different year. My task is like calculate "child2003" "child2004" "adult2003" "adult2004" "senior2003" "senior2004" so it is not a simple year but a mix of year and categories

my another variable is "category" instead of "condition",

 

then the error showed below

 

ERROR: Data set WORK.HAVE is not sorted in ascending sequence. The current BY group has Category = Pi
and the next BY group has Category = Is.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 3 observations read from the data set WORK.HAVE.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 2483 views
  • 0 likes
  • 4 in conversation