I have a dataset (generated from proc freq) organized as follows:
data have;
input tickets $ count;
datalines;
0 600
1 278
2 992
;
run;
Two questions:
1) How can I calculate the total number of tickets sold from these output counts?
Total should = 0*600 + 1*278 + 2*992
data want;
length total 8.;
set have;
total = tickets*count;
run;
My current code generates a total by row but not a total for the whole dataset.
2) Is there a way to do these steps iteratively, for multiple different variables (e.g. different concerts) in the same dataset so that the final dataset generated is organized like this?
concert1 total1
concert2 total2
concert3 total3
Thank you.
Why use PROC FREQ at all, if you want sums?
data original;
input id concert1 concert2 concert3;
datalines;
1 1 1 2
2 1 2 2
3 2 0 0
4 0 2 0
5 1 2 0
;
run;
proc summary data=original;
var concert1-concert3;
output out=_sums_ sum=/autoname;
run;
This is a weighted sum. PROC SUMMARY makes this easy
data have;
input tickets count; /* Note: you had tickets as a character variable, I have changed it in this code to numeric */
datalines;
0 600
1 278
2 992
;
proc summary data=have;
var count;
weight tickets;
output out=total sum=total;
run;
Regarding your question 2, it would certainly help for you to show an example of this data set with "multiple different variables (e.g. different concerts)". I suspect you mean you have another variable (which I will call concert_name) and the same variables count and tickets. In this case, adding class concert_name; into the PROC SUMMARY code will get you the results you want. But really, show us the data set that has multiple concerts.
Thank you.
The original dataset looks like this, with one observation per customer (only the first 5 customers displayed), and the number of tickets they purchased for each of 3 different concerts:
data original;
input id concert1 concert2 concert3;
datalines;
1 1 1 2
2 1 2 2
3 2 0 0
4 0 2 0
5 1 2 0
;
run;
dataset have below was generated using proc freq from the first concert (have 1)...
proc freq data=original;
table concert1 / nocum nopercent out=have1;
table concert2 / nocum nopercent out=have2;
table concert3 / nocum nopercent out=have3;
run;
This is very different data than your original post, where you wanted to multiply the number of tickets times a variable named COUNT, and then sum everything up.
Now, it seems like you do not want to multiply the number of tickets times a count and sum everything up. Now it seems like you want to just get a frequency (PROC FREQ) for each variable named CONCERT1 CONCERT2 CONCERT3 ... am I understanding that properly? Please state clearly: What is the desired output from this new data?
Sorry for the confusion.
I put the PROC FREQ lines there to indicate how I generated the first 'have' dataset for a single concert.
The PROC SUMMARY works well to calculate the total number of tickets from a single concert. Is there a way to combine all three total calculations (organized by concert) into a new dataset without running three PROC SUMMARY commands and then manually entering results into a new dataset?
Why use PROC FREQ at all, if you want sums?
data original;
input id concert1 concert2 concert3;
datalines;
1 1 1 2
2 1 2 2
3 2 0 0
4 0 2 0
5 1 2 0
;
run;
proc summary data=original;
var concert1-concert3;
output out=_sums_ sum=/autoname;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.