Hello,
I have some data from groups of people spending money at a number of stores in a mall. I'm trying to calculate the percentage of people who spent at least 25 or more and at least 30 or more using proc report. I've attached the sas file and some of my code so far, but it is pretty basic. I assume I need to use sum and pctsum, but I don't how to incorporate them into the code correctly.
Any help is much appreciated
proc report data=mydata.house;
column id spent;
define id / display;
define spent / display;
title1 'Expenditure per group';
run;
It is better to post data in the form of datastep code such as generated with this: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
SAS data sets from different operating systems may not work, depending on the options in effect a 9.4 data set may not run with 9.2 or 9.3, 32-bit vs 64-bit, lanuage encoding may cause issues. With a data step everyone can run the code on their system to recreate a data set.
When you say:
" I'm trying to calculate the percentage of people who spent at least 25 or more and at least 30 or more using proc report."
is the 25 and 30 you reference a number of purchases (n) a total currency amount or some percent used as a group? When you mention percent you should clearly indicate what a numerator and denominator may be.
If you provide a small data set that you can calculate the result by hand then it is a very good idea to show what you want the output to look like. Otherwise we have to make assumptions and then you get to say "what I really wanted was xxxxx" so start with the XXXX spelled out.
You'll need to provide more context and perhaps some data.
"people who spent at least 25 or more"
I don't know what that means. People who had spending at 25 stores? People who spent $25? People who exhausted themselves (became spent) 25 times?
data spending;
input spending;
datalines;
25
20
15
30
35
40
35
60
;
run;
PROC FORMAT;
VALUE spending_fmt
0-25 = "Spent $25 or less"
26 - HIGH = "Spent more than $25";
RUN;
proc freq data=spending;
tables spending / missing;
FORMAT spending spending_fmt.;
run;
The FREQ Procedure Cumulative Cumulative spending Frequency Percent Frequency Percent Spent $25 or less 3 37.50 3 37.50 Spent more than $25 5 62.50 8 100.00
Thank you very much for your input so far. I'm still working through how to use Mark Jordan's program so I've just posted an example of my code.
To clarify re: percentage, each ID represents a person. The 5 1s represent separate purchases that 1 person made.
i.e. person 1 bought 5 items of varying cost. Person 3 bought 4 items of varying cost.
I would like to calculate the percentage of people who spent more than 25 and more than 30 using proc report.
15.48 | 1 |
10.91 | 1 |
11.47 | 1 |
13.32 | 1 |
14.01 | 1 |
23.08 | 3 |
10.38 | 3 |
14.01 | 3 |
17.98 | 3 |
527.18 | 4 |
625.6 | 4 |
642.27 | 4 |
22.3 | 5 |
48.43 | 5 |
49.48 | 5 |
3.79 | 6 |
2.63 | 6 |
3.46 | 6 |
6.57 | 6 |
16.41 | 6 |
Thanks again
More than $25 in total or had at least one purchase exceeding $25? 100 percent of those ID's spent more than $25 in total. 20 percent of those ID's had at least one expenditure exceeding $25 dollars.
Which is the correct answer?
Edit:
Just for amusement.
data expenditures;
input spending 8.2 id;
datalines;
15.48 1
10.91 1
11.47 1
25.32 1
14.01 1
23.08 3
10.38 3
14.01 3
17.98 3
527.18 4
625.6 4
642.27 4
22.3 5
48.43 5
49.48 5
3.79 6
2.63 6
3.46 6
6.57 6
16.41 6
;
run;
data flagged_expenditures;
set expenditures;
big_spender = 0;
if spending >= 25 then big_spender = 1;
run;
proc sort data = flagged_expenditures;
by id descending big_spender;
run;
data big_spenders;
set flagged_expenditures;
by id;
if first.id;
run;
PROC FORMAT;
VALUE big_spender_fmt
0 = "Did not spend more than $25 on one purchase"
1 = "Spent more than $25 on one purchase";
RUN;
proc freq;
tables big_spender;
format big_spender big_spender_fmt.;
run;
yields
The FREQ Procedure Cumulative Cumulative big_spender Frequency Percent Frequency Percent Did not spend more than $25 on one purchase 2 40.00 2 40.00 Spent more than $25 on one purchase 3 60.00 5 100.00
2nd edit:
And just becasue we can and it amuses me:
proc sql;
create table big_spenders2 as
select 'Big Spender' as status, count(a.id) as mycount
from (
select id, sum(case when spending > 25 then 1 else 0 end) as big_spender
from expenditures
group by id) as a
where a.big_spender >0
union
select 'Little Spender' as status, count(a.id) as mycount
from (
select id, sum(case when spending > 25 then 1 else 0 end) as big_spender
from expenditures
group by id) as a
where a.big_spender = 0;
quit;
proc sql;
create table big_spender_percentage as
select status, mycount / (select sum(mycount) from big_spenders2) as mypercent
from big_spenders2;
quit;
proc print;
run;
yeilds
Obs status mypercent 1 Big Spender 0.6 2 Little Spender 0.4
I'm interested in at least one purchase >25, who made 2 purchases>25 and then who made 3 purchases >25. After that I was planning on calculating it for different amounts (i.e. 30, 50 etc), but I was hoping I'd be able to do that myself.
Thank you very much for your help.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.