Hello
I want to create report that will have 6 columns:
Team
Nr_Obs (It is number of Customers)
PCT (Percent of customers from Total )
PCT_Without_A(Percent of customers from Total without included group 'A')
Sum_Wealth (Total wealth)
PCT_Wealth_Without_A(Percent of Wealth from Total without included group 'A')
Nr_Condition (Number of customers that meet specific condion: EU AND Male )
What is the way to do it via:
1-PROC REPORT(If possible)
2-PROC SQL(If possible)
3-PROC TABULATE (If possible)
Thanks
Desired report
Data have;
Input CustID Team $ wealth Area $ Sex $;
cards;
1 A 0 EU M
2 A 0 US M
3 A 0 US M
4 B 2000 EU M
5 B 2000 EU M
6 B 3000 EU M
7 B 4000 US M
8 B 4000 US M
9 C 3000 US M
10 C 2000 US M
11 C 2000 US F
12 C 5000 US F
13 C 10000 US F
14 C 7000 US F
15 C 3000 US F
16 C 2000 EU F
17 C 4000 EU F
18 C 9000 EU F
19 C 2000 US M
20 C 31000 US M
21 D 10000 EU F
22 D 30000 EU M
;
Run;
Data have;
infile cards expandtabs;
Input CustID Team $ wealth Area $ Sex $;
cards;
1 A 0 EU M
2 A 0 US M
3 A 0 US M
4 B 2000 EU M
5 B 2000 EU M
6 B 3000 EU M
7 B 4000 US M
8 B 4000 US M
9 C 3000 US M
10 C 2000 US M
11 C 2000 US F
12 C 5000 US F
13 C 10000 US F
14 C 7000 US F
15 C 3000 US F
16 C 2000 EU F
17 C 4000 EU F
18 C 9000 EU F
19 C 2000 US M
20 C 31000 US M
21 D 10000 EU F
22 D 30000 EU M
;
Run;
proc sql;
create table want as
select Team,count(distinct CustID) as Nr_Obs,
calculated Nr_Obs/(select count(distinct CustID) from have) as PCT format=percent8.2,
case when Team='A' then .
else calculated Nr_Obs/(select count(distinct CustID) from have where Team ne 'A')
end as PCT_Without_A format=percent8.2,
sum(wealth) as Sum_Wealth,
case when Team='A' then .
else calculated Sum_Wealth/(select sum(wealth) from have where Team ne 'A')
end as PCT_Wealth_Without_A format=percent8.2,
(select count(distinct CustID) from have where Team=a.Team and Area='EU' and Sex='M') as Nr_Condition
from have as a
group by Team
union all
select 'TOTAL',(select count(distinct CustID) from have),1,1,(select sum(wealth) from have),1,
(select count(distinct CustID) from have where Area='EU' and Sex='M')
from have(obs=1);
quit;
PROC FREQ will give you the percents and counts. PROC SUMMARY will give you the sums. Run those, and then merge the results by TEAM. Then PROC PRINT gives you the report.
@Ronein wrote:
May you please show how to calculated these 3 variables :
Nr_Obs (It is number of Customers)
PCT (Percent of customers from Total )
This is PROC FREQ, the results wind up in an output data set. Its very simple, give it a try.
PCT_Without_A(Percent of customers from Total without included group 'A')
This is another run of PROC FREQ without group A, the results wind up in another output data set.
Data have;
infile cards expandtabs;
Input CustID Team $ wealth Area $ Sex $;
cards;
1 A 0 EU M
2 A 0 US M
3 A 0 US M
4 B 2000 EU M
5 B 2000 EU M
6 B 3000 EU M
7 B 4000 US M
8 B 4000 US M
9 C 3000 US M
10 C 2000 US M
11 C 2000 US F
12 C 5000 US F
13 C 10000 US F
14 C 7000 US F
15 C 3000 US F
16 C 2000 EU F
17 C 4000 EU F
18 C 9000 EU F
19 C 2000 US M
20 C 31000 US M
21 D 10000 EU F
22 D 30000 EU M
;
Run;
proc sql;
create table want as
select Team,count(distinct CustID) as Nr_Obs,
calculated Nr_Obs/(select count(distinct CustID) from have) as PCT format=percent8.2,
case when Team='A' then .
else calculated Nr_Obs/(select count(distinct CustID) from have where Team ne 'A')
end as PCT_Without_A format=percent8.2,
sum(wealth) as Sum_Wealth,
case when Team='A' then .
else calculated Sum_Wealth/(select sum(wealth) from have where Team ne 'A')
end as PCT_Wealth_Without_A format=percent8.2,
(select count(distinct CustID) from have where Team=a.Team and Area='EU' and Sex='M') as Nr_Condition
from have as a
group by Team
union all
select 'TOTAL',(select count(distinct CustID) from have),1,1,(select sum(wealth) from have),1,
(select count(distinct CustID) from have where Area='EU' and Sex='M')
from have(obs=1);
quit;
Great and thank you,
In order to learn I would like to ask please some questions:
1- Why did you add (OBS=1) ? Will it not working well without it?
Theortically when you wrote obs=1 it means that it will select the first row but I think that anyway the result of second query is one row.
2-I see that for the total row you didnt write the columns names .
Is it not essential in proc sql to write the columns names?
How does it know that the column names of the second query (For total row) are idential to column names of first query ?
Thank you again
I want to ask please-
Why the following query (Without obs=1) produce multiple identical rows?
Is it because you mix summary statistics and fixed values (1)?
Where can I read the theory about it? What should I search in google?
select 'TOTAL',
(select count(distinct CustID) from have),
1,
1,
(select sum(wealth) from have),
1,
(select count(distinct CustID)
from have
where Area='EU' and Sex='M'
from have;
You SELECT FROM have, so the action(s) are done for each observation (record) in have.
But the summary statistics is calclated for all data together
select count(distinct CustID) from have
will produce one number only.
Do you mean that there is "Remerge" ? (Merge summary statistics with raw data )?
A Proc Tabulate solution:
It is often easier to create 1/0 (and missing to exclude) coded variables to use as VAR variables and SUM to get counts and PCTSUM or Mean to get percentage of 1s depending on specific percentage wanted.
Note: your example of Percent wealth without A is going to require missing values for Wealth in some form for the A records, otherwise the count of observations gets used in the percentage calculation.
Data have; Input CustID Team $ wealth Area $ Sex $; WithoutA = (Team ne 'A'); if team ne 'A' then WealthWithoutA= wealth; Condition= (area='EU' and Sex='M'); cards; 1 A 0 EU M 2 A 0 US M 3 A 0 US M 4 B 2000 EU M 5 B 2000 EU M 6 B 3000 EU M 7 B 4000 US M 8 B 4000 US M 9 C 3000 US M 10 C 2000 US M 11 C 2000 US F 12 C 5000 US F 13 C 10000 US F 14 C 7000 US F 15 C 3000 US F 16 C 2000 EU F 17 C 4000 EU F 18 C 9000 EU F 19 C 2000 US M 20 C 31000 US M 21 D 10000 EU F 22 D 30000 EU M ; Run; proc tabulate data=have; class team; var wealth wealthwithoutA withoutA condition; table team=' ' All='Total', n='Nr_obs' pctn='Pct'*f=5.1 withouta=' '*pctsum='Pct_without_A'*f=5.1 wealthwithoutA=' '*(sum='Sum_wealth'*f=8. pctsum='Pct_Weatlh_without_A' *f=5.1) condition=' '*sum='NR_Condition'*f=4. / box='Team' misstext=' ' ; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.