Hi,
I want to generate this table:
REGION | SI_PRI | PRI | Return rate |
ATLANTIC | 3570 | 3954 | 90.3% |
HQ | 10991 | 12165 | 90.3% |
ONTARIO | 10301 | 11830 | 87.1% |
PACIFIC | 4232 | 4768 | 88.8% |
PRAIRIES | 5279 | 6297 | 83.8% |
QUEBEC | 4399 | 4950 | 88.9% |
Total | 38772 | 43964 | 88.9% |
this is the code I wrote:
proc sql;
select REGION, count(SI_PRI) as SI_PRI, count (PRI)as PRI,
calculated SI_PRI/calculated PRI "Return rate" format=percent8.1
from work.Final2
group by REGION;
quit;
this is the output I receive.
REGION | SI_PRI | PRI | Return rate |
ATLANTIC | 3570 | 3954 | 90.3% |
HQ | 10991 | 12165 | 90.3% |
ONTARIO | 10301 | 11830 | 87.1% |
PACIFIC | 4232 | 4768 | 88.8% |
PRAIRIES | 5279 | 6297 | 83.8% |
QUEBEC | 4399 | 4950 | 88.9% |
how can I change my code to get the first table as an output?
Thanks,
Nazanin
Code not tested.
proc sql;
select REGION, count(SI_PRI) as SI_PRI, count (PRI)as PRI,
calculated SI_PRI/calculated PRI "Return rate" format=percent8.1
from work.Final2
group by REGION
union all
select 'TOTAL', count(SI_PRI) as SI_PRI, count (PRI)as PRI,
calculated SI_PRI/calculated PRI "Return rate" format=percent8.1
from work.Final2
;
quit;
Usually you don't use SQL to create end user reports.
take a look at PROC TABULATE and PROC REPORT.
PROC SQL is probably not the right tool here. I take it that you want a report and not a data set, correct? See PROC REPORT Documentation for examples of this kind of issue.
This comes up one every 20 posts or so. Please use the search functionality to look before posting. Here is one from yesterday:
for creating the dataset in proc sql try the below code
data have;
input REGION $ SI_PRI PRI Return_rate;
cards;
ATLANTIC 3570 3954 0.903
HQ 10991 12165 0.903
ONTARIO 10301 11830 0.871
PACIFIC 4232 4768 0.888
PRAIRIES 5279 6297 0.838
QUEBEC 4399 4950 0.889
;
proc sql;
create table want as
select 'Total' as region , sum(SI_PRI) as SI_PRI, sum(PRI) as PRI, sum(Return_rate) from have
union
select * from have;
quit;
Code not tested.
proc sql;
select REGION, count(SI_PRI) as SI_PRI, count (PRI)as PRI,
calculated SI_PRI/calculated PRI "Return rate" format=percent8.1
from work.Final2
group by REGION
union all
select 'TOTAL', count(SI_PRI) as SI_PRI, count (PRI)as PRI,
calculated SI_PRI/calculated PRI "Return rate" format=percent8.1
from work.Final2
;
quit;
That is just perfect.
It works very well.
Thanks a lot,
Nazanin
I don't think the code from @Ksharp gives correct answers in the presence of missing values for PRI or SI_PRI.
Thanks,
I have missing values in SI-PRI.
why do you think it will not give a correct answer?
Nazanin
Because the count function treats them the same as if the values were not missing.
Ou. I know what Paige means.
If you want include missing value into TOTAL row.
Try:
union all
select 'TOTAL', count(*) as SI_PRI, count (*)as PRI,
calculated SI_PRI/calculated PRI "Return rate" format=percent8.1
from work.Final2
;
the first one was correct.
I want the sum of the above rows!
the second one gives me the total of all rows!
Thanks,
Nazanin
Well, not so fast.
The problem you will have is if either PRI or SI_PRI is missing and the other one is not missing. The end result, the ratio, still will not be correct if this happens.
this is what I get:
REGION | SI_PRI | PRI | Return rate |
ATLANTIC | 3570 | 3954 | 90.3% |
HQ | 10991 | 12165 | 90.3% |
ONTARIO | 10301 | 11830 | 87.1% |
PACIFIC | 4232 | 4768 | 88.8% |
PRAIRIES | 5279 | 6297 | 83.8% |
QUEBEC | 4399 | 4950 | 88.9% |
TOTAL | 38772 | 43964 | 88.2% |
it seems correct 🙂
I always have missing in SI_PRI because of that I calculate the return rate to see how many responded.
Ok, that's great it works for your data.
But for any other data sets and for anyone else reading along, in general, this process of performing statistics in PROC SQL on two different variables and then taking the ratio is not going to work if one variable is missing and the other variable is present. I see this done often to compute weighted averages in SQL, and as stated, missing values are not handled properly and the wrong value results. This is why I always recommend PROC SUMMARY to do weighted averages and the like, because the missings are handled properly. PROC SUMMARY also has the benefit of automatically computing totals for you, and you don't have to write and debug extra SQL code just to get the totals.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.