This question branches off from an earlier one that I received a solution on from Tom.
The earlier problem involved formatting a date then getting a distinct count of cases by date. The code was
Proc sql; create Table SASCHART.Setup_Case as /*Setup_Contact */ select input(day_mon,date9.) as date format=yymmdd10. , count(distinct CaseID) as n_cases /*, count(distinct Contact_Person_ID) as n_contacts*/ from SASCHART.Date_Fix_Data group by 1 ;
Producing data for a time series graph on which was added another variable time series.
The output dataset has two fields
Date n_cases
2020-27-02 1
2020-28-02 5
2020-15-06 345
.
2021-23-09 2435
and so on - adding up all cases for each date
There are 362K rows in SASCHART.Date_Fix_Data. After running the above code to get counts by distinct CaseID the resultant dataset has 603 rows.
What I need to do is utilize the "From dataset SASCHART.Date_Fix_Data" in the query which has a field for County which looks like
Date CaseID County
2020-27-02 3452435 Jackson
.
.
2021-07-07 3425727 Polk
2021-07-07 3425363 Polk
and so on.
What I want is to count the distinct caseIDs this time not by date but by County to get something like
County Cases_County_count
Baker 25
Columbia 344
Malheur 115
Hood River 554
.
.
Washington 11,098
So the counts are the count of distinct CaseID by distinct county. I don't need the date as a field in this data set.
I have tried to just add county to the SAS code shown
Proc sql; create table SASCHART.Setup_Case_County AS select input(day_mon,date9.) as date format=yymmdd10. , count(distinct CaseID) as n_cases /*, count(distinct county) as county*/ from SASCHART.Date_Fix_Data group by county;
I commented the line that I added. However, when I do it this way I get a count for each record (so the data set instead of being the number of counties is 365K again. Looking like
Date n_cases County
2021-05-08 2520 1
and so on
What I would like is what was shown above
County Cases_County_count
Baker 25
Columbia 344
Malheur 115
Hood River 554
How do I work the query to produce the needed output dataset.
Thank you.
wlierman
It would help if you would always provide sample data created via a SAS data step and also show us the desired result. Love it that you're referencing an previous discussion but would be great if you'd also post the link to it.
Below should work if I understand it right what you're after.
proc sql;
create table saschart.setup_case_county as
select
county
, count(distinct caseid) as n_cases
from saschart.date_fix_data
group by county
;
quit;
It would help if you would always provide sample data created via a SAS data step and also show us the desired result. Love it that you're referencing an previous discussion but would be great if you'd also post the link to it.
Below should work if I understand it right what you're after.
proc sql;
create table saschart.setup_case_county as
select
county
, count(distinct caseid) as n_cases
from saschart.date_fix_data
group by county
;
quit;
Patrick
Thank you for the assistance.
wlierman
It's simple count by distinct case_id:
select county, count(distinct case_id) as count
from .....
group by county
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.