I have a data like this
case | IncidentID | Treatment_Date1 | colldate | #VALUE! | Specimen | TestType | Result | |
Alert | 2773233 | 773233 | 7/6/2017 | 7/6/2017 | 0.00 | Urethral | Culture | GC Positive |
Alert | 2783394 | 783394 | 7/17/2017 | 7/17/2017 | 0.00 | Pharyngeal | Culture | GC Positive |
Alert | 2785878 | 785878 | 8/1/2017 | 8/1/2017 | 0.00 | Pharyngeal | Culture | GC Positive |
Alert | 2976906 | 976906 | 11/27/2017 | 11/27/2017 | 0.00 | Urethral | Culture | GC Positive |
Alert | 2976916 | 976916 | 11/29/2017 | 11/29/2017 | 0.00 | Endocervical | Culture | GC Positive |
Alert | 2976916 | 976916 | 11/29/2017 | 11/29/2017 | 0.00 | Pharyngeal | Culture | GC Positive |
Alert | 2979830 | 979830 | 11/29/2017 | 11/29/2017 | 0.00 | Rectal | Culture | GC Positive |
Alert | 2979830 | 979830 | 11/29/2017 | 11/29/2017 | 0.00 | Pharyngeal | Culture | GC Positive |
Alert | 3005182 | 005182 | 12/12/2017 | 12/12/2017 | 0.00 | Urethral | Culture | GC Positive |
Alert | 3005182 | 005182 | 12/12/2017 | 12/12/2017 | 0.00 | Rectal | Culture | GC Positive |
Alert | 3005185 | 005185 | 12/12/2017 | 12/12/2017 | 0.00 | Rectal | Culture | GC Positive |
Alert | 3010768 | 010768 | 12/19/2017 | 1/3/2018 | -15.00 | Urethral | Culture | GC Positive |
Alert | 3010768 | 010768 | 12/19/2017 | 1/3/2018 | -15.00 | Pharyngeal | Culture | GC Positive |
Alert | 3013588 | 013588 | 12/15/2017 | 12/15/2017 | 0.00 | Urethral | Culture | GC Positive |
Alert | 3059401 | 059401 | 1/9/2018 | 1/9/2018 | 0.00 | Urethral | Culture | GC Positive |
Alert | 3101175 | 101175 | 2/22/2018 | 2/22/2018 | 0.00 | Urethral | Culture | GC Positive |
Alert | 3101175 | 101175 | 2/22/2018 | 2/22/2018 | 0.00 | Rectal | Culture | GC Positive |
Alert | 3109493 | 109493 | 2/23/2018 | 2/23/2018 | 0.00 | Endocervical | Culture | GC Positive |
Alert | 3169027 | 169027 | 4/27/2018 | 4/27/2018 | 0.00 | Urethral | Culture | GC Positive |
Alert | 3190088 | 190088 | 5/21/2018 | 5/23/2018 | -2.00 | Urethral | Culture | GC Positive |
Alert | 3238919 | 238919 | 7/24/2018 | 7/24/2018 | 0.00 | Endocervical | Culture | GC Positive |
i need to de-duplicate the incidence id and find out incidence ID with number of anatomic site of infection.
i want to see a table something like this;
id | Anatomic site | number of case |
1 | Urethral | 20 |
2 | rectum | 5 |
3 | urethral & Rectum | 3 |
4 | endocervical and Pharyngeal | 4 |
5 | Throat | 6 |
6 | Cervix | 7 |
I have a presentation tomorrow Please HELP.
So you have six pre defined categories that you want to count the data by, correct?
Does this represent your actual problem? why is number of case = 20 for Anatomic Site=Urethral here?
If you want to be helped (let alone urgently), you must help those willing to help understand the nature of your task and exactly what output you'd like to generate given the sample input you provide. That would be the only way those willing to help would be able to understand what you need, as your verbal description of the task is so cloudy that (as you may have already seen) even the grand masters of discerning the gist of murky OP requirements fail to comprehend what you're trying to do. An output "table something like this", whose data have no apparent correlation with you input data sample, doesn't quite cut it.
Kind regard
Paul D.
My best guess comes to something like:
proc freq data=have order=freq; where result='GC Positive'; tables Specimen /nopercent nocum; run;
I can't tell if you need a data set or not or where the "ID" column values might come from. Perhaps send the output from proc freq above to a data set and then use proc print with the OBS="ID" option on the Print statement.
But you need to provide any rules involved if by "deduplicate" you mean to select a single record per incidentid.
I recognize some of the topic and I am constantly having to ask things about summaries by persons or positive tests or specimen sites. And if there are multiple positive specimen sites and only one is the be counted what hierarchy or rules are used to select them.
And for many reports we use total number of positive tests and total number of persons positive as it provides information related to the multiple positive results.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.