Hello, I have a dataset that contains 4 variables. "ID" is a unique identifier of a person in my dataset. "Location Type" is a type of location that the person is in, and only ranges from 1 to 3. "Screen" is if they were ever medically screened. "Status" has 3 entries, positive, negative, or blank. It is blank when the patient has not been screened. Here is the sample dataset:
data JANFEB.SAMPLEUNIQUE;
infile datalines dsd truncover;
input ID:BEST. LocationType:BEST. Screen:$3. Result:$8.;
datalines4;
1,1,Yes,Positive
2,2,No,
3,3,Yes,Negative
4,1,No,
5,2,Yes,Positive
6,3,No,
7,1,Yes,Negative
8,2,No,
9,3,Yes,Positive
10,1,No,
11,1,Yes,Negative
12,2,Yes,Positive
13,3,Yes,Negative
14,1,No,
;;;;
How do I create a table from this dataset that groups screening and status results in this following way?
Location Type | Screened | Positive | Total | Screening Rate | Positive Prevalence |
1 | 30 | 2 | 100 | 30% | 7% |
2 | 40 | 3 | 200 | 20% | 8% |
3 | 50 | 5 | 100 | 50% | 10% |
Here are the explanation of the columns:
- Screened: Unique count under each location type where the screened variable had 'Yes' in it.
- Positive: Unique count under each location type where the status variable had 'Positive' in it.
- Total: Unique count under each location type in the dataset.
- Screening rate: Screened/Total
- Positive prevalence; Positive/Screen
Thank you so much for your help!
Your test data doesn't match your example output, or if it does, eplain how a count of 3 id's can yield 30? Anyways this should get you started:
data sampleunique; infile datalines dsd truncover; input ID:BEST. LocationType:BEST. Screen:$3. Result:$8.; datalines4; 1,1,Yes,Positive 2,2,No, 3,3,Yes,Negative 4,1,No, 5,2,Yes,Positive 6,3,No, 7,1,Yes,Negative 8,2,No, 9,3,Yes,Positive 10,1,No, 11,1,Yes,Negative 12,2,Yes,Positive 13,3,Yes,Negative 14,1,No, ;;;; run; proc sql; create table WANT as select LOCATIONTYPE, sum(case when SCREEN="Yes" then 1 else 0 end) as SCREENED, sum(case when char(RESULT,1)="P" then 1 else 0 end) as POSITIVE, count(*) as TOTAL, (CALCULATED SCREENED / CALCULATED TOTAL) * 100 as SCREENINGRATE, (CALCULATED POSITIVE / CALCULATED SCREENED) * 100 as POSITIVEPREVALENCE from SAMPLEUNIQUE group by LOCATIONTYPE; quit;
Your test data doesn't match your example output, or if it does, eplain how a count of 3 id's can yield 30? Anyways this should get you started:
data sampleunique; infile datalines dsd truncover; input ID:BEST. LocationType:BEST. Screen:$3. Result:$8.; datalines4; 1,1,Yes,Positive 2,2,No, 3,3,Yes,Negative 4,1,No, 5,2,Yes,Positive 6,3,No, 7,1,Yes,Negative 8,2,No, 9,3,Yes,Positive 10,1,No, 11,1,Yes,Negative 12,2,Yes,Positive 13,3,Yes,Negative 14,1,No, ;;;; run; proc sql; create table WANT as select LOCATIONTYPE, sum(case when SCREEN="Yes" then 1 else 0 end) as SCREENED, sum(case when char(RESULT,1)="P" then 1 else 0 end) as POSITIVE, count(*) as TOTAL, (CALCULATED SCREENED / CALCULATED TOTAL) * 100 as SCREENINGRATE, (CALCULATED POSITIVE / CALCULATED SCREENED) * 100 as POSITIVEPREVALENCE from SAMPLEUNIQUE group by LOCATIONTYPE; quit;
You're right. The output doesn't match those numbers. I just arbitrarily put in numbers for the sake of creating a sample output of what I would like it to look like. Your code works! Thanks so much.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.