Hello, I have a dataset that contains 4 variables. "ID" is a unique identifier of a person in my dataset. "Location Type" is a type of location that the person is in, and only ranges from 1 to 3. "Screen" is if they were ever medically screened. "Status" has 3 entries, positive, negative, or blank. It is blank when the patient has not been screened. Here is the sample dataset:
data JANFEB.SAMPLEUNIQUE;
infile datalines dsd truncover;
input ID:BEST. LocationType:BEST. Screen:$3. Result:$8.;
datalines4;
1,1,Yes,Positive
2,2,No,
3,3,Yes,Negative
4,1,No,
5,2,Yes,Positive
6,3,No,
7,1,Yes,Negative
8,2,No,
9,3,Yes,Positive
10,1,No,
11,1,Yes,Negative
12,2,Yes,Positive
13,3,Yes,Negative
14,1,No,
;;;;
How do I create a table from this dataset that groups screening and status results in this following way?
Location Type | Screened | Positive | Total | Screening Rate | Positive Prevalence |
1 | 30 | 2 | 100 | 30% | 7% |
2 | 40 | 3 | 200 | 20% | 8% |
3 | 50 | 5 | 100 | 50% | 10% |
Here are the explanation of the columns:
- Screened: Unique count under each location type where the screened variable had 'Yes' in it.
- Positive: Unique count under each location type where the status variable had 'Positive' in it.
- Total: Unique count under each location type in the dataset.
- Screening rate: Screened/Total
- Positive prevalence; Positive/Screen
Thank you so much for your help!
Your test data doesn't match your example output, or if it does, eplain how a count of 3 id's can yield 30? Anyways this should get you started:
data sampleunique; infile datalines dsd truncover; input ID:BEST. LocationType:BEST. Screen:$3. Result:$8.; datalines4; 1,1,Yes,Positive 2,2,No, 3,3,Yes,Negative 4,1,No, 5,2,Yes,Positive 6,3,No, 7,1,Yes,Negative 8,2,No, 9,3,Yes,Positive 10,1,No, 11,1,Yes,Negative 12,2,Yes,Positive 13,3,Yes,Negative 14,1,No, ;;;; run; proc sql; create table WANT as select LOCATIONTYPE, sum(case when SCREEN="Yes" then 1 else 0 end) as SCREENED, sum(case when char(RESULT,1)="P" then 1 else 0 end) as POSITIVE, count(*) as TOTAL, (CALCULATED SCREENED / CALCULATED TOTAL) * 100 as SCREENINGRATE, (CALCULATED POSITIVE / CALCULATED SCREENED) * 100 as POSITIVEPREVALENCE from SAMPLEUNIQUE group by LOCATIONTYPE; quit;
Your test data doesn't match your example output, or if it does, eplain how a count of 3 id's can yield 30? Anyways this should get you started:
data sampleunique; infile datalines dsd truncover; input ID:BEST. LocationType:BEST. Screen:$3. Result:$8.; datalines4; 1,1,Yes,Positive 2,2,No, 3,3,Yes,Negative 4,1,No, 5,2,Yes,Positive 6,3,No, 7,1,Yes,Negative 8,2,No, 9,3,Yes,Positive 10,1,No, 11,1,Yes,Negative 12,2,Yes,Positive 13,3,Yes,Negative 14,1,No, ;;;; run; proc sql; create table WANT as select LOCATIONTYPE, sum(case when SCREEN="Yes" then 1 else 0 end) as SCREENED, sum(case when char(RESULT,1)="P" then 1 else 0 end) as POSITIVE, count(*) as TOTAL, (CALCULATED SCREENED / CALCULATED TOTAL) * 100 as SCREENINGRATE, (CALCULATED POSITIVE / CALCULATED SCREENED) * 100 as POSITIVEPREVALENCE from SAMPLEUNIQUE group by LOCATIONTYPE; quit;
You're right. The output doesn't match those numbers. I just arbitrarily put in numbers for the sake of creating a sample output of what I would like it to look like. Your code works! Thanks so much.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.