Hello, I have a dataset that contains 4 variables. "ID" is a unique identifier of a person in my dataset. "Location Type" is a type of location that the person is in, and only ranges from 1 to 3. "Screen" is if they were ever medically screened. "Status" has 3 entries, positive, negative, or blank. It is blank when the patient has not been screened. Here is the sample dataset:
data JANFEB.SAMPLEUNIQUE;
infile datalines dsd truncover;
input ID:BEST. LocationType:BEST. Screen:$3. Result:$8.;
datalines4;
1,1,Yes,Positive
2,2,No,
3,3,Yes,Negative
4,1,No,
5,2,Yes,Positive
6,3,No,
7,1,Yes,Negative
8,2,No,
9,3,Yes,Positive
10,1,No,
11,1,Yes,Negative
12,2,Yes,Positive
13,3,Yes,Negative
14,1,No,
;;;;How do I create a table from this dataset that groups screening and status results in this following way?
| Location Type | Screened | Positive | Total | Screening Rate | Positive Prevalence |
| 1 | 30 | 2 | 100 | 30% | 7% |
| 2 | 40 | 3 | 200 | 20% | 8% |
| 3 | 50 | 5 | 100 | 50% | 10% |
Here are the explanation of the columns:
- Screened: Unique count under each location type where the screened variable had 'Yes' in it.
- Positive: Unique count under each location type where the status variable had 'Positive' in it.
- Total: Unique count under each location type in the dataset.
- Screening rate: Screened/Total
- Positive prevalence; Positive/Screen
Thank you so much for your help!
Your test data doesn't match your example output, or if it does, eplain how a count of 3 id's can yield 30? Anyways this should get you started:
data sampleunique;
infile datalines dsd truncover;
input ID:BEST. LocationType:BEST. Screen:$3. Result:$8.;
datalines4;
1,1,Yes,Positive
2,2,No,
3,3,Yes,Negative
4,1,No,
5,2,Yes,Positive
6,3,No,
7,1,Yes,Negative
8,2,No,
9,3,Yes,Positive
10,1,No,
11,1,Yes,Negative
12,2,Yes,Positive
13,3,Yes,Negative
14,1,No,
;;;;
run;
proc sql;
create table WANT as
select LOCATIONTYPE,
sum(case when SCREEN="Yes" then 1 else 0 end) as SCREENED,
sum(case when char(RESULT,1)="P" then 1 else 0 end) as POSITIVE,
count(*) as TOTAL,
(CALCULATED SCREENED / CALCULATED TOTAL) * 100 as SCREENINGRATE,
(CALCULATED POSITIVE / CALCULATED SCREENED) * 100 as POSITIVEPREVALENCE
from SAMPLEUNIQUE
group by LOCATIONTYPE;
quit;
Your test data doesn't match your example output, or if it does, eplain how a count of 3 id's can yield 30? Anyways this should get you started:
data sampleunique;
infile datalines dsd truncover;
input ID:BEST. LocationType:BEST. Screen:$3. Result:$8.;
datalines4;
1,1,Yes,Positive
2,2,No,
3,3,Yes,Negative
4,1,No,
5,2,Yes,Positive
6,3,No,
7,1,Yes,Negative
8,2,No,
9,3,Yes,Positive
10,1,No,
11,1,Yes,Negative
12,2,Yes,Positive
13,3,Yes,Negative
14,1,No,
;;;;
run;
proc sql;
create table WANT as
select LOCATIONTYPE,
sum(case when SCREEN="Yes" then 1 else 0 end) as SCREENED,
sum(case when char(RESULT,1)="P" then 1 else 0 end) as POSITIVE,
count(*) as TOTAL,
(CALCULATED SCREENED / CALCULATED TOTAL) * 100 as SCREENINGRATE,
(CALCULATED POSITIVE / CALCULATED SCREENED) * 100 as POSITIVEPREVALENCE
from SAMPLEUNIQUE
group by LOCATIONTYPE;
quit;
You're right. The output doesn't match those numbers. I just arbitrarily put in numbers for the sake of creating a sample output of what I would like it to look like. Your code works! Thanks so much.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.