Hi Community,
I have the following hypothetical dataset:
Record_id | Study_visit | TestA |
1 | 1 | Positive |
1 | 2 | Negative |
2 | 1 | Negative |
2 | 1 | Negative |
2 | 2 | Negative |
3 | 1 | Negative |
3 | 2 | Positive |
4 | 1 | Negative |
4 | 2 | Positive |
4 | 2 | Negative |
I would like to determine the frequency of the “TestA” variable by study visit; however, the issue is that some subjects have multiple values for a study visit (see Record_id=”2”, study_visit=1; Record_id=”4”, study_visit=2). In these cases, I wish to only have one count per subject, such that the denominator reflects the number of unique subjects for that study visit. Also, if there is both a “Negative” and a “Positive” value for TestA for the same study visit (see Record_id=”4”, study_visit=2), I would like for “Negative” value to be the one that is counted. Please see my desired frequency outputs below. Could someone please help me with coding this? Any help would be much appreciated!
Study_visit=1
| TestA | ||
| Frequency | Cumulative frequency | Percent |
Positive | 1 | 1 | 25.00 (1/4) |
Negative | 3 | 4 | 75.00 (3/4) |
Study_visit=2
| TestA | ||
| Frequency | Cumulative frequency | Percent |
Positive | 1 | 1 | 25.00 (1/4) |
Negative | 3 | 4 | 75.00 (3/4) |
Use the fact that 'Negative" sorts before "Positive" to clean up the data:
proc sort data=have; by record_id study_visit testA; run;
data want;
set have; by record_id study_visit;
if first.study_visit;
run;
data have;
input Record_id Study_visit TestA $;
cards;
1 1 Positive
1 2 Negative
2 1 Negative
2 1 Negative
2 2 Negative
3 1 Negative
3 2 Positive
4 1 Negative
4 2 Positive
4 2 Negative
;
proc sql;
create table want as
select Study_visit,testa,count(record_id) as c
from (select distinct * from have group by Record_id,Study_visit
having count(distinct TestA)>1 and testa='Negative' or count(distinct TestA)=1)
group by Study_visit,testa;
quit;
/*Or just*/
proc sql;
create view temp as
select distinct *
from have
group by Record_id,Study_visit
having count(distinct TestA)>1 and testa='Negative' or count(distinct TestA)=1;
quit;
proc freq data=temp;
tables Study_visit*testa/list;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.