I have data from 5 study visits that assessed health behaviors. The same questions were asked at each visit and I would like to determine how many unique participants reported one behavior. I have the data sets appended in long format, so I can run a proc freq and get that 110 participants reported the behavior across all visits but I want to know how many of those are unique participants versus the same participants reporting the behavior at each visit. Ideally I would like to know how many participants reported the behaviors at all visits, how many at any visit, and how many reported 1, 2, 3, and 4 visits. I previously asked a similar question but was not able to figure this out. Any suggestions would be greatly appreciated.
A SQL approach:
proc sql;
/* create table want as*/
select pid, behavior, count(behavior) as count
from have
/* where behavior='1'*/
group by pid,behavior
;
quit;
If by long you mean you have data as:
ID VisitDate Behavior or similar:
Then
Data temp;
set have;
behaviorcount = (behavior='behavior value of interest');
run;
proc summary data=temp nway;
class ID;
var behaviorcount;
output out=want sum=;
run;
will have the number of visits with the specified behavior across all visits for each ID. ID could be several variables that uniquely identify the patient.
You can also use either count distinct in a proc sql or use a double proc freq - first time include the participant ID, the second you don't.
For more help please post sample data and expected output that corresponds to your sample data.
I am not sure I understand the code from ballardw and I tried count distinct but that doesn't seem to give me what I want either. I can't attach an excel right now for some reason but below is a short example of what I would want to do. I just made up the example data. I would want to know how many people did the behavior at 1 visit, at 2 visits, at 3 visits, etc. so from the data the results would be this, 1=yes, 2= no in the code for behavior:
behaviorcount
0=1 (did not report)
1=1
2=3
3=1
4=0
5=1
pid | behavior | visit |
1 | 0 | 1 |
1 | 1 | 2 |
1 | 1 | 3 |
1 | 0 | 4 |
1 | 0 | 5 |
2 | 0 | 1 |
2 | 0 | 2 |
2 | 0 | 3 |
2 | 0 | 4 |
3 | 1 | 1 |
3 | 1 | 2 |
3 | 1 | 3 |
4 | 1 | 1 |
4 | 0 | 2 |
4 | 0 | 3 |
4 | 0 | 4 |
4 | 0 | 5 |
5 | 1 | 1 |
5 | 1 | 2 |
5 | 1 | 3 |
5 | 1 | 4 |
5 | 1 | 5 |
6 | 0 | 1 |
6 | 1 | 2 |
6 | 1 | 3 |
6 | 0 | 4 |
6 | 0 | 5 |
7 | 1 | 1 |
7 | 1 | 2 |
With that coding:
proc means data=have sum;
class pid;
var behavior;
run;
If 1 represents have the behavior then the sum would be how many visits exhibited the behavior for each PID value.
A SQL approach:
proc sql;
/* create table want as*/
select pid, behavior, count(behavior) as count
from have
/* where behavior='1'*/
group by pid,behavior
;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.