Hi!
I want to see how many individuals in my data set have visits 1-7, how many have visits 1-8, and how many have visits 1-9. What would be an efficient way to go about doing this? Thank you all in advance for your help.
proc summary data=have nway;
class id;
var visit;
output out=stats max=max_visit;
run;
Can you show us a portion of your data so we can understand what the data looks like?
Sure. So here is an example of what my data looks like, except my dataset has around 400 observations.
ID Visit
1 1
1 2
1 3
2 1
3 1
3 2
3 3
3 4
3 5
3 6
3 7
4 1
4 2
4 3
5 1
5 2
5 3
5 4
5 5
5 6
5 7
5 8
6 1
6 2
6 3
6 4
6 5
6 6
6 7
6 8
6 9
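For anyone who wants to run the replies in this thread against the sample, here is a minimal sketch of a DATA step that loads the rows above. The dataset name `have` is borrowed from the code posted in this thread, and both columns are assumed to be numeric.

```sas
/* Load the sample rows posted above into a dataset named HAVE */
data have;
  /* The trailing @@ holds the input line so several id/visit pairs can share it */
  input id visit @@;
  datalines;
1 1 1 2 1 3
2 1
3 1 3 2 3 3 3 4 3 5 3 6 3 7
4 1 4 2 4 3
5 1 5 2 5 3 5 4 5 5 5 6 5 7 5 8
6 1 6 2 6 3 6 4 6 5 6 6 6 7 6 8 6 9
;
run;
```

The real data would of course be read from wherever it currently lives; this is only to make the example reproducible.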
Assuming the visits are always consecutive within an ID, you really only need the last (maximum) visit for each ID.
*get maximum per ID;
proc sql;
create table id_max as
select id, max(visit) as max_visit
from have
group by id; /* without GROUP BY, the overall maximum is remerged onto every row */
quit;
*check distribution;
proc freq data=id_max;
table max_visit;
run;
So is it valid to assume that the numbers in the VISIT column are always consecutive within an ID? That the numbers in the VISIT column never repeat or skip?
They are in order, but I hadn't double-checked my data. It does indeed skip visits, but none repeat. For example, visit 3 may be missing for some people.
Well don't leave us in suspense. What happens to an ID that doesn't have Visit 3 but has 1, 2, 4, 5, 6? Is that the 1-6 category? Or do you want it to be categorized some other way?
I apologize for my lack of clarity. It would still count as visits 1-6 regardless of what is missing in between, so I would still apply the same code as if the visits were consecutive, correct?
proc summary data=have nway;
class id;
var visit;
output out=stats max=max_visit;
run;
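To get from the per-ID maximum to the counts asked for in the original question, and to see which IDs skip a visit, something along these lines might work. It is only a sketch: it assumes the dataset is named `have`, that `id` and `visit` are numeric, and that visits start at 1 and never repeat within an ID, as described above. The names `id_summary`, `n_visits`, and `has_gap` are made up for illustration.

```sas
/* One row per ID: highest visit, number of visits, and a gap flag */
proc sql;
  create table id_summary as
  select id,
         max(visit)   as max_visit,
         count(visit) as n_visits,
         /* 1 when fewer visits are present than the highest visit number */
         (count(visit) < max(visit)) as has_gap
  from have
  group by id;
quit;

/* Number of IDs in each "visits 1 through N" category, split by gap flag */
proc freq data=id_summary;
  tables max_visit*has_gap / norow nocol nopercent;
run;
```

The rows for `max_visit` 7, 8, and 9 give the three counts requested. The `has_gap` column is optional; it just makes it easy to see how many IDs in each category are missing a visit in between.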