I have data related to visits within a time period after a certain date. I've set up flags to indicate when a visit and nextvisit occurs. The certain date is random and is associated with an ID. So for example:
ID Certain_Date A Visit NextVisit Class
A 1/1/2015 1 1 1 001
A 3/4/2015 2 1 0 001
B 2/17/2015 1 0 0 002
B 3/01/2015 2 1 1 002
B 5/03/2015 3 0 0 002
B 9/14/2015 4 1 0 002
C 5/25/2015 1 1 1 001
D 2/22/2015 1 1 0 003
D 10/21/2015 2 0 0 003
Each certain date is actually an 'index' episode per ID, and A is a flag for how many index episodes. So A has 2, B has 4, C has 1 and D has 2. Further, each ID belongs to a class.
The goal is to count how many IDs, by class has how many index episodes, mutually exclusive visits and next visits, where A=1 is the first index, A=2 is the second index and A=3 is the third index, and so on.
Class One_index_only Two_index_only Three_index_only Four_index_only
001 1 1 0 0
002 0 0 0 1
003 0 1 0 0
And additionally, how many IDs, by class had a first visit with the first index, a first visit with the second index, or 2 or more first visits by any index (mutually exclusive).
I ended up transposing the data horizontally as such:
ID CLASS A Visit NextVisit A2 Visit2 NextVisit2 A3 Visit3 NextVisit3 A4 Visit4 NextVisit4
A 001 1 1 1 1 1 0 0 0 0 0 0 0
B 002 1 0 0 1 1 1 1 0 0 1 1 0
C 001 1 1 1 0 0 0 0 0 0 0 0 0
D 003 1 1 0 1 0 0 0 0 0 0 0 0
And from there I counted using conditions that go across the columns, for example class 001 has one ID (D) who has only one Visit and no Visit2, Visit3, Visit4. (If Visit=1 and Visit2=0 and Visit3=0 and Visit4=0)
It gets cumbersome however, when there are multiple index episodes (when A = 5 or more), and I need to count for example, between 2-3 valid visits. It would require listing out the combinations: Visit=1 and Visit2=1 and Visit3=0 and Visit4=0 and Visit5=0,
Visit=1 and Visit2=0 and Visit3=1 and Visit4=0 and Visit5=0, etc).
Basically, is there an approach that is able to count these conditions without having to transform to a horizontal format per ID, or if transforming is the best method, if there's a better non manual way to count IDs, categorized by class?
Basically, is there an approach that is able to count these conditions without having to transform to a horizontal format per ID, or if transforming is the best method, if there's a better non manual way to count IDs, categorized by class?
Yes. That is if I understand your goal correctly. You want frequencies, by class, of
To do that you need to make one observation per id, with the variables CLASS, NVISITS, and A_for_first_visit_1
data need (keep=id class nvisits A_for_first_visit_1);
set have;
by id;
if last.id then nvisits=a;
retain a_for_first_visit_1;
if first.id then A_for_first_visit_1=.;
if A_for_first_visit_1=. and visit=1 then A_for_first_visit_1=A;
if last.id;
run;
proc tabulate data=need noseps;
class class nvisits A_for_first_visit_1;
table class all,nvisits all;
table class all,a_for_first_visit_1 all;
run;
Note this program assumes (1) that the data set is sorted by id, and (2) there are no "holes" in the sequence of A for each ID (i.e. A=1, then 2, 3, ...).
If you are looking for something like a report and not a dataset this may get you started:
proc tabulate data=have; class class; class a; table class, a='Index count'*n='' /misstext='0' ; run;
It helps to post data in the form of a datastep. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
Basically, is there an approach that is able to count these conditions without having to transform to a horizontal format per ID, or if transforming is the best method, if there's a better non manual way to count IDs, categorized by class?
Yes. That is if I understand your goal correctly. You want frequencies, by class, of
To do that you need to make one observation per id, with the variables CLASS, NVISITS, and A_for_first_visit_1
data need (keep=id class nvisits A_for_first_visit_1);
set have;
by id;
if last.id then nvisits=a;
retain a_for_first_visit_1;
if first.id then A_for_first_visit_1=.;
if A_for_first_visit_1=. and visit=1 then A_for_first_visit_1=A;
if last.id;
run;
proc tabulate data=need noseps;
class class nvisits A_for_first_visit_1;
table class all,nvisits all;
table class all,a_for_first_visit_1 all;
run;
Note this program assumes (1) that the data set is sorted by id, and (2) there are no "holes" in the sequence of A for each ID (i.e. A=1, then 2, 3, ...).
This definitely gets closer to the solution I was looking for, thank you.
data have; input ID $ Certain_Date : mmddyy10. A Visit NextVisit Class $; format certain_date mmddyy10.; cards; A 1/1/2015 1 1 1 001 A 3/4/2015 2 1 0 001 B 2/17/2015 1 0 0 002 B 3/01/2015 2 1 1 002 B 5/03/2015 3 0 0 002 B 9/14/2015 4 1 0 002 C 5/25/2015 1 1 1 001 D 2/22/2015 1 1 0 003 D 10/21/2015 2 0 0 003 ; run; proc sql; create table temp as select class,id,count(*) as n from have group by class,id; create table temp1 as select class,n,count(*) as count from temp group by class,n; quit; proc transpose data=temp1 out=temp2 prefix=only_; by class; var count; id n; run; proc stdize data=temp2 out=want missing=0 reponly; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.