BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
appleorange
Obsidian | Level 7

I have data related to visits within a time period after a certain date.  I've set up flags to indicate when a visit and nextvisit occurs.  The certain date is random and is associated with an ID.  So for example:

 

ID     Certain_Date     A      Visit   NextVisit     Class

A      1/1/2015            1         1        1             001

A      3/4/2015            2         1        0             001

B      2/17/2015          1         0        0             002

B      3/01/2015          2         1        1             002

B      5/03/2015          3         0        0             002

B      9/14/2015          4         1        0             002

C      5/25/2015          1         1        1             001

D      2/22/2015          1         1        0             003

D      10/21/2015        2         0        0             003

 

Each certain date is actually an 'index' episode per ID, and A is a flag for how many index episodes.  So A has 2, B has 4, C has 1 and D has 2.  Further, each ID belongs to a class.

 

The goal is to count how many IDs, by class has how many index episodes, mutually exclusive visits and next visits, where A=1 is the first index, A=2 is the second index and A=3 is the third index, and so on.

 

Class   One_index_only  Two_index_only  Three_index_only  Four_index_only

001           1                               1                         0                               0

002           0                               0                         0                               1

003           0                               1                         0                               0

 

And additionally, how many IDs, by class had a first visit with the first index, a first visit with the second index, or 2 or more first visits by any index (mutually exclusive).

 

I ended up transposing the data horizontally as such:

 

ID      CLASS    A    Visit    NextVisit   A2    Visit2   NextVisit2   A3    Visit3   NextVisit3   A4   Visit4   NextVisit4

A       001          1       1           1           1         1         0              0        0           0            0      0          0

B       002          1       0           0           1         1         1              1        0            0           1      1           0

C       001          1       1           1           0          0        0              0        0            0           0      0           0

D       003          1       1           0           1          0        0              0        0            0           0      0           0 

 

And from there I counted using conditions that go across the columns, for example class 001 has one ID (D) who has only one Visit and no Visit2, Visit3, Visit4. (If Visit=1 and Visit2=0 and Visit3=0 and Visit4=0)

It gets cumbersome however, when there are multiple index episodes (when A = 5 or more), and I need to count for example, between 2-3 valid visits.  It would require listing out the combinations: Visit=1 and Visit2=1 and Visit3=0 and Visit4=0 and Visit5=0,

Visit=1 and Visit2=0 and Visit3=1 and Visit4=0 and Visit5=0, etc).

 

Basically, is there an approach that is able to count these conditions without having to transform to a horizontal format per ID, or if transforming is the best method, if there's a better non manual way to count IDs, categorized by class?

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

 

Basically, is there an approach that is able to count these conditions without having to transform to a horizontal format per ID, or if transforming is the best method, if there's a better non manual way to count IDs, categorized by class?

 

Yes.   That is if I understand your goal correctly.  You want frequencies, by class, of

  1. Nvisits per id
  2. Level of A for the first occurance of visit=1,   A_for_first_visit_1

To do that you need to make one observation per id, with the variables CLASS, NVISITS, and A_for_first_visit_1

 

data need (keep=id class nvisits A_for_first_visit_1);

  set have;

  by id;

  if last.id then nvisits=a;

  retain a_for_first_visit_1;

  if first.id then A_for_first_visit_1=.;

  if A_for_first_visit_1=. and visit=1 then A_for_first_visit_1=A;

  if last.id;

run;

 

proc tabulate data=need noseps;

  class class nvisits  A_for_first_visit_1;

  table    class all,nvisits all;

  table    class all,a_for_first_visit_1 all;

run;

 

 

Note this program assumes (1) that the data set is sorted by id, and (2) there are no "holes" in the sequence of A for each ID  (i.e. A=1, then 2, 3, ...).

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

5 REPLIES 5
ballardw
Super User

If you are looking for something like a report and not a dataset this may get you started:

 

proc tabulate data=have;
   class class;
   class a;
   table class,
         a='Index count'*n=''
         /misstext='0'
   ;
run;

It helps to post data in the form of a datastep. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

 

appleorange
Obsidian | Level 7
I was not able to get to this project until now but I did use this and on double checking the numbers, it does not give counts of mutually exclusive conditions.
mkeintz
PROC Star

 

Basically, is there an approach that is able to count these conditions without having to transform to a horizontal format per ID, or if transforming is the best method, if there's a better non manual way to count IDs, categorized by class?

 

Yes.   That is if I understand your goal correctly.  You want frequencies, by class, of

  1. Nvisits per id
  2. Level of A for the first occurance of visit=1,   A_for_first_visit_1

To do that you need to make one observation per id, with the variables CLASS, NVISITS, and A_for_first_visit_1

 

data need (keep=id class nvisits A_for_first_visit_1);

  set have;

  by id;

  if last.id then nvisits=a;

  retain a_for_first_visit_1;

  if first.id then A_for_first_visit_1=.;

  if A_for_first_visit_1=. and visit=1 then A_for_first_visit_1=A;

  if last.id;

run;

 

proc tabulate data=need noseps;

  class class nvisits  A_for_first_visit_1;

  table    class all,nvisits all;

  table    class all,a_for_first_visit_1 all;

run;

 

 

Note this program assumes (1) that the data set is sorted by id, and (2) there are no "holes" in the sequence of A for each ID  (i.e. A=1, then 2, 3, ...).

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
appleorange
Obsidian | Level 7

This definitely gets closer to the solution I was looking for, thank you.

Ksharp
Super User

data have;
input ID   $  Certain_Date  : mmddyy10.   A      Visit   NextVisit     Class $;
format certain_date mmddyy10.;
cards;
A      1/1/2015            1         1        1             001
A      3/4/2015            2         1        0             001
B      2/17/2015          1         0        0             002
B      3/01/2015          2         1        1             002
B      5/03/2015          3         0        0             002
B      9/14/2015          4         1        0             002
C      5/25/2015          1         1        1             001
D      2/22/2015          1         1        0             003
D      10/21/2015        2         0        0             003
;
run;

proc sql;
create table temp as
select class,id,count(*) as n
 from have
  group by class,id;
  
create table temp1 as
select class,n,count(*) as count
 from temp
  group by class,n;
quit;

proc transpose data=temp1 out=temp2 prefix=only_;
by class;
var count;
id n;
run;
proc stdize data=temp2 out=want missing=0 reponly;
run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 915 views
  • 0 likes
  • 4 in conversation