BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hjjijkkl
Pyrite | Level 9

From the data below I am trying to find IDs with 2 consecutive date below 80 visits. How Do I go about doing My analysis? Thanks

ID Date Visit 
R45 3/15/19 90
R45 4/15/19 100
R45 5/15/19 70
R45 6/15/19 60
R90 3/15/19 90
R90 4/15/19 100
R90 5/15/19 120
R90 6/15/19 100
R30 3/15/19 90
R30 4/15/19 87
R30 5/15/19 50
R30 6/15/19 40
     
     
     
1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

So much of this thread has been spent just trying to clarify what the task statement actually should be:  "consecutive dates" vs "consecutive records",    two  less than 80, but not more than two.  I've had to read a lot of answers to respondent questions to finally understand what the objective is.

 

I think your best option is to create an intermediate dataset with a dummy variable (_UNDER80=1 if visit<80).  Then read that data set with a BY ID _UNDER80 NOTSORTED to count record-groups and output groups of size 2 if _UNDER80=1:

 

data have;
  input ID :$3.	Date :mmddyy.	Visit ;
  format date date9.;
datalines;
R45 3/15/19  90
R45 4/15/19 100
R45 5/15/19  70
R45 6/15/19  60
R90 3/15/19  90
R90 4/15/19 100
R90 5/15/19 120
R90 6/15/19 100
R30 3/15/19  90
R30 4/15/19  87
R30 5/15/19  50
R30 6/15/19  40
run;

data need;
  set have;
  _under80=(visit<80);
run;

data want (drop=_:);
  do _count=1 by 1 until (last._under80);
    set need;
    by id _under80 notsorted;
  end;
  do until (last._under80);
    set need;
    by id _under80 notsorted;
    if _count=2 and _under80=1 then output;
  end;
run;

 

Edited addition:

There is a way to do this in a single step - i.e. avoiding the DATA NEED step.  It requires the ability to compare whether the current visit and upcoming visit (_nxt_visit below) both have the same status when filtered as less-than 80.

 

data want (drop=_:);
  do _count=1 by 1 until (last.id or (sign(79.5-visit) ^= sign(79.5-_nxt_visit)));
    set have (keep=id);
	by id notsorted;
	merge have have(firstobs=2 keep=visit rename=(visit=_nxt_visit));
  end;

  do _i=1 to _count;
    set have;
	if _count=2 and visit<80 then output;
  end;
run;

So now, instead of

    DO _COUNT=1 BY 1 UNTIL (last._under80);

I have

    DO _COUNT=1 BY 1 UNTIL (LAST.ID or (SIGN(79.5-VISIT)^-SIGN(79.5-NXT_VISIT)));

 

The loops both stop at the same record.  Note I use SIGN(79.5-VISIT) for force the result to only be -1 or +1.  If I had used SIGN(80-VISIT), then the sign could also be zero (for visit=80).  I would have gotten the same output dataset, but the visit=80 records would be unnecessarily treated as their own group,   And this tactic works fine as long as VISIT values are integers.

 

The real "trick" here is using the self merge with a "FIRSTOBS=2" option as a way to compare current and upcoming VISIT values.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

20 REPLIES 20
ballardw
Super User

Please show what you expect the result to look like for your given data.

Is the result a list of just ID like R45 and R30? The records with the values below 80 that qualify? A flag added to your existing data?

 

You may have to define exactly what "consecutive" means as well. You show apparent once a month records. Would the records still be considered consecutive if the date is more than one month apart? 3 months? Some other upper limit?

 

Other results could respond to the requirement as well.

hjjijkkl
Pyrite | Level 9
For example, I was a table with records of values below 80 that qualify. By Consecutive meaning the visit date has to be below 80 back to back . If the visit goes < 80 on 5/15/19 and goes > 80 on 6/15/19 and goes < 80 7/15/19 then that is not considered
2 consecutive date for this analysis.
Reeza
Super User
Consecutive months I assume? Or two consecutive days? All the days are 15 so I can't imagine that's correct but then all months there are consecutive so then everything shown would meet your criteria somehow?
hjjijkkl
Pyrite | Level 9
the table I have here is actually an example I just created. but, in the really data some of the visit are like after 2 month, 5 month 8 month and so on..... But, Yes, I meant to say two consecutive days.
ballardw
Super User

I suggest again that you provide an example of what the output should look like.

And if you mean consecutive DAYS, then provide data that appears to be such, at least sometimes.

Include examples of ID where there are no records that qualify and some that do.

hjjijkkl
Pyrite | Level 9
Please take this table below instead.

ID Date Visit
R45 3/15/19 90
R45 5/1/19 100
R45 9/10/19 70
R45 10/8/19 60
R90 8/5/19 90
R90 4/15/19 100
R90 6/9/19 120
R90 1/5/20 100
R30 3/9/19 90
R30 4/5/19 87
R30 9/7/19 50
R30 8/18/20 40
hjjijkkl
Pyrite | Level 9
I want the output that look like this
R45 9/10/19 70
R45 10/8/19 60
R30 9/7/19 50
R30 8/18/20 40
continues ..............................
............................
Reeza
Super User
Under what logic? None of those dates are consecutive at all...
hjjijkkl
Pyrite | Level 9

Sorry if I am not making this clear, but what I am trying to create a table with the list of (ID, date, visit) where the visit was < 80 back to back. 

hjjijkkl
Pyrite | Level 9
Can someone be able to help me on this question?
Kurt_Bremser
Super User

You have a Maxim 42 issue.

Let me rephrase your question:

In a dataset that is grouped by ID and ordered (within the group) by DATE, you want to keep those observations where there are two or more consecutive values of VISIT smaller than 80. The dates can have gaps.

 

Is that correct?

hjjijkkl
Pyrite | Level 9
correct. but, only 2 consecutive values of VISIT smaller than 80.
hjjijkkl
Pyrite | Level 9
By Consecutive meaning the visit date has to be below 80 back to back only (2x) . For example, If the visit goes < 80 on 5/15/19 and goes > 80 on 6/15/19 and goes < 80 7/15/19 then that will not be considered 2 consecutive date for this analysis.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 20 replies
  • 3416 views
  • 4 likes
  • 6 in conversation