BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
amber7391
Fluorite | Level 6

Hey guys, I am new to SAS.

I have a problem which looks like the following.

There are two types of visits, x and y.

The data has already sorted based on id and visit date.

Now I want to exclude all the x type visits for each id number.

Also, if the visit date of y type falls in the range of x's check in and check out date, it also needs to be excluded.

Moreover, if there is an x type visit in 3 days after y type visit, both of them need to be excluded.

Can someone please help???

 

This is how the raw data look like:

idvisit datecheck in check outtype
11/29/191/23/191/29/19x
11/26/19  y
22/1/19  y
22/2/192/2/192/6/19x
33/3/193/5/193/20/19x
33/5/193/5/193/20/19y
44/23/194/24/194/27/19x
44/24/194/24/194/27/19y
48/8/19  y
59/1/19  y
59/4/199/4/199/6/19x
510/10/1910/10/1910/21/19y

 

This is how the result should be:

idvisit datecheck in check outtype
48/8/19  y
510/10/1910/10/1910/21/19y
1 ACCEPTED SOLUTION

Accepted Solutions
hashman
Ammonite | Level 13

@amber7391:

Read each BY group twice:

  1. On the first read, skip the Y-records and determine the boundaries for the Y-conditions based on the X-values. 
  2. On the second read, skip the X-records and filter the Y-records based on the conditions discovered in #1.

For example:

data have ;                                                              
  input id (visit_date check_in check_out) (:mmddyy8.) type :$1. ;       
  format v: c: yymmdd10. ;                                               
  cards ;                                                                
1  1/29/19  1/23/19   1/29/19  x                                         
1  1/26/19  .         .        y                                         
2  2/01/19  .         .        y                                         
2  2/02/19  2/02/19   2/06/19  x                                         
3  3/03/19  3/05/19   3/20/19  x                                         
3  3/05/19  3/05/19   3/20/19  y                                         
4  4/23/19  4/24/19   4/27/19  x                                         
4  4/24/19  4/24/19   4/27/19  y                                         
4  8/08/19  .         .        y                                         
5  9/01/19  .         .        y                                         
5  9/04/19  9/04/19   9/06/19  x                                         
5 10/10/19 10/10/19  10/21/19  y                                         
;                                                                        
run ;                                                                    
                                                                         
data want (drop = _:) ;                                                  
  _xi_min = constant ("big") ;                                           
  _xv_min = constant ("big") ;                                           
  do _n_ = 1 by 1 until (last.id) ;                                      
    set have ;                                                           
    by id ;                                                              
    if type = "y" then continue ;                                        
    _xi_min = _xi_min min check_in ;                                     
    _xo_max = _xo_max max check_out ;                                    
    _xv_min = _xv_min min visit_date ;                                   
    _xv_max = _xv_max max visit_date ;                                   
  end ;                                                                  
  do _n_ = 1 to _n_ ;                                                    
    set have ;                                                           
    if type = "x" then continue ;                                        
    if _xi_min <= visit_date <= _xo_max then continue ;                  
    if visit_date <= _xv_min <= _xv_max <= visit_date + 3 then continue ;
    output ;                                                             
  end ;                                                                  
run ;                                                                    

Kind regards

Paul D.

View solution in original post

14 REPLIES 14
ballardw
Super User

First thing is start with a SAS data set with date values. From you post it looks like you are copying from Excel and so we can't be sure what some of those dates might actually be: 23-Jan for example. Or whether you may have a mix of character and actual date values in Excel.

 

You need to show what you expect your output to actually look like when you are done.

Since you have moderately complex rules involving intervals and order this is likely not going to get a simple to understand single step solution.

amber7391
Fluorite | Level 6
Hi ballardw, thanks for pointing that out. I have corrected the format and added the result I would like to have. If you dont mind you can take a look. I really appreciate it!
hashman
Ammonite | Level 13

@ChrisNZ:

The X-record is killed unconditionally.

The Y-record is killed since the visit date of the X-record (2/02/19) is within 3 days after the visit date of the Y-record (2/01/19).

Hence, the whole ID=2 group is a goner.

 

Kind regards

Paul D.
 

ChrisNZ
Tourmaline | Level 20

@hashman Ha! Of course! The reversed American dates got me again!

hashman
Ammonite | Level 13

@ChrisNZ:

Ha's on me: After almost 30 revolutions round the Sun, I still have to make a concerted effort to remind myself of that (the European dd/mm/yy is better but still makes just as much sense as writing the number 753 as 357). That's why I abhor when sample data dates are presented as mm/dd/yy or dd-Mon-yy (they make no sense when sorted) and always both present them and format them as yymmdd10. Hierarchical numbers must come higher order to lower order, left to right; only then they're comprehended properly when ordered.

 

Kind regards

Paul D.     

ChrisNZ
Tourmaline | Level 20

@hashman  Totally with you on compulsory use of yymmdd under the penalty of imprisonment with hard labour! 🙂

hashman
Ammonite | Level 13

@ChrisNZ:

Those using mmddyy and such bring hard labor on themselves, getting self-imprisoned by having to sift through unsorted mess.

hashman
Ammonite | Level 13

@amber7391:

Read each BY group twice:

  1. On the first read, skip the Y-records and determine the boundaries for the Y-conditions based on the X-values. 
  2. On the second read, skip the X-records and filter the Y-records based on the conditions discovered in #1.

For example:

data have ;                                                              
  input id (visit_date check_in check_out) (:mmddyy8.) type :$1. ;       
  format v: c: yymmdd10. ;                                               
  cards ;                                                                
1  1/29/19  1/23/19   1/29/19  x                                         
1  1/26/19  .         .        y                                         
2  2/01/19  .         .        y                                         
2  2/02/19  2/02/19   2/06/19  x                                         
3  3/03/19  3/05/19   3/20/19  x                                         
3  3/05/19  3/05/19   3/20/19  y                                         
4  4/23/19  4/24/19   4/27/19  x                                         
4  4/24/19  4/24/19   4/27/19  y                                         
4  8/08/19  .         .        y                                         
5  9/01/19  .         .        y                                         
5  9/04/19  9/04/19   9/06/19  x                                         
5 10/10/19 10/10/19  10/21/19  y                                         
;                                                                        
run ;                                                                    
                                                                         
data want (drop = _:) ;                                                  
  _xi_min = constant ("big") ;                                           
  _xv_min = constant ("big") ;                                           
  do _n_ = 1 by 1 until (last.id) ;                                      
    set have ;                                                           
    by id ;                                                              
    if type = "y" then continue ;                                        
    _xi_min = _xi_min min check_in ;                                     
    _xo_max = _xo_max max check_out ;                                    
    _xv_min = _xv_min min visit_date ;                                   
    _xv_max = _xv_max max visit_date ;                                   
  end ;                                                                  
  do _n_ = 1 to _n_ ;                                                    
    set have ;                                                           
    if type = "x" then continue ;                                        
    if _xi_min <= visit_date <= _xo_max then continue ;                  
    if visit_date <= _xv_min <= _xv_max <= visit_date + 3 then continue ;
    output ;                                                             
  end ;                                                                  
run ;                                                                    

Kind regards

Paul D.

amber7391
Fluorite | Level 6
Thank you soooo much for help me out hashman! I really appreciate it. It fixed my problem perfectly!
s_lassen
Meteorite | Level 14

These kind of problems are most easily solved in SQL:

proc sql;
  create table want as select * from have y
    where type='y' 
      and not exists(select * from have x 
                      where type='x' 
			and x.id=y.id
			and (y.visit_date between x.check_in and x.check_out
			     or y.visit_date between x.visit_date-3 and x.visit_date));
quit;       
amber7391
Fluorite | Level 6
s_lassen thank you so much!
Ksharp
Super User

data have ;                                                              
  input id (visit check_in check_out) (:mmddyy8.) type :$1. ;       
  format v: c: yymmdd10. ;                                               
  cards ;                                                                
1  1/29/19  1/23/19   1/29/19  x                                         
1  1/26/19  .         .        y                                         
2  2/01/19  .         .        y                                         
2  2/02/19  2/02/19   2/06/19  x                                         
3  3/03/19  3/05/19   3/20/19  x                                         
3  3/05/19  3/05/19   3/20/19  y                                         
4  4/23/19  4/24/19   4/27/19  x                                         
4  4/24/19  4/24/19   4/27/19  y                                         
4  8/08/19  .         .        y                                         
5  9/01/19  .         .        y                                         
5  9/04/19  9/04/19   9/06/19  x                                         
5 10/10/19 10/10/19  10/21/19  y                                         
;                                                                        
run ;

proc sql;
create table want as
 select distinct a.*
  from (select * from have where type='y') as a ,
       (select * from have where type='x') as b
where a.id=b.id and a.visit not between b.check_in and b.check_out
and b.visit not between a.visit and a.visit+3;
quit; 


amber7391
Fluorite | Level 6
Ksharp thank you so much!

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 3021 views
  • 8 likes
  • 6 in conversation