DATA Step, Macro, Functions and more

Remove lines from a file based on in-between period on another file

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

Remove lines from a file based on in-between period on another file

I have the following two datasets.

I want to delete the lines from A with dates that falls within the in-out period in file B.

 

File A

SEQ	ID	date	var1
1	A12	03JAN04	242
2	A12	01FEB06	356
3	A12	06JAN08	325
4	A12	28DEC09	123
5	B34	06MAY03	985
6	B34	13JUN03	198
7	B34	10MAY05	241
8	C56	09NOV10	155
9	C56	19OCT13	352

File B

ID	date_in	date_out
A12	01JAN04	04JAN04
A12	05FEB08	08FEB08
B34	03MAY03	06MAY03
B34	09MAY05	19MAY05
C56	12JUL12	18JUL12

For example, for ID=A12, his first obs falls within 01JAN04-04JAN04, thus I delete that obs, but keep the rest.

Essentially, I should get the following results

SEQ	ID	date	var1
2	A12	01FEB06	356
3	A12	06JAN08	325
4	A12	28DEC09	123
6	B34	13JUN03	198
8	C56	09NOV10	155
9	C56	19OCT13	352

 

Both file A and B has multiple records per ID, and the between in-out period in file B is not alway constant.

 

I don't have a lot experience with data manipulation in SAS, I'm really clueless about where to start, eg. should I write a macro that loops through both file some how? or use SQL somehow?

I would greatly appreciate some advice on how to approach this problem.

Thanks!


Accepted Solutions
Solution
‎07-25-2016 10:40 AM
Respected Advisor
Posts: 4,173

Re: Remove lines from a file based on in-between period on another file

Here an approach using SQL 

 

/* option 1: Delete records if target is a database table */
proc sql;
  delete from A
  where exists
    (select * from B where a.id=b.id and a.date between b.date_in and b.date_out)
  ;
quit;

/* option 2: create a new table if target is a SAS file */
proc sql;
  create table want as
  select * from a
  where not exists
    (select * from B where a.id=b.id and a.date between b.date_in and b.date_out)
  ;
quit;

View solution in original post


All Replies
Super User
Posts: 5,513

Re: Remove lines from a file based on in-between period on another file

The first question to answer is what is in your DATE variables?  Are they character strings, or are they SAS dates (numeric, with a date7 format)?

 

PROC CONTENTS will reveal that.

Super User
Posts: 10,041

Re: Remove lines from a file based on in-between period on another file


data A;
infile cards expandtabs truncover;
input SEQ	ID $	date : date9.	var1;
format date date9.;
cards;
1	A12	03JAN04	242
2	A12	01FEB06	356
3	A12	06JAN08	325
4	A12	28DEC09	123
5	B34	06MAY03	985
6	B34	13JUN03	198
7	B34	10MAY05	241
8	C56	09NOV10	155
9	C56	19OCT13	352
;
run;
data B;
infile cards expandtabs truncover;
input ID $	(date_in	date_out) ( : date9.);
format date_in	date_out date9.;
cards;
A12	01JAN04	04JAN04
A12	05FEB08	08FEB08
B34	03MAY03	06MAY03
B34	09MAY05	19MAY05
C56	12JUL12	18JUL12
;
run;
data key;
 set B;
 do date=date_in to date_out;
  output;
 end;
 keep id date;
run;
data want;
 if _n_=1 then do;
  if 0 then set key;
  declare hash h(dataset:'key');
  h.definekey('id','date');
  h.definedone();
 end;
set A;
if h.check()=0 then delete;
run;

New Contributor
Posts: 3

Re: Remove lines from a file based on in-between period on another file

Thanks! this works nicely as well. Just learned a new function!
Solution
‎07-25-2016 10:40 AM
Respected Advisor
Posts: 4,173

Re: Remove lines from a file based on in-between period on another file

Here an approach using SQL 

 

/* option 1: Delete records if target is a database table */
proc sql;
  delete from A
  where exists
    (select * from B where a.id=b.id and a.date between b.date_in and b.date_out)
  ;
quit;

/* option 2: create a new table if target is a SAS file */
proc sql;
  create table want as
  select * from a
  where not exists
    (select * from B where a.id=b.id and a.date between b.date_in and b.date_out)
  ;
quit;
New Contributor
Posts: 3

Re: Remove lines from a file based on in-between period on another file

Wow never knew sql has a "between" that solves everything easily! Thanks!
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 332 views
  • 1 like
  • 4 in conversation