BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
chlorium
Calcite | Level 5

I have the following two datasets.

I want to delete the lines from A with dates that falls within the in-out period in file B.

 

File A

SEQ	ID	date	var1
1	A12	03JAN04	242
2	A12	01FEB06	356
3	A12	06JAN08	325
4	A12	28DEC09	123
5	B34	06MAY03	985
6	B34	13JUN03	198
7	B34	10MAY05	241
8	C56	09NOV10	155
9	C56	19OCT13	352

File B

ID	date_in	date_out
A12	01JAN04	04JAN04
A12	05FEB08	08FEB08
B34	03MAY03	06MAY03
B34	09MAY05	19MAY05
C56	12JUL12	18JUL12

For example, for ID=A12, his first obs falls within 01JAN04-04JAN04, thus I delete that obs, but keep the rest.

Essentially, I should get the following results

SEQ	ID	date	var1
2	A12	01FEB06	356
3	A12	06JAN08	325
4	A12	28DEC09	123
6	B34	13JUN03	198
8	C56	09NOV10	155
9	C56	19OCT13	352

 

Both file A and B has multiple records per ID, and the between in-out period in file B is not alway constant.

 

I don't have a lot experience with data manipulation in SAS, I'm really clueless about where to start, eg. should I write a macro that loops through both file some how? or use SQL somehow?

I would greatly appreciate some advice on how to approach this problem.

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Here an approach using SQL 

 

/* option 1: Delete records if target is a database table */
proc sql;
  delete from A
  where exists
    (select * from B where a.id=b.id and a.date between b.date_in and b.date_out)
  ;
quit;

/* option 2: create a new table if target is a SAS file */
proc sql;
  create table want as
  select * from a
  where not exists
    (select * from B where a.id=b.id and a.date between b.date_in and b.date_out)
  ;
quit;

View solution in original post

5 REPLIES 5
Astounding
PROC Star

The first question to answer is what is in your DATE variables?  Are they character strings, or are they SAS dates (numeric, with a date7 format)?

 

PROC CONTENTS will reveal that.

Ksharp
Super User

data A;
infile cards expandtabs truncover;
input SEQ	ID $	date : date9.	var1;
format date date9.;
cards;
1	A12	03JAN04	242
2	A12	01FEB06	356
3	A12	06JAN08	325
4	A12	28DEC09	123
5	B34	06MAY03	985
6	B34	13JUN03	198
7	B34	10MAY05	241
8	C56	09NOV10	155
9	C56	19OCT13	352
;
run;
data B;
infile cards expandtabs truncover;
input ID $	(date_in	date_out) ( : date9.);
format date_in	date_out date9.;
cards;
A12	01JAN04	04JAN04
A12	05FEB08	08FEB08
B34	03MAY03	06MAY03
B34	09MAY05	19MAY05
C56	12JUL12	18JUL12
;
run;
data key;
 set B;
 do date=date_in to date_out;
  output;
 end;
 keep id date;
run;
data want;
 if _n_=1 then do;
  if 0 then set key;
  declare hash h(dataset:'key');
  h.definekey('id','date');
  h.definedone();
 end;
set A;
if h.check()=0 then delete;
run;

chlorium
Calcite | Level 5
Thanks! this works nicely as well. Just learned a new function!
Patrick
Opal | Level 21

Here an approach using SQL 

 

/* option 1: Delete records if target is a database table */
proc sql;
  delete from A
  where exists
    (select * from B where a.id=b.id and a.date between b.date_in and b.date_out)
  ;
quit;

/* option 2: create a new table if target is a SAS file */
proc sql;
  create table want as
  select * from a
  where not exists
    (select * from B where a.id=b.id and a.date between b.date_in and b.date_out)
  ;
quit;
chlorium
Calcite | Level 5
Wow never knew sql has a "between" that solves everything easily! Thanks!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 2302 views
  • 1 like
  • 4 in conversation