BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aminkarimid
Lapis Lazuli | Level 10

Hello everybody,

I have table A and B which have name and date variables and I want to remove observations in table A with the same name and date in table B. In addition, for the same name and date between table A and B, I want to delete the next date after it in table A.

 

The table A is:

 

 

data WORK.TABLEA;
infile datalines dsd truncover;
input name:$3. date:DATE9.;
format date DATE9.;
label name="name" date="date";
datalines4;
A, 7-May-08
A, 11-Jul-08
A, 3-Jan-09
A, 4-Jan-09
A, 8-Jul-09
A, 24-Jul-09
A, 8-Dec-09
B, 7-May-08
B, 10-May-08
B, 17-May-08
B, 24-May-08
B, 1-Jun-08
B, 18-Jun-08
B, 9-May-09
C, 7-Oct-09
C, 17-Oct-09
C, 25-Oct-09
C, 18-Nov-09
C, 4-Dec-09
C, 19-Dec-09
C, 9-May-10
C, 9-May-10
C, 10-May-10
;;;;
run;

 

 

And the table B is:

 

data WORK.TABLEB;
infile datalines dsd truncover;
input name:$3. date:DATE9.;
format date DATE9.;
label name="name" date="date";
datalines4;
A,11-Jul-08
A,3-Jan-09
A,24-Jul-09
B,7-May-08
B,17-May-08
B,18-Jun-08
B,9-Jul-09
C,17-Oct-09
C,4-Dec-09
C,19-Dec-09
;;;;
run;

 

So the result is:

 

+------+-----------+
| name |   date    |
+------+-----------+
| A    | 7-May-08  |
| A    | 8-Jul-09  |
| B    | 1-Jun-08  |
| C    | 7-Oct-09  |
| C    | 18-Nov-09 |
| C    | 10-May-10 |
+------+-----------+

How can I do that?

Thanks in advance.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

That is to say you don't like to use PROC SORT ?

 

data WORK.TABLEA;
infile datalines dsd truncover;
input name:$3. date:DATE9.;
format date DATE9.;
label name="name" date="date";
datalines4;
A, 7-May-08
A, 11-Jul-08
A, 3-Jan-09
A, 4-Jan-09
A, 8-Jul-09
A, 24-Jul-09
A, 8-Dec-09
B, 7-May-08
B, 10-May-08
B, 17-May-08
B, 24-May-08
B, 1-Jun-08
B, 18-Jun-08
B, 9-May-09
C, 7-Oct-09
C, 17-Oct-09
C, 25-Oct-09
C, 18-Nov-09
C, 4-Dec-09
C, 19-Dec-09
C, 9-May-10
C, 9-May-10
C, 10-May-10
;;;;
run;

data WORK.TABLEB;
infile datalines dsd truncover;
input name:$3. date:DATE9.;
format date DATE9.;
label name="name" date="date";
datalines4;
A,11-Jul-08
A,3-Jan-09
A,24-Jul-09
B,7-May-08
B,17-May-08
B,18-Jun-08
B,9-Jul-09
C,17-Oct-09
C,4-Dec-09
C,19-Dec-09
;;;;
run;
data temp;
 set tableb(in=inb) tablea;
 by name date;
 group+first.date;
 _inb=inb;
run;
data key;
 set temp(where=(_inb=1));
 output;
 group=group+1;
 output;
 keep name group;
run;
proc sql;
select *
 from temp 
  where catx(' ',name,group) not in
   (select catx(' ',name,group) from key);
quit;

View solution in original post

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Something like:

proc sql;
  create table WANT as 
  select A.*
  from   TABLEA A
  left join TABLEB B
  on       A.NAME=B.NAME
  and     (A.DATE=B.DATE or A.DATE=B.DATE+1)
  where  B.NAME="";
quit;

But I am afraid your required output doesn't seem to work, why does 08Dec for A not come out, it should, there is no 1 day around it in B?

PeterClemmensen
Tourmaline | Level 20

"In addition, for the same name and date between table A and B, I want to delete the next date after it in table A." @aminkarimid, do you mean the next date or the next observation?

aminkarimid
Lapis Lazuli | Level 10
Thanks @PeterClemmensen
I mean name and date observations for next date in table A.
Ksharp
Super User
data WORK.TABLEA;
infile datalines dsd truncover;
input name:$3. date:DATE9.;
format date DATE9.;
label name="name" date="date";
datalines4;
A, 7-May-08
A, 11-Jul-08
A, 3-Jan-09
A, 4-Jan-09
A, 8-Jul-09
A, 24-Jul-09
A, 8-Dec-09
B, 7-May-08
B, 10-May-08
B, 17-May-08
B, 24-May-08
B, 1-Jun-08
B, 18-Jun-08
B, 9-May-09
C, 7-Oct-09
C, 17-Oct-09
C, 25-Oct-09
C, 18-Nov-09
C, 4-Dec-09
C, 19-Dec-09
C, 9-May-10
C, 9-May-10
C, 10-May-10
;;;;
run;

data WORK.TABLEB;
infile datalines dsd truncover;
input name:$3. date:DATE9.;
format date DATE9.;
label name="name" date="date";
datalines4;
A,11-Jul-08
A,3-Jan-09
A,24-Jul-09
B,7-May-08
B,17-May-08
B,18-Jun-08
B,9-Jul-09
C,17-Oct-09
C,4-Dec-09
C,19-Dec-09
;;;;
run;
proc sort data=tablea out=tablea1 nodupkey;
by name date;
run;
proc sort data=tableb out=tableb1 nodupkey;
by name date;
run;
data want;
 merge  tablea1 tableb1(in=inb) ;
 by name date;
 if inb or (inb=0 and lag(inb)=1 and not first.name) then delete;;
run;
aminkarimid
Lapis Lazuli | Level 10

Thanks @Ksharp;

Now, I want to go one step further. If table A has another variable like time which is shown below:

 

data WORK.TABLEA;
infile datalines dsd truncover;
input name:$3. date:DATE9. time:TIME8.;
format date DATE9. time TIME8.;
label name="name" date="date" time="time";
datalines4;
A, 7-May-08, 11:12:41
A, 11-Jul-08, 11:23:41
A, 3-Jan-09, 11:31:41
A, 4-Jan-09, 11:32:41
A, 8-Jul-09, 11:32:41
A, 24-Jul-09, 11:32:41
A, 8-Dec-09, 12:32:41
B, 7-May-08, 11:31:41
B, 10-May-08, 11:32:41
B, 17-May-08, 11:33:41
B, 24-May-08, 11:34:41
B, 1-Jun-08, 11:35:41
B, 18-Jun-08, 11:36:41
B, 9-May-09, 11:37:41
C, 7-Oct-09, 11:21:41
C, 17-Oct-09, 11:22:41
C, 25-Oct-09, 11:32:41
C, 18-Nov-09, 11:33:41
C, 4-Dec-09, 11:12:41
C, 19-Dec-09, 10:22:41
C, 9-May-10, 11:42:41
C, 9-May-10, 11:12:41
C, 10-May-10, 12:52:41
;;;;
run;

 

Using the nodupkey option is not a good solution to delete same observations. because there are maybe another variables like price, turnover, etc or the duplicate observations is important and must be on a table. So your last part of your code needs to recognize observations with the same date which should be deleted and then delete them.

Ksharp
Super User

That is to say you don't like to use PROC SORT ?

 

data WORK.TABLEA;
infile datalines dsd truncover;
input name:$3. date:DATE9.;
format date DATE9.;
label name="name" date="date";
datalines4;
A, 7-May-08
A, 11-Jul-08
A, 3-Jan-09
A, 4-Jan-09
A, 8-Jul-09
A, 24-Jul-09
A, 8-Dec-09
B, 7-May-08
B, 10-May-08
B, 17-May-08
B, 24-May-08
B, 1-Jun-08
B, 18-Jun-08
B, 9-May-09
C, 7-Oct-09
C, 17-Oct-09
C, 25-Oct-09
C, 18-Nov-09
C, 4-Dec-09
C, 19-Dec-09
C, 9-May-10
C, 9-May-10
C, 10-May-10
;;;;
run;

data WORK.TABLEB;
infile datalines dsd truncover;
input name:$3. date:DATE9.;
format date DATE9.;
label name="name" date="date";
datalines4;
A,11-Jul-08
A,3-Jan-09
A,24-Jul-09
B,7-May-08
B,17-May-08
B,18-Jun-08
B,9-Jul-09
C,17-Oct-09
C,4-Dec-09
C,19-Dec-09
;;;;
run;
data temp;
 set tableb(in=inb) tablea;
 by name date;
 group+first.date;
 _inb=inb;
run;
data key;
 set temp(where=(_inb=1));
 output;
 group=group+1;
 output;
 keep name group;
run;
proc sql;
select *
 from temp 
  where catx(' ',name,group) not in
   (select catx(' ',name,group) from key);
quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 662 views
  • 2 likes
  • 4 in conversation