BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mjheever
Obsidian | Level 7

Hi everyone,

I have the following data sets:

Dataset A:

IDDATE

123

31JAN2014
12331JUL2014
12331AUG2014
22231MAR2014
22230APR2014
33330JUN2014
42128FEB2014
23531MAR2014
99131MAY2014
99131OCT2014
99131DEC2014

And Dataset B:

IDDATEMEMBERSHIP
12328FEB2014PURPLE
12330APR2014BRONZE
12331JUL2014SILVER
12331DEC2014GOLD
22228FEB2014PURPLE
22231MAR2014BRONZE
22231JUN2014SILVER
23528FEB2014SILVER
23531MAY2014GOLD

The result I want:

Dataset Want:

IDDATEDATE2MEMBERSHIP
12331JAN201428FEB2014PURPLE
12331JUL201431JUL2014SILVER
12331AUG201431JUL2014SILVER
22231MAR201431MAR2014BRONZE
22230APR201431MAR2014BRONZE
33330JUN2014.
42128FEB2014.
23531MAR201428FEB2014GOLD
99131MAY2014.
99131OCT2014.
91131DEC2014.

Essentially I want to check a customers status as close as possible to dates from table A.

I have done the following code:

proc sql;

    create table finaldsn as

select distinct a.ID,

a.DATE,

b.DATE2,

b.MEMBERSHIP

, a.DATE-b.DATE2 as days

, abs(calculated days) as absdays

, min(calculated absdays)as close

from DatasetA as a

full join

DatasetB as b

on a.ID=b.ID

where a.DATE ne . and b.DATE2 ne .

and b.ID in (select distinct ID

from DatasetA)

group by a.ID, a.Date

having calculated absdays=calculated close

order by ID, Date, close

;

quit;

data want;

     set finaldsn;

by ID Date;

if first.ID;

run;

But this only gives the first entry of each unique ID regardless of the date associated with it.

Anyone got ideas how to work around it ?

Thank you Smiley Happy.

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data a;
input id date :date9.;
format date date9.;
cards;
123 31JAN2014
123     31JUL2014
123     31AUG2014
222     31MAR2014
222     30APR2014
333     30JUN2014
421     28FEB2014
235     31MAR2014
991     31MAY2014
991     31OCT2014
991     31DEC2014
;
run;
data b;
input id date : date9. membership $;
format date date9.;
cards;
123     28FEB2014     PURPLE
123     30APR2014     BRONZE
123     31JUL2014     SILVER
123     31DEC2014     GOLD
222     28FEB2014     PURPLE
222     31MAR2014     BRONZE
222     30JUN2014     SILVER
235     28FEB2014     SILVER
235     31MAY2014     GOLD
;
run;
proc sql;
 create table want(drop=dif) as
  select a.*,b.date as date2,b.membership,abs(a.date-b.date) as dif
   from a left join b
    on a.id=b.id 
      group by a.id,a.date
       having calculated dif=min(calculated dif);
quit;

Xia Keshan

View solution in original post

4 REPLIES 4
mjheever
Obsidian | Level 7

One solution I'm now working on is to make a new a unique field to identify the observation:

-> NewID = ID + DATE (for example 12331JAN2014)

-> And then apply the first. statement to the new identifier, NewID.

Although I don't think this is a very efficient way regarding the problem, especially if you are working with a large data set ?

Kurt_Bremser
Super User

I'd try to start with expanding the first dataset:

%let offset_max=100;* or whatever suits your needs;

data int1;

set a;

do offset = 1 to &offset_max;

  target_date = date - offset;

  output;

  target_date = date + offset;

  output;

end;

run;

Now sort that dataset by id, target_date and offset

If you now merge on id and date = target_date with dataset B and keep the smallest offset, you should get your intended result.

jwillis
Quartz | Level 8

I created dataa and datab datasets changing 31JUN2014 to 30JUN2014.  The only problem I had was with ID 235.  Your 'want' says 235 should be gold.  My results say it should be SILVER. I arbitrarily gave days the value of 1000000 when it would calculate as missing.

proc sql;

drop table testit;

create table testit as

select distinct  a. ID,

a.date format=date9.,

case

when b.id  = ' ' then .

else b.date2 end as date2 format=date9.,

case

when b.id = ' ' then ' '

else b.membership end as MEMBERSHIP,

case

when b.id = ' ' then 1000000

else a.DATE-b.DATE2 end as days

, abs(calculated days) as absdays

, min(calculated absdays)as close

from DataA as a full join

DataB as b

on  a.ID=b.ID 

group by a.ID, a.date

order by id, close, date, absdays, date2

;

quit;

proc sort data=testit;

by id date close absdays;

run;

data testit3;

set testit;

by id date close absdays;

if first.date then output;

else delete;

keep id date date2 membership;

run;

proc print data=testit3;run;

  ID      DATE              date2    MEMBERSHIP    days    absdays    close

   235    31MAR2014    28FEB2014      SILVER        31       31        31

   235    31MAR2014    31MAY2014      GOLD         -61       61        31

                              Obs    ID          DATE        date2    MEMBERSHIP

                                1    123    31JAN2014    28FEB2014      PURPLE

                                2    123    31JUL2014    31JUL2014      SILVER

                                3    123    31AUG2014    31JUL2014      SILVER

                                4    222    31MAR2014    31MAR2014      BRONZE

                                5    222    30APR2014    31MAR2014      BRONZE

                                6    235    31MAR2014    28FEB2014      SILVER

                                7    333    30JUN2014            .

                                8    421    28FEB2014            .

                                9    991    31MAY2014            .

                               10    991    31OCT2014            .

                               11    991    31DEC2014            .

Ksharp
Super User
data a;
input id date :date9.;
format date date9.;
cards;
123 31JAN2014
123     31JUL2014
123     31AUG2014
222     31MAR2014
222     30APR2014
333     30JUN2014
421     28FEB2014
235     31MAR2014
991     31MAY2014
991     31OCT2014
991     31DEC2014
;
run;
data b;
input id date : date9. membership $;
format date date9.;
cards;
123     28FEB2014     PURPLE
123     30APR2014     BRONZE
123     31JUL2014     SILVER
123     31DEC2014     GOLD
222     28FEB2014     PURPLE
222     31MAR2014     BRONZE
222     30JUN2014     SILVER
235     28FEB2014     SILVER
235     31MAY2014     GOLD
;
run;
proc sql;
 create table want(drop=dif) as
  select a.*,b.date as date2,b.membership,abs(a.date-b.date) as dif
   from a left join b
    on a.id=b.id 
      group by a.id,a.date
       having calculated dif=min(calculated dif);
quit;

Xia Keshan

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1456 views
  • 3 likes
  • 4 in conversation