data merge

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 6
Accepted Solution

data merge

I have sas data

data1

IDIN
1A
2B
3C
4C

data2

INEX
AD
AE
AF
BD
BE
CE
CF

I want data3

IDINEX
1AD
1AE
1AF
2BD
2BE
3CE
3CF
4CE
4CF

I sort file 1, file 2 by in, then merge data 1 data 2 or merge data2 data1, results are different. What's happening? Any better way to get data 3? Thank you!

data temp;

merge data 1 data 2;

by in;

run;

data temp;

merge data 2 data 1;

by in;

run;


Accepted Solutions
Solution
‎08-25-2013 02:41 AM
Super Contributor
Posts: 307

Re: data merge

Posted in reply to SASUserWhoNeedHelp

Try this (untested):

proc sql;

     create table want as

     select t1.*, t2.EX

     from data1 t1

          , data2 t2

     where t1.IN = t2.IN

     order by ID, IN, EX

;

quit;

View solution in original post


All Replies
Occasional Contributor
Posts: 6

Re: data merge

Posted in reply to SASUserWhoNeedHelp

also, looks like i can not get data 3 by merging data1, data2 by variable in, no matter i merge data1 data 2, or merge data 2 data1.

But I don't know how to expand A to D,E,F, B to D,E, C to E, F ....in data 1.....

Solution
‎08-25-2013 02:41 AM
Super Contributor
Posts: 307

Re: data merge

Posted in reply to SASUserWhoNeedHelp

Try this (untested):

proc sql;

     create table want as

     select t1.*, t2.EX

     from data1 t1

          , data2 t2

     where t1.IN = t2.IN

     order by ID, IN, EX

;

quit;

New Contributor
Posts: 2

Re: data merge

Posted in reply to SASUserWhoNeedHelp

can you try using this piece of code.

i remember learning about such a situation in sas advanced prep guide:

data _3;

set _1;

do i =1 to num;

set _2(rename=(in = in1)) nobs=num point=i;

if in = in1 then output;

end;

run;

proc print data = _last_;

run;

New Contributor
Posts: 2

Re: data merge

Posted in reply to sachu69in

where _1 and _2 are the input datasets. you can try changing the order of the datasets to see if it works any which ways...hope i am making sense.

Respected Advisor
Posts: 3,156

Re: data merge

Posted in reply to SASUserWhoNeedHelp

@Fugue's approach will be the one I would choose to use as well, cause we can take advantage of Proc SQL's native engine to do this Cartesian product kind of join by groups. However, if you have the curious mind (like most of us on this forum) on how to achieve the same results using Data step, then Hash() will most likely come into play.

data data1;

input ID IN $;

cards;

1 A

2 B

3 C

4 C

;

data data2;

input IN$ EX$;

cards;

A D

A E

A F

B D

B E

C E

C F

;

data data3;

  if _n_=1 then do;

     if 0 then set data1 data2;

  declare hash h(dataset:'data2', multidata:'y');

  h.definekey('in');

  h.definedata('ex');

  h.definedone();

  end;

  set data1;

    _n_=h.find();

    do _n_=0 by 0 while (_n_=0);

   output;

   _n_=h.find_next();

  end;

run;

Without involving Hash(), Data step + Index might have a shot as well, with a little more overhead I guess. @sachu69in's method will work , but with too much overhead.

HTH,

Haikuo

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 275 views
  • 1 like
  • 4 in conversation