BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Sir_Highbury
Quartz | Level 8

Dear experts,

 

I should merge two data sets on two variables but taking the observation on the left part of the join only once. The output below generated should have for the second and fourth observation a value=""

 

DATA data1;
INPUT code $2. location $3. location2 $4.;
DATALINES;
61 UK ABC
45 CH DEF
03 CH DEF
11 JP GHI
14 JP GHI
;


DATA data2;
INPUT value $3. location $3. location2 $4.;
DATALINES;
100 UK ABC
200 CH DEF
300 JP GHI
;

proc sql; create table merge as select t0.*, t1.* from data1 t0 left outer join data2 t1
on (t1.location=t0.location and t1.location2=t0.location2) ;quit;

 

Thanks a lo in advance. BRs, SH

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Try the data step method:

proc sort data=data1;
by location location2;
run;

proc sort data=data2;
by location location2;
run;

data merge2;
merge
  data1 (in=a)
  data2 (in=b)
;
by location location2;
if a;
if first.location2;
run;

You might want to consider additional sort criteria for data1 to put the desired code first.

View solution in original post

5 REPLIES 5
LinusH
Tourmaline | Level 20

That'snot how SQL works. You requirement is row (observation) oriented, while SQL is column oriented.

SQL first performs a Cartesian join (in theory) and the filters using the on (or where criteria when used).

You need to use a data step to solve this.

Perhaps you need a better data model so you can describe which is the first observation in the "join group" (in case of resorting for instance), and why.

Data never sleeps
Sir_Highbury
Quartz | Level 8

I just showed that I was not able to do it (using sql) and I asked how to get the output I want from the input I have. Where did I write to do it necessarily in sql? The only implicit constrain is using SAS. Pheraphs what I was doing is not related to any data model? Please try to answer accordingly to the question, thanks anyway for your effort.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Thats not a logical join though.  There is no way of knowing, from the data, that any row is "first".  This is information you need to put in before joining, and example given below.  Note alo that the use of * causes warnings as variables appear in multiple tables - its not a good idea to use this syntax.

data data1;
  input code $2. location $3. location2 $4. first;
datalines;
61 UK ABC 1
45 CH DEF 1
03 CH DEF 0
11 JP GHI 1
14 JP GHI 0
;
run;

data data2;
  input value $3. location $3. location2 $4. first;
datalines;
100 UK ABC 1
200 CH DEF 1
300 JP GHI 1
; 
run;

proc sql; 
  create table MERGE as 
  select  T0.*, 
          T1.* 
  from    DATA1 T0 
  left join DATA2 T1
  on      T1.LOCATION=T0.LOCATION 
  and     T1.LOCATION2=T0.LOCATION2
  and     T1.FIRST=T0.FIRST;
quit;
Kurt_Bremser
Super User

Try the data step method:

proc sort data=data1;
by location location2;
run;

proc sort data=data2;
by location location2;
run;

data merge2;
merge
  data1 (in=a)
  data2 (in=b)
;
by location location2;
if a;
if first.location2;
run;

You might want to consider additional sort criteria for data1 to put the desired code first.

Astounding
PROC Star

The DATA step is a workable approach here.  SQL does not guarantee any "order" to the incoming records.

 

Here are a couple of variations to consider on the DATA step approach.

 

Instead of sorting/merging by LOCATION LOCATION2, you could sort/merge by LOCATION2 LOCATION.  In the sample data you have provided, that would preserve the current order of the observations (and could conceivably eliminate the need for sorting).

 

For the final statement, you could consider this variation:

 

if first.location2=0 then value=' ';

 

That would keep all records, but reset VALUE to a blank value.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 15041 views
  • 6 likes
  • 5 in conversation