Help using Base SAS procedures

how to do left join in sas dataset

Accepted Solution Solved
Reply
Contributor
Posts: 20
Accepted Solution

how to do left join in sas dataset

Hi,

I have a quick question I have the following code

proc sql;

     create table want as

     select label, value

     from dataSet1 left join dataSet2

    on start<=value<=end

    group by label

    order start;

run;

The thing is this code is not working if I want to replicate this sql statement using a data step how would I do this

it should be a left join

data want

set dataSet1(in=a) dataSet2(in=b);

//not sure what do here

run;


Accepted Solutions
Solution
‎09-29-2017 03:01 PM
N/A
Posts: 1

Re: how to do left join in sas dataset

Posted in reply to johnhuang12

i think you can do left join in data step using Merge.Lets say I have two datasets Table1 and Table2 which have has common variable ID.We first need to sort the two datasets by ID.After Sorting the twosets we can code as :

Proc Sql;

Create table NEW as

Select T1.*,T2.* from

Table1 as T1 left join Table2 as T2

on T1.id=T2.id;

Quit;

This will give matching rows with the non-matching rows from left table in the new dataset NEW.

Similarly in data step:

Data NEW;

Merge Table1(in=T1) Table2(in=T2);

If T1;

by ID;

run;

I suppose this works.Apologies if it does not help out.

View solution in original post


All Replies
Super User
Posts: 5,424

Re: how to do left join in sas dataset

Posted in reply to johnhuang12

1. Why do you want to use a data step?

2. Search these forums and support.sas.com - this is one of the most frequent discussions in the history of SAS...

Data never sleeps
Super Contributor
Posts: 339

Re: how to do left join in sas dataset

Posted in reply to johnhuang12

You would first need to check your syntax and error log if a piece of code does not work. If it doesn't produce expected results, then it might be a functionality does not oblige to your requirements.

proc sql;

     create table want as

     select label, value

     from dataSet1 left join dataSet2

    on start<=value<=end

    group by label

    order by start;

run;

quit;

There is also typically a good practice when using SQL to use the dot notation to prefix variable names as per their source i.e.

proc sql;

     create table want as

     select a.label, a.value

     from dataSet1 as a left join dataSet2 as b

    on b.start<=a.value<=b.end

    group by a.label

    order by b.start;

quit;

Additionally, I don't think (but I might be wrong here) that SQL standards use the group by statement in a similar fashion to SAS data step with a merge statement. That is, to achieve the equivalent by statement, you would normally want to have something like

proc sql;

     create table want as

     select a.label, a.value

     from dataSet1 as a left join dataSet2 as b

    on  a.label=b.label and

          b.start<=a.value<=b.end

    group by a.label

    order by b.start;

quit;

Furthermore there is an ambiguity with your order by statement (At least based on my best guess of what you were trying to achieve) since the LEFT join retains records from dataset1 which did not match and thus have no start and end values tied to the record.

The SQL group by statement is used to define what summary functions "sumarize by". The general idea is that all variables in the group by statement are found in the select statement and all other variables in the select statement use sumary functions so as to generate a single record per group by unique key. Otherwise, you generate a cartesian product.

Beyond that, I must agree with LinusH. Data step is not appropriate for all merges and it is why proc sql exists and is built in sas. It might be viable depending on your data but likely not as it is even less fuzz match friendly than proc sql (the fuzz occur with start<=value<=end condition). Other data step alternatives involving hash object are more flexible in terms of what can be achieved (when compared with a merge statement) but it is best to be familiar with proc sql prior to getting into the hash object as the object has memory limitations.

Hope this helps

Vincent

Solution
‎09-29-2017 03:01 PM
N/A
Posts: 1

Re: how to do left join in sas dataset

Posted in reply to johnhuang12

i think you can do left join in data step using Merge.Lets say I have two datasets Table1 and Table2 which have has common variable ID.We first need to sort the two datasets by ID.After Sorting the twosets we can code as :

Proc Sql;

Create table NEW as

Select T1.*,T2.* from

Table1 as T1 left join Table2 as T2

on T1.id=T2.id;

Quit;

This will give matching rows with the non-matching rows from left table in the new dataset NEW.

Similarly in data step:

Data NEW;

Merge Table1(in=T1) Table2(in=T2);

If T1;

by ID;

run;

I suppose this works.Apologies if it does not help out.

Super User
Super User
Posts: 7,039

Re: how to do left join in sas dataset

Posted in reply to johnhuang12

What do you want it to do?  Your choice of variable names makes it look like you want to create FORMAT and apply it to your data.

data dataset1;

  input value @@;

cards;

1 23 3 5 15

run;

data dataset2 ;

  input label $ start end ;

cards;

LOW 0 10

HIGH 10 100

run;

data fmt ;

  retain fmtname 'DECODE' fmttype 'N';

  set dataset2 ;

run;

proc format cntlin=fmt;

run;

data want;

set dataset1;

label = put(value,decode.);

run;

proc print;

run;

Obs    value    label

1        1     LOW

2       23     HIGH

3        3     LOW

4        5     LOW

5       15     HIGH

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 6645 views
  • 0 likes
  • 5 in conversation