SQL Join with Two Tables, Need to Pull in Extra Data

Reply
Occasional Contributor
Posts: 15

SQL Join with Two Tables, Need to Pull in Extra Data

New to SAS. I'm working on an attrition table for membership. I need to see which members stayed, left, or were new from 2014 to 2015. I have a table that is sorted by their ID in year 14 and year 15. I've been able to get it to a point where there is a flag for NEW, TERM, and RETAIN. However, the only data that I can see is the ID and their flagged status.

So I thought, I'll SQL a table to join actual data (i.e. name, dob, address, etc.) to the table above. For some reason, when I pull member data into the table above, it deletes their info for about 10,000 lines.

My code for the first table looks like this:

proc sql;

    create table combine as

        select a.id as id14, b.idas id15, coalesce(a.id,b.id) as id_all

        from _2014active a

        full join _2015active b on a.id= b.id;

quit;

data everything14to15;

    set combine;

    if id14=id15then flag = 'Ret';       

    if id14='' then flag = 'New';

    if id15='' then flag = 'Ter';

run;

THIS WORKS.

To join "everything14to15" I have the following, and it does not work....

proc summary data = sas.memberdata nway missing;

    class name dob address;

    output out = everything1 (drop = _type_ _freq_);

run;

proc sql;

    create table everything2 as

    select a.*, b.*

    from everything14to15 a

    left join everything1 b on a.id15=b.id;

    quit;

I apologize if this is confusing, but any help is appreciated. I need the first table and the second table to join together to show member information - not just their ID and status.

Respected Advisor
Posts: 4,919

Re: SQL Join with Two Tables, Need to Pull in Extra Data

Posted in reply to emmytheduck

Seems to me that merging your datasets would be simpler :

data members;

length status $3;

merge _2014active (in=in2014) _2015active (in=in2015);

by id;

if in2014 and in2015 then status="Ret";

else if in2014 then status="Ter";

else status="New";

run;

(untested)

PG

PG
Occasional Contributor
Posts: 15

Re: SQL Join with Two Tables, Need to Pull in Extra Data

PG - this was very helpful. I've never used merge, and at our firm we typically stick with PROC SQL to put our tables together...to keep everything consistent we all use it. I'm going to have to look more into this. It worked perfectly!

Occasional Contributor
Posts: 15

Re: SQL Join with Two Tables, Need to Pull in Extra Data

I do have a couple of questions. The (in=in2014) - is that naming a field associated with id? Because 'in2014' is not a field in the _2014active table. I'm just trying to understand the step. It works so well and I'd like to use it in the future!

Respected Advisor
Posts: 4,919

Re: SQL Join with Two Tables, Need to Pull in Extra Data

Posted in reply to emmytheduck

I use SQL a lot myself, but for some purposes the datastep simply works better. IN=in2015 is a dataset option that creates a boolean variable named in2015 which indicates if _2015active contributed to the merge. Thus, in the statement merge _2014active (in=in2014) _2015active (in=in2015), in2015 tells you if id was present in dataset _2015active.

By default, automatic variables created with the in= option are not kept in output datasets.

PG

PG
Occasional Contributor
Posts: 15

Re: SQL Join with Two Tables, Need to Pull in Extra Data

Thank you very much for the clarification!

Respected Advisor
Posts: 4,919

Re: SQL Join with Two Tables, Need to Pull in Extra Data

Posted in reply to emmytheduck

If you really need to do this with SQL then try:

proc sql;

create table members as

select * from

     (select *,

          "Ter" as status

     from _2014active

     where id not in (select id from _2015active) )

union all

     (select *,

         case when id in (select id from _2014active) then "Ret" else "New" end as status

     from _2015active )

order by id;

quit;

PG

PG
Occasional Contributor
Posts: 15

Re: SQL Join with Two Tables, Need to Pull in Extra Data

Yeah, I was able to get most of it done with SQL, but I prefer the merge in this scenario. So thank you again for the knowledge!

Ask a Question
Discussion stats
  • 7 replies
  • 317 views
  • 7 likes
  • 2 in conversation