Use match-merge or proc sql to combine datasets and conditionally create variables

Accepted Solution Solved
Reply
Contributor kbk
Contributor
Posts: 29
Accepted Solution

Use match-merge or proc sql to combine datasets and conditionally create variables

Hi All,

I have a number of datasets with common and unique variables. I want to create a master list of all the variables and add columns that indicate which datasets have which variables. I figured out a way to do this with proc sql and an outer join (see below) but it only really works with 2 tables. It appears a match-merge approach with a datastep is the way to go but I'm not sure how to add the checklist columns. Any suggestions?

Here is what I have so far:

data work.table1;

    input name $ desc $;

    datalines;

a descA

b descB

c descC

d descD

run;

data work.table2;

    input name $ desc $;

    datalines;

b descB

d descD

e descE

f descF

run;

data work.table3;

    input name $ desc $;

    datalines;

b descB

c descC

f descF

g descG

h descH

run;

* this produces what I want but only works for 2 tables;

* merge the files and add a checklist column for tables 1 and 2;

proc sql;

    CREATE TABLE checkListSQL AS

    SELECT

        CASE

            WHEN a.name = b.name THEN a.name

            WHEN a.name = '' THEN b.name

            WHEN b.name = '' THEN a.name

            ELSE 'NULL'

            END AS Variable length = 10,

        CASE

            WHEN a.name ~= '' then a.Desc

            ELSE b.DESC

            END as Description length = 50,

        CASE

            WHEN a.name = b.name THEN 'X'

            WHEN a.name ~= '' THEN 'X'

            ELSE ''

            END as table1 length = 10,

        CASE

            WHEN b.name = a.name THEN 'X'

            WHEN b.name ~= '' THEN 'X'

            ELSE ''

            END as table2 length =10

    FROM table1 as a

    FULL JOIN table2 as b

    ON a.name = b.name;

quit;

data checkListMerge;

    merge table1 table2 table3;

    by name; *data already sorted;

    *How to add the checklist portion via a data step?;

run;


Accepted Solutions
Solution
‎02-15-2013 09:27 AM
Super Contributor
Posts: 276

Re: Use match-merge or proc sql to combine datasets and conditionally create variables

Hi..

How About..

data checkListMerge;

    merge table1(in=a) table2(in=b) table3(in=c);

    by name; ;

  if a then Table1='X'; 

    if b then Table2='X'; 

   if c then Table3='X'; 

run;

Regards.

Sanjeev.K

View solution in original post


All Replies
Solution
‎02-15-2013 09:27 AM
Super Contributor
Posts: 276

Re: Use match-merge or proc sql to combine datasets and conditionally create variables

Hi..

How About..

data checkListMerge;

    merge table1(in=a) table2(in=b) table3(in=c);

    by name; ;

  if a then Table1='X'; 

    if b then Table2='X'; 

   if c then Table3='X'; 

run;

Regards.

Sanjeev.K

Contributor kbk
Contributor
Posts: 29

Re: Use match-merge or proc sql to combine datasets and conditionally create variables

Posted in reply to kuridisanjeev

That's almost too easy. Thanks!

I saw code similar to this while I was googling the problem but the syntax was so brief I couldn't understand the logic. Could I bother you for a layperson's explanation?

Super Contributor
Posts: 276

Re: Use match-merge or proc sql to combine datasets and conditionally create variables

Hi..

For your understanding i modified the code..


data checkListMerge;

    merge table1(in=a) table2(in=b) table3(in=c);

    by name;

  a1=a;

  b1=b;

  c1=c;

run;

   

suppose If the value of A1 variable is 1 ,it means BY variable value(in your code BY variable is Name) is Exist in Table1,if  A1 value is 0 ,which means BY variable value not present in Table1.


For Example Name 'e' is present in only Table2.

So that is what for  'e' name,b1 value is 1 and A1,C1values are 0('e' not exist in Table1 and Table3)..



Hope You understand the logic.


Regards.


Sanjeev.K

Message was edited by: sajeev kuridi

Contributor kbk
Contributor
Posts: 29

Re: Use match-merge or proc sql to combine datasets and conditionally create variables

Posted in reply to kuridisanjeev

Thanks again. I'll just add this text from a sas help page.


Understanding the IN= Data Set Option

When you create a new data set by combining observations from two or more data sets, knowing which data set an observation came from can be useful. For example, you might want to perform a calculation based on which data set contributed an observation. Otherwise, you might lose important contextual information that you need for later processing. You can determine which data set contributed a particular observation by using the IN= data set option.

The IN= data set option enables you to determine which data sets have contributed to the observation that is currently in the program data vector. The syntax for this option on the SET statement is

SET SAS-data-set-1 (IN=variable) SAS-data-set-2;
BY a-common-variable;

When you use the IN= option with a data set in a SET, MERGE, MODIFY, or UPDATE statement, SAS creates a temporary variable associated with that data set. The value of variable is 1 if the data set has contributed to the observation currently in the program data vector. The value is 0 if it has not contributed. You can use the IN= option with any or all the data sets you name in a SET, MERGE, MODIFY, or UPDATE statement, but use a different variable name in each case.

Note:   The IN= variable exists during the execution of the DATA step only; it is not written to the output data set that is created.

Super Contributor
Posts: 276

Re: Use match-merge or proc sql to combine datasets and conditionally create variables

Yup..

You can also Perform All Types of joins in datastep with IN= options.

For more info..

http://www.nesug.org/Proceedings/nesug11/ds/ds03.pdf

Regards,

Sanjeev.K

Super Contributor
Posts: 276

Re: Use match-merge or proc sql to combine datasets and conditionally create variables

Found this one also..

http://www2.sas.com/proceedings/sugi30/249-30.pdf

Reagrds.

Sanjeev.K

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 351 views
  • 4 likes
  • 2 in conversation