Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- SAS Procedures
- /
- Special Merge

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 03-05-2008 06:27 PM
(1015 views)

I am needing some help merging a data set that I already have in SAS, with one I have created in Excel. This isn't an ordinary merge, so I'm assuming a merge is the correct procedure.

The small Excel data set has 4 variables which are contained in the large SAS set, as well as two new ones which I computed. What I want to do is match every observation in the SAS set according to the values of the 4 matching variables in the Excel set, and then add the new variable and assign it the corresponding value in excel.

For instance,

**EXCEL **

*A | B | NEWVAR*

1 | 2 | 67

2 | 4 | 89

3 | 6 | 34

etc.... etc....

**SAS**

__A | B__

1 | 2

2 | 4

3 | 6

etc...etc...

A shortcut for doing:

IF A eq 1 and B eq 2 then NEWVAR = 67;

...

So, in short, Im adding a new variable with those specifications.

Any help would be great! Thanks!!

The small Excel data set has 4 variables which are contained in the large SAS set, as well as two new ones which I computed. What I want to do is match every observation in the SAS set according to the values of the 4 matching variables in the Excel set, and then add the new variable and assign it the corresponding value in excel.

For instance,

1 | 2 | 67

2 | 4 | 89

3 | 6 | 34

etc.... etc....

1 | 2

2 | 4

3 | 6

etc...etc...

A shortcut for doing:

IF A eq 1 and B eq 2 then NEWVAR = 67;

...

So, in short, Im adding a new variable with those specifications.

Any help would be great! Thanks!!

2 REPLIES 2

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

One suggestion I would make is to program in SAS the computations you did in Excel. To me this would be the simplest given the information you've provided so far.

If you would prefer to do the calcs in Excel and merge with then I would suggest importing your Excel data into SAS and then running a simple sort and merge. You can merge on more than one variable. You just have to list them in the BY statement (be sure to sort in the same order that you merge). Here's an example:

[pre]

data excel;

input A B Newvar;

cards;

1 2 67

2 4 89

3 6 34

;

run;

data sas;

input A B;

cards;

1 2

2 4

3 6

;

run;

proc sort data=excel; by a b; run;

proc sort data=sas; by a b; run;

data merged;

merge excel sas;

by a b;

run;

[/pre]

If you would prefer to do the calcs in Excel and merge with then I would suggest importing your Excel data into SAS and then running a simple sort and merge. You can merge on more than one variable. You just have to list them in the BY statement (be sure to sort in the same order that you merge). Here's an example:

[pre]

data excel;

input A B Newvar;

cards;

1 2 67

2 4 89

3 6 34

;

run;

data sas;

input A B;

cards;

1 2

2 4

3 6

;

run;

proc sort data=excel; by a b; run;

proc sort data=sas; by a b; run;

data merged;

merge excel sas;

by a b;

run;

[/pre]

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hi:

It's not really a special merge. It's just a merge. Of course, you first have to read your XLS data into a SAS dataset. That's going to require PROC IMPORT or the Excel LIBNAME engine.

So first, I made some data using SAS. In the second step, I called the file PRETEND_XLSDATA -- you would not have EITHER of these steps in your code. But I needed to have some data for the MERGE. Look at the data I made. I included a row in each file that does not exist in the other file, because it's important to consider how you want this situation handled in the output.

[pre]

** make some test data;

data sasdata;

infile datalines;

input a b;

return;

datalines;

1 2

2 4

3 6

5 7

;

run;

** make some data that LOOKS like the data described from Excel;

data pretend_xlsdata;

infile datalines;

input a b newvar;

return;

datalines;

1 2 67

2 4 89

3 6 34

8 3 99

;

run;

[/pre]

Now, all it takes is a DATA step program with a MERGE. When merging, it's always best to use BY variables. You could have done this in an SQL join, too. But you can only create 1 file with each SQL query...and I wanted to illustrate the various possibilities with the MERGE.

I'm going to create 4 different files, by testing the IN= variables for each dataset.The IN= variables are a "flag" or signal about which variables are in which dataset in the MERGE. So, for example, if an observation is coming from the SASDATA data set, then the value of the variable INSAS will be set to 1 (otherwise, it will be 0). If an observation is coming from the PRETEND_XLSDATA file, then the value of INXLS variable will be set to 1 or else it will be 0. These are called Boolean variables where 1=yes and 0=no. So I can have this test:

IF INSAS is the same as IF INSAS=1 then...

IF INXLS is the same as IF INXLS=1 then ...

IF INSAS and NOT INXLS is the same as IF INSAS=1 and INXLS=0 then ...

[pre]

** perform the merge;

** show possible combinations of output tables;

** based on testing the IN= variable;

data makeall inboth sasonly xlsonly;

merge sasdata(in=insas)

pretend_xlsdata(in=inxls);

by a b;

if insas and inxls then output inboth;

if insas or inxls then output makeall;

if insas and not inxls then output sasonly;

else if inxls and not insas then output xlsonly;

run;

** PRINT each file;

ods listing;

proc print data=makeall;

title 'ALL observations from both files';

title2 'note how A=5 and B=7 and A=8 and B=3 obs were included';

run;

proc print data=inboth;

title 'A and B were the same in both files';

title2 'note how A=5 and B=7 and A=8 and B=3 obs were NOT included';

run;

proc print data=sasonly;

title 'Obs that were in SASONLY file but not XLS';

title2 'Good for error checking';

run;

proc print data=xlsonly;

title 'Obs that were in XLS file but not SAS';

title2 'Good for error checking';

run;

[/pre]

If you have duplicate values of A and B in both files, then a MERGE might not be the best technique to use. For more help with this problem, your best bet is to contact Tech Support so they can help you with the best solution for your particular data and your particular merge requirements.

cynthia

Output from the PROC PRINTS:

[pre]

ALL observations from both files

note how A=5 and B=7 and A=8 and B=3 obs were included

Obs a b newvar

1 1 2 67

2 2 4 89

3 3 6 34

4 5 7

5 8 3 99

******************************************************

A and B were the same in both files

note how A=5 and B=7 and A=8 and B=3 obs were NOT included

Obs a b newvar

1 1 2 67

2 2 4 89

3 3 6 34

******************************************************

Obs that were in SASONLY file but not XLS

Good for error checking

Obs a b newvar

1 5 7

******************************************************

Obs that were in XLS file but not SAS

Good for error checking

Obs a b newvar

1 8 3 99

[/pre]

It's not really a special merge. It's just a merge. Of course, you first have to read your XLS data into a SAS dataset. That's going to require PROC IMPORT or the Excel LIBNAME engine.

So first, I made some data using SAS. In the second step, I called the file PRETEND_XLSDATA -- you would not have EITHER of these steps in your code. But I needed to have some data for the MERGE. Look at the data I made. I included a row in each file that does not exist in the other file, because it's important to consider how you want this situation handled in the output.

[pre]

** make some test data;

data sasdata;

infile datalines;

input a b;

return;

datalines;

1 2

2 4

3 6

5 7

;

run;

** make some data that LOOKS like the data described from Excel;

data pretend_xlsdata;

infile datalines;

input a b newvar;

return;

datalines;

1 2 67

2 4 89

3 6 34

8 3 99

;

run;

[/pre]

Now, all it takes is a DATA step program with a MERGE. When merging, it's always best to use BY variables. You could have done this in an SQL join, too. But you can only create 1 file with each SQL query...and I wanted to illustrate the various possibilities with the MERGE.

I'm going to create 4 different files, by testing the IN= variables for each dataset.The IN= variables are a "flag" or signal about which variables are in which dataset in the MERGE. So, for example, if an observation is coming from the SASDATA data set, then the value of the variable INSAS will be set to 1 (otherwise, it will be 0). If an observation is coming from the PRETEND_XLSDATA file, then the value of INXLS variable will be set to 1 or else it will be 0. These are called Boolean variables where 1=yes and 0=no. So I can have this test:

IF INSAS is the same as IF INSAS=1 then...

IF INXLS is the same as IF INXLS=1 then ...

IF INSAS and NOT INXLS is the same as IF INSAS=1 and INXLS=0 then ...

[pre]

** perform the merge;

** show possible combinations of output tables;

** based on testing the IN= variable;

data makeall inboth sasonly xlsonly;

merge sasdata(in=insas)

pretend_xlsdata(in=inxls);

by a b;

if insas and inxls then output inboth;

if insas or inxls then output makeall;

if insas and not inxls then output sasonly;

else if inxls and not insas then output xlsonly;

run;

** PRINT each file;

ods listing;

proc print data=makeall;

title 'ALL observations from both files';

title2 'note how A=5 and B=7 and A=8 and B=3 obs were included';

run;

proc print data=inboth;

title 'A and B were the same in both files';

title2 'note how A=5 and B=7 and A=8 and B=3 obs were NOT included';

run;

proc print data=sasonly;

title 'Obs that were in SASONLY file but not XLS';

title2 'Good for error checking';

run;

proc print data=xlsonly;

title 'Obs that were in XLS file but not SAS';

title2 'Good for error checking';

run;

[/pre]

If you have duplicate values of A and B in both files, then a MERGE might not be the best technique to use. For more help with this problem, your best bet is to contact Tech Support so they can help you with the best solution for your particular data and your particular merge requirements.

cynthia

Output from the PROC PRINTS:

[pre]

ALL observations from both files

note how A=5 and B=7 and A=8 and B=3 obs were included

Obs a b newvar

1 1 2 67

2 2 4 89

3 3 6 34

4 5 7

5 8 3 99

******************************************************

A and B were the same in both files

note how A=5 and B=7 and A=8 and B=3 obs were NOT included

Obs a b newvar

1 1 2 67

2 2 4 89

3 3 6 34

******************************************************

Obs that were in SASONLY file but not XLS

Good for error checking

Obs a b newvar

1 5 7

******************************************************

Obs that were in XLS file but not SAS

Good for error checking

Obs a b newvar

1 8 3 99

[/pre]

**Available on demand!**

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.