Merging the data sets using macro code

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 14
Accepted Solution

Merging the data sets using macro code

I am currently taking an intro to SAS class and was instructed to complete the same thing this code does using Macro statements. 

 

proc import out=datafile1
datafile= 'C:\Users\savanahb\Downloads\datafile1-2.xlsx'
DBMS= EXCEL2000 REPLACE;
getnames=yes;
run;

proc import out=datafile2
datafile= 'C:\Users\savanahb\Downloads\datafile2-2.xlsx'
DBMS= EXCEL2000 REPLACE;
getnames=yes;
run;

proc import out=datafile3
datafile= 'C:\Users\savanahb\Downloads\datafile3-2.xlsx'
DBMS= EXCEL2000 REPLACE;
getnames=yes;
run;

data datafile123;
merge datafile1 datafile2 datafile3;
by id;
run;

data datafile123;
merge datafile1 (in=a) datafile2;
by id;
if a;
run;

data datafile123;
merge datafile2 (in=a) datafile3;
by ID;
if a;
run;

data datafile123;
merge datafile1 datafile3 (in=a);
by ID;
if a;
run;

data datafile123;
merge datafile1 (in=a) datafile2 (in=b) datafile3 (in=c);
by ID;
if a and b and c;
run;

proc print data=datafile123;
title 'Tabulation of data from merged dataset Datafile123';
run;

 

Unfortunately, I am stuck trying to figure out how to only show the variables that have the same ID in all three datasets. Here is what I have come up with:

 

%let u=datafile1;
%let v=datafile1-3.xlsx;


proc import out= &u
datafile= "C:\Users\savanahb\Downloads\&v"
dbms=xlsx replace;
getnames=yes;
run;

%let w=datafile2;
%let x=datafile3-3.xlsx;


proc import out= &w
datafile= "C:\Users\savanahb\Downloads\&x"
dbms=xlsx replace;
getnames=yes;
run;

%let y=datafile3;
%let z=datafile3-3.xlsx;


proc import out= &y
datafile= "C:\Users\savanahb\Downloads\&z"
dbms=xlsx replace;
getnames=yes;
run;

%macro combine(u,w,y);
data datafile123;
merge &u &w &y;
by id;

data datafile123
match merge &u &y;
by ID;

data datafile123
merge &w(in=a) &y;
by ID;
if a;

data datafile123
merge &u(in=a) &w(in=b) &y(in=c);
by ID;
if a and b and c;
run;
%mend combine;

proc print data=datafile123;
run;

 

Any suggestions?

Thank you!


Accepted Solutions
Solution
‎11-08-2017 09:49 AM
Respected Advisor
Posts: 4,665

Re: Merging the data sets using macro code

[ Edited ]

@savanahb

Many forum members don't post solutions for homework but just provide direction. Given that you're demonstrating actual own work and effort I feel posting below (not tested) code won't just be the lazy path for you but will give you some direction and idea how you could approach this.

%macro combine();

  %do i=1 %to 3;
    proc import out=datafile&i
      datafile= "C:\Users\savanahb\Downloads\datafile&i.-2.xlsx"
      DBMS= EXCEL2000 REPLACE;
      getnames=yes;
    run;

    proc sort data=datafile&i;
      by id;
    run;
  %end;

  data want;
    merge 
      datafile1 (in=in1)
      datafile2 (in=in2)
      datafile3 (in=in3)
      ;
    by id;
    if in1 and in2 and in3;
  run;

%mend;

%combine();

 

View solution in original post


All Replies
Super User
Posts: 6,622

Re: Merging the data sets using macro code

Just a few guidelines to get you started ...

 

First, check the assignment to make sure you are looking at the proper program.  It seems highly unusual that a program should create 5 different versions of datafile123, each one replacing the previous version.

 

Second, you did well to switch from single quotes to double quotes.  Macro variables would not get resolved within single quotes.

 

Third, you need to know how to define a macro (beginning with a %macro statement and ending with a %mend statement).  Assigning values to a few macro variables is not what this assignment is about.  Inside your macro, you will end up with a loop along the lines of:

 

%do i=1 %to 3;

   

%end;

 

Each iteration through the loop will run one of the PROC IMPORTs.

 

The code after the three PROC IMPORTs may or may not need to be part of the macro.  As it stands now, there is no reason to use macro language.  But remember, I'm not convinced that the code presented is the proper code for this assignment.  That needs to be checked.

Occasional Contributor
Posts: 14

Re: Merging the data sets using macro code

Posted in reply to Astounding

@Astounding,

Thank you for you feedback. The code I initially posted was what I submitted for the last homework and received full credit. Basically, what the professor wanted us to do was to import the 3 excel files and then combine them to produce a table that only showed the variables with ID's that were found in all 3 files. The way he gave us directions was step by step combining each data file, hence the multiple datafile123. I think I understand what you are saying regarding the

%do i=1 %to 3;

I was able to get the program to run using this code:

 

%macro combine;
data datafile123;
merge
%do i = 1 %to 3;
datafile&i
%end;
;
by ID;
run;
%mend;

%combine;

 

But it again did not merge the data so that it only showed the overlapping variables if that makes sense. That is why I had separated them out to different statements. When I tried to use a similar formula for the import, I kept getting stuck on where the datafile was coming from.

 

 

 

Super User
Posts: 23,224

Re: Merging the data sets using macro code

I could see doing it that way for learning purposes. In a way it shows a student how to iteratively build the code they need. 

 

Now the assignment is to convert it to a macro. I would highly suspect some of those intermediary steps are no longer necessary and possibly why your assignment received full marks - your prof only checked the final results wanted. 

 

Im not sure what the question is here though? @savanahb what exactly do you need help with?

 

Solution
‎11-08-2017 09:49 AM
Respected Advisor
Posts: 4,665

Re: Merging the data sets using macro code

[ Edited ]

@savanahb

Many forum members don't post solutions for homework but just provide direction. Given that you're demonstrating actual own work and effort I feel posting below (not tested) code won't just be the lazy path for you but will give you some direction and idea how you could approach this.

%macro combine();

  %do i=1 %to 3;
    proc import out=datafile&i
      datafile= "C:\Users\savanahb\Downloads\datafile&i.-2.xlsx"
      DBMS= EXCEL2000 REPLACE;
      getnames=yes;
    run;

    proc sort data=datafile&i;
      by id;
    run;
  %end;

  data want;
    merge 
      datafile1 (in=in1)
      datafile2 (in=in2)
      datafile3 (in=in3)
      ;
    by id;
    if in1 and in2 and in3;
  run;

%mend;

%combine();

 

Occasional Contributor
Posts: 14

Re: Merging the data sets using macro code

@Reeza
Basically what I am looking for is guidance on how to get the same result but using macros instead and if I’m even on the right page here starting out.
and @Patrick
Thank you! I totally get it and I want to actually learn this because I’ll be using it down the road in other classes. Normally what I am able to do is find various instructions online and in the book and the notes and extrapolate a way to do it, but I was having a lot of trouble doing this with macros. I am not a computer person at all so simplifying the codes is not my forte. I think I understand now from your post what I am missing. It seems to be a combination of sorts of the more recent code I posted and the original.
Super User
Posts: 23,224

Re: Merging the data sets using macro code

@savanahb I would expect the code to be able to handle multiple data sets, so that IN condition would also need to be dynamic. At least that's my interpretation of the question.

Respected Advisor
Posts: 4,665

Re: Merging the data sets using macro code

@savanahb

The posted code was for guidance. When using SAS macros it's always a judgment call how far you want/need to take things and what needs to be made dynamic - and the further you take it the harder the resulting code will be to "read".

 

As @Reeza posted, you could also "macrotize" below bit - the list of table name in the merge statement and the IF in1... statement.

  data want;
    merge 
      datafile1 (in=in1)
      datafile2 (in=in2)
      datafile3 (in=in3)
      ;
    by id;
    if in1 and in2 and in3;
  run;

You could take it even further and implement an input parameters for the macro where you pass in root path and then a list of source files and then have the macro generate SAS code which can deal with a varying number of source files.

How far you need to take this depends on your assignment as well as how much time you can/want to spend on this.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 728 views
  • 3 likes
  • 4 in conversation