DATA Step, Macro, Functions and more

How to use Proc SQL to merge data

Reply
Regular Learner
Posts: 1

How to use Proc SQL to merge data

[ Edited ]

Hi All,

 

I would like use "proc sql" in SAS to merge two Excel documents data

 

First, I import data from Excel to SAS

Two names in SAS are Ipo & Calendar.

 

Then, I hope to merge according to

Ipo - Date I & CalendarDate L  

(Document Name - Variable)

 

I code as below:

But it doesn't work

 

proc sql;
create table merge1
as select * from Ipo, Calendar
where Ipo. Date I= Calendar. Date L
order by Date I;
quit;

 

 

Besides, In Ipo document's variable, there are two or more same Date-I variables,

I hope to merge and keep all of it in "merge1".

 

For example:

 

In Ipo document:

 

Date I         xx,  xx,         xxxxxxx

2005/3/22,  4,  -6.54%,  .................................

2005/3/22,  5,  -23.71%, ................................

2005/3/22,  6,  -5.71%, ..............................

 

 

In Calendar document:

 

Date L       xx, xx, xxxxxx 

2005/3/22, 1, 1, 1, ....................

 

 

Hope to In merge1,

 

Date I           xx, xx, xxxxxx                       xx, xx, xxxxxx

2005/3/22,  4,  -6.54%, ..........................   1, 1, 1, ....................

2005/3/22,  5,  -23.71%, .......................... 1, 1, 1, ....................

2005/3/22,  6,  -5.71%, ............................ 1, 1, 1, ....................

 

 

How do I code?

 

Please see the attachments as the reference (Excel Raw Data)

 

 

Many Thanks!!

Super User
Super User
Posts: 7,955

Re: How to use Proc SQL to merge data

Hi,

 

Well, several things.  Firstly lets start with Excel, it is not a great bit of software to use for getting data into a strucutured environment.  As this isn't the basis of the question however, I would suggest you post test data, in the form of a datastep, which we can copy and run to get the data ready for merge, I don't have time to mess about trying to get some dodgy Excel file imported.  

 

in terms of your coding, the code you provided wont work straight off as you have spaces in odd places.  Also I would suggest you read up on how to use SQL, there are many free sites for it out there, W£ schools for instance.  Now the code you posted:

proc sql;
  create table MERGE1 as 
select IPO.DATE
,IPO.
VAR1
,IPO.VAR1
,CALENDAR.VAR1
,CALENDAR.VAR2
from IPO IPO
,CALENDAR CALENDAR where IPO.DATE=CALENDAR.DATE order by IPO.DATE; quit;

The above code wont work as I don't have your data.  It is here to explain some fundamentals.  Firstly you need to provide the from datasets, and to use them individually you would assign an alias, so the IPO dataset is given an alias IPO.   To select individual variables you probide the alias, and the variable name.  Its also a good idea to check what type of merge you want, there are various different ones, an example of which is:

Capture.PNG

Frequent Contributor
Posts: 130

Re: How to use Proc SQL to merge data

As RW9 previously mentioned, it is important to understand the different types of joins with the SQL procedure.  From what I understand with your examples (and you may want to post some example data from both data sets to help us understand), but I believe the IPO data set looks to have duplicate values based on the variable Date I, and the Calendar data set seems to have one record per Date L variable.  If this is the case and you want to keep all of your records from the IPO data set, I would suggest using IPO as the left data set and doing a left join, as this will keep every record from the IPO data set and bring in values from the Calendar data set where the two dates are matching.  The venn diagrams RW9 provided below are a great visual of how the different joins work.

 

The other thing I noticed were the names of your date variables.  If they have spaces in the names from a import procedure, you will need to inclose them between double quotes and the character n at the end.  For example Date I would be "Date I"n, otherwise you will get syntax errors.  You also need to specify the alias for each data set in your FROM statement as you are using an alias in front of the date variables, but never declared them.  Below would provide you every variable from both data sets (you'll get a warning message if you have two or more of the same variable names in each data set, and it will drop one of them); as well as, keep every record from your IPO data set and provide values for the Calendar variables where the Date L variable matches the Date I variable.

 

Hope this helps!

 

proc sql;
create table merge1 as
select *
from Ipo as Ipo left join Calendar as Calendar
where Ipo."Date I"n=Calendar."Date L"n
order by Ipo."Date I"n;
quit;
Ask a Question
Discussion stats
  • 2 replies
  • 754 views
  • 0 likes
  • 3 in conversation