I am trying to merge together two SAS datasets. I have a simple example below. When I merge the two datasets together by the ZIP I am expecting 14 unique records....SAS is only returning 7. I have included the datasets, the actual output, the output I was expecting and the code to merge the datasets together. Can anyone tell me what I'm doing wrong?
FOURHOUR DATASET
Obs ZIP BOSHIP
1 40219 02TK
2 40219 02TJ
3 40219 02TM
4 40219 02TV
5 40219 02TU
6 40219 02TR
7 40219 02TE
ORDER DATASET
ORDER_
Obs ZIP NUMBER
1 40219 B401632K
2 40219 R765534K
MERGE STATEMENT:
PROC SORT DATA=FOURHOUR;
BY ZIP;
RUN;
PROC SORT DATA=ORDER;
BY ZIP;
RUN;
DATA FOURHOUR;
MERGE ORDER(IN=A) FOURHOUR(IN=B);
BY ZIP;
IF A AND B;
OUTPUT:
ORDER_
Obs ZIP NUMBER BOSHIP
1 40219 B401632K 02TK
2 40219 R765534K 02TJ
3 40219 R765534K 02TM
4 40219 R765534K 02TV
5 40219 R765534K 02TU
6 40219 R765534K 02TR
7 40219 R765534K 02TE
OUTPUT I EXPECTED (AND NEED)
ORDER_
Obs ZIP NUMBER BOSHIP
1 40219 B401632K 02TK
2 40219 B401632K 02TJ
3 40219 B401632K 02TM
4 40219 B401632K 02TV
5 40219 B401632K 02TU
6 40219 B401632K 02TR
7 40219 B401632K 02TE
8 40219 R765534K 02TK
9 40219 R765534K 02TJ
10 40219 R765534K 02TM
11 40219 R765534K 02TV
12 40219 R765534K 02TU
13 40219 R765534K 02TR
14 40219 R765534K 02TE
SAS's data step merge when there are duplicate keys in both datasets are decidedly unconventional, to say the least. It is not the SQL equivalent of a cartesian product, which is what you are looking for. Try something like below to get what you want:
proc sql;
create table c as (
select * from a,b
where a.key = b.key
);
quit;
In your case, it's "select * from fourhour, order where fourhour.zip = order.zip".
I think Proc Sql maybe easy to do it.
Try this code.
data a;
input
n zip boship $ ;
datalines
;
1 40219 02TK
2 40219 02TJ
3 40219 02TM
4 40219 02TV
5 40219 02TU
6 40219 02TR
7 40219 02TE
;
data b;
input
n zip order_num $ ;
datalines;
1 40219 B401632K
2 40219 R765534K
;
run;
porc sql;
select a.*, order_num
from a as a, b;
;
Quit;
SAS's data step merge when there are duplicate keys in both datasets are decidedly unconventional, to say the least. It is not the SQL equivalent of a cartesian product, which is what you are looking for. Try something like below to get what you want:
proc sql;
create table c as (
select * from a,b
where a.key = b.key
);
quit;
In your case, it's "select * from fourhour, order where fourhour.zip = order.zip".
Thanks to both of the above posts regarding the Proc SQL suggestion!. The following worked.
PROC Sql;
CREATE TABLE ALLMERGE AS
SELECT * FROM
ORDER A,FOURHOUR B
WHERE A.ZIP=B.ZIP;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.