I am trying to get all combinations ( Cartesian product) of two datasets in sas using a unique ID.
/* Create Data A */
data a;
input id gl1;
datalines;
1 100
1 300
2 400
3 100
3 400
;
run;
/* Create Data B */
data b;
input id gl2;
datalines;
1 300
1 400
2 100
2 200
;
run;
I want the output to look like this:
id gl1 gl2
1 100 300
1 100 400
1 300 300
1 300 400
2 400 100
2 400 200
3 100 .
4 . 400
Can anyone please help?
This code below only allows ids that are both in dataset a and b,
proc sql;
create table c as
select *
from a, b
where a.ID=b.ID or ;
quit;
Hi @nehachawla, if you use a full join like
select coalesce(a.id, b.id) gl1 gl2
from a full join b
on a.ID = b.ID;
quit;
you should also get the the IDs which are only in one of the data sets.
Hi @nehachawla, if you use a full join like
select coalesce(a.id, b.id) gl1 gl2
from a full join b
on a.ID = b.ID;
quit;
you should also get the the IDs which are only in one of the data sets.
Thank you for your prompt response, I have never used this coalese function and I get this error. can you assist?
1039 proc sql;
1040 create table c as
1041
1042 select coalesce(a.id, b.id) gl1 gl2
---
22
202
ERROR 22-322: Syntax error, expecting one of the following: a quoted string, !, !!, &, *, **,
+, ',', -, /, <, <=, <>, =, >, >=, ?, AND, AS, BETWEEN, CONTAINS, EQ, EQT,
FORMAT, FROM, GE, GET, GT, GTT, IN, INFORMAT, INTO, IS, LABEL, LE, LEN, LENGTH,
LET, LIKE, LT, LTT, NE, NET, NOT, NOTIN, OR, TRANSCODE, ^, ^=, |, ||, ~, ~=.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
1043 from a full join b
1044 on a.ID = b.ID;
1045 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
Hi @nehachawla,
I think I was to quick.
please try
select coalesce(a.id, b.id) as id format = 10, gl1, gl2
instead of
select coalesce(a.id, b.id) gl1 gl2
Mike using this , I still get this error- see attached.
Hi @nehachawla,
could you please post the error into the chat, as I can't open attachments.
157 proc sql;
1158 create table c as
1159 select coalesce(a.id, b.id) as id format = 10, gl1, gl2
--
22
76
ERROR 22-322: Expecting a format name.
ERROR 76-322: Syntax error, statement will be ignored.
1160 from a full join b
1161 on a.ID = b.ID;
1162 quit;
Hi @nehachawla,
I missed the . in the format:
select coalesce(a.id, b.id) as id format = 10. gl1, gl2
You have to list the variables you want in the SELECT clause separated by COMMAS (welcome to working with SQL syntax 🙂 ). You also need to give this new variable a NAME.
proc sql;
create table c as
select coalesce(a.id, b.id) as id
, a.gl1
, b.gl2
from a full join b
on a.ID = b.ID
;
quit;
Result
Obs id gl1 gl2 1 1 100 300 2 1 100 400 3 1 300 300 4 1 300 400 5 2 400 100 6 2 400 200 7 3 400 . 8 3 100 .
Note if you know that ID is the only variable in common between the two datasets you can just a NATURAL join. SAS will then generate the COALESCE() and ON for you so the syntax is a lot simpler.
proc sql;
create table c as
select *
from b natural full join a
;
quit;
Thanks @Tom
Thanks Markus for the lead
Thank you Tom- This worked like a charm
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.