I am trying to get all combinations ( Cartesian product) of two datasets in sas using a unique ID.
/* Create Data A */
data a;
input id gl1;
datalines;
1 100
1 300
2 400
3 100
3 400
;
run;
/* Create Data B */
data b;
input id gl2;
datalines;
1 300
1 400
2 100
2 200
;
run;
I want the output to look like this:
id gl1 gl2
1 100 300
1 100 400
1 300 300
1 300 400
2 400 100
2 400 200
3 100 .
4 . 400
Can anyone please help?
This code below only allows ids that are both in dataset a and b,
proc sql;
create table c as
select *
from a, b
where a.ID=b.ID or ;
quit;
Hi @nehachawla, if you use a full join like
select coalesce(a.id, b.id) gl1 gl2
from a full join b
on a.ID = b.ID;
quit;
you should also get the the IDs which are only in one of the data sets.
Hi @nehachawla, if you use a full join like
select coalesce(a.id, b.id) gl1 gl2
from a full join b
on a.ID = b.ID;
quit;
you should also get the the IDs which are only in one of the data sets.
Thank you for your prompt response, I have never used this coalese function and I get this error. can you assist?
1039 proc sql;
1040 create table c as
1041
1042 select coalesce(a.id, b.id) gl1 gl2
---
22
202
ERROR 22-322: Syntax error, expecting one of the following: a quoted string, !, !!, &, *, **,
+, ',', -, /, <, <=, <>, =, >, >=, ?, AND, AS, BETWEEN, CONTAINS, EQ, EQT,
FORMAT, FROM, GE, GET, GT, GTT, IN, INFORMAT, INTO, IS, LABEL, LE, LEN, LENGTH,
LET, LIKE, LT, LTT, NE, NET, NOT, NOTIN, OR, TRANSCODE, ^, ^=, |, ||, ~, ~=.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
1043 from a full join b
1044 on a.ID = b.ID;
1045 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
Hi @nehachawla,
I think I was to quick.
please try
select coalesce(a.id, b.id) as id format = 10, gl1, gl2
instead of
select coalesce(a.id, b.id) gl1 gl2
Mike using this , I still get this error- see attached.
Hi @nehachawla,
could you please post the error into the chat, as I can't open attachments.
157 proc sql;
1158 create table c as
1159 select coalesce(a.id, b.id) as id format = 10, gl1, gl2
--
22
76
ERROR 22-322: Expecting a format name.
ERROR 76-322: Syntax error, statement will be ignored.
1160 from a full join b
1161 on a.ID = b.ID;
1162 quit;
Hi @nehachawla,
I missed the . in the format:
select coalesce(a.id, b.id) as id format = 10. gl1, gl2
You have to list the variables you want in the SELECT clause separated by COMMAS (welcome to working with SQL syntax 🙂 ). You also need to give this new variable a NAME.
proc sql;
create table c as
select coalesce(a.id, b.id) as id
, a.gl1
, b.gl2
from a full join b
on a.ID = b.ID
;
quit;
Result
Obs id gl1 gl2 1 1 100 300 2 1 100 400 3 1 300 300 4 1 300 400 5 2 400 100 6 2 400 200 7 3 400 . 8 3 100 .
Note if you know that ID is the only variable in common between the two datasets you can just a NATURAL join. SAS will then generate the COALESCE() and ON for you so the syntax is a lot simpler.
proc sql;
create table c as
select *
from b natural full join a
;
quit;
Thanks @Tom
Thanks Markus for the lead
Thank you Tom- This worked like a charm
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.