rdum96
Calcite | Level 5

Hi,

I'm merging three sources, trying to stack the unique IDs and flag which table(s) each ID was found in. However, I end up with duplicate IDs, with the flags on separate rows.

Input:

Table A    Table B    Table C
ID         ID         ID
1          2          2
2          4          3
3          9          8
4

 

Expected output:

ID    in_A    in_B    in_C
1     1       0       0
2     1       1       1
3     1       0       1
4     1       1       0
8     0       0       1
9     0       1       0

 

Instead I am getting the following output:

ID    in_A    in_B    in_C
1     1       0       0
2     1       0       0
2     0       1       0
2     0       0       1
3     1       0       0
3     0       0       1
4     1       0       0
4     0       1       0
8     0       0       1
9     0       1       0

 

 

proc sql;
   create table all_ids as select
   distinct coalesce(a.ID,b.ID,c.ID) as all_ID, 
   (case when calculated all_ID=a.ID then 1 else 0 end) as in_A,
   (case when calculated all_ID=b.ID then 1 else 0 end) as in_B,
   (case when calculated all_ID=c.ID then 1 else 0 end) as in_C
   from table_A as a full join table_B as b on a.ID=b.ID full join table_C as c on a.ID=c.ID;
quit;
rdum96
Calcite | Level 5
Edit: I need to do this through proc sql and not data steps!
ballardw
Super User

It is extremely likely that you are seeing the result of a previous run, because when I run your code the log shows this:

91   proc sql;
92      create table all_ids as select
93      distinct coalesce(a.ID,b.ID,c.ID) as all_ID,
94      (case when calculated all_ID=a.ID then 1 else 0 end) as in_A,
95      (case when calculated all_ID=b.ID then 1 else 0 end) as in_B,
96      (case when calculated all_ID=c.ID then 1 else 0 end) as in_C
97      from table_A as a full join table B as b on a.ID=b.ID full join table as c on a.ID=c.ID;
                                            --
                                            73
                                            201
ERROR 73-322: Expecting an ON.

ERROR 201-322: The option is not recognized and will be ignored.

98   quit;

That means that if you have a data set named All_ids, it was created in a previous step, because the SQL you show has an error.

 

Why is there a requirement to use SQL? Does the "requirement" require a single SQL select? Other rules not stated?

 

Note that DISTINCT applies to ALL values in the select clause, so every distinct combination of the ID and the 1/0 flag values will appear as its own row.
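
If the goal is one row per ID no matter how the rows were combined, one option is to collapse the flags with GROUP BY and MAX() instead of relying on DISTINCT. The following is only a sketch, assuming the three data sets are named table_A, table_B and table_C and each has a numeric ID column as described in the question. It stacks the IDs with UNION ALL, tags each row with the table it came from, then keeps the maximum of each flag per ID.

```sas
proc sql;
   create table all_ids as
   select ID,
          max(in_A) as in_A,   /* 1 if the ID appeared in table_A at least once */
          max(in_B) as in_B,
          max(in_C) as in_C
   from (
      /* stack the three sources, tagging each row with its origin */
      select ID, 1 as in_A, 0 as in_B, 0 as in_C from table_A
      union all
      select ID, 0 as in_A, 1 as in_B, 0 as in_C from table_B
      union all
      select ID, 0 as in_A, 0 as in_B, 1 as in_C from table_C
   )
   group by ID
   order by ID;
quit;
```

Because the flags are aggregated per ID, repeated IDs inside any one table do not produce extra rows either.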

rdum96
Calcite | Level 5

Ugh, I didn't name my tables correctly in the first post. I've edited it now: table B is table_B and table is table_C. I'm not getting those errors in my log.

ballardw
Super User

Still think there may be something on your end.

I run this, where I have taken the time to actually provide data sets:

data table_A ;
  input ID;
datalines;
1                
2                
3                
4
;

data table_B ;  
  input  ID   ;
datalines;
2   
4   
9  
; 

data table_C;
  input ID;
datalines;
2
3
8
;

proc sql;
   create table all_ids as select
   distinct coalesce(a.ID,b.ID,c.ID) as all_ID, 
   (case when calculated all_ID=a.ID then 1 else 0 end) as in_A,
   (case when calculated all_ID=b.ID then 1 else 0 end) as in_B,
   (case when calculated all_ID=c.ID then 1 else 0 end) as in_C
   from table_A as a full join table_B as b on a.ID=b.ID full join table_c as c on a.ID=c.ID;
quit;

And get this result:

Obs    all_ID    in_A    in_B    in_C

 1        1        1       0       0
 2        2        1       1       1
 3        3        1       0       1
 4        4        1       1       0
 5        8        0       0       1
 6        9        0       1       0

If you aren't, then one (or possibly more) of your data sets is not as you present it.
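
One case the sample data does not exercise: the second join matches table_C only against a.ID, so an ID that is present in table_B and table_C but not in table_A would come out on two rows (one flagged in_B, one flagged in_C). A possible variant, sketched here under the assumption that IDs are unique within each table, joins table_C on the coalesced key instead and tests each source ID for missing values directly.

```sas
proc sql;
   create table all_ids as
   select coalesce(a.ID, b.ID, c.ID) as all_ID,
          (case when a.ID is not null then 1 else 0 end) as in_A,
          (case when b.ID is not null then 1 else 0 end) as in_B,
          (case when c.ID is not null then 1 else 0 end) as in_C
   from table_A as a
        full join table_B as b
           on a.ID = b.ID
        /* match C against whichever of A or B supplied the ID */
        full join table_C as c
           on coalesce(a.ID, b.ID) = c.ID
   order by all_ID;
quit;
```

With unique IDs in each source, every ID lands on exactly one row, so DISTINCT is not needed in this version.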
