- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I am creating a table using proc sql and was wondering if someone could help me to select all of the variables from dataset B in a left join, except for two variables: "RECORD_NUMBER" (I would need to keep RECORD_NUMBER as I will left join on a.RECORD_NUMBER=b.RECORD_NUMBER, but only dataset A's RECORD_NUMBER would be selected to be shown in the table) and "MONTH". The preferred option would be to just exclude these two variables from dataset B, as there are a fair amount of variables that would need to be listed.
proc sql;
create table Final_Information as
select A.*
,B.* /*How to keep all of B's variables, except "B.RECORD_NUMBER" and "B.MONTH"?*/
from AC.TEST as a
left join INITIAL_DATA as b
on a.RECORD_NUMBER=b.RECORD_NUMBER
;
quit;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
No need to drop b.RECORD_NUMBER because you already have a.RECORD_NUMBER and you cannot have two variables with the same name. To drop MONTH use DROP= dataset option.
INITIAL_DATA(drop=month) as b
If you want to eliminate messages about duplicate variable names then rename RECORD_NUMBER and then add DROP= dataset option to the output dataset.
create table Final_Information(drop=b_record_number) as
select A.*
,B.*
from AC.TEST as a
left join INITIAL_DATA (drop=month rename=(record_number=b_record_number)) as b
on a.RECORD_NUMBER=b.b_RECORD_NUMBER
;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Formal syntax would be instead of ,b.* to list the variables
,b.variable1, b.variable2, b.variable3 <etc>
The quick and dirty easy data set option Drop on B won't work because one of the variables you want to drop is used in an expression. Do you also have a MONTH variable in the A data set?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for your reply! I would have typed out the usual b.[variable], b.[variable] etc. but there are a lot of variables in dataset B that I want to join on, so was wondering if there was any code to not select only the "RECORD_NUMBER" and "MONTH" (without dropping it from the INITIAL_DATA dataset). When I write A.*, there is the "MONTH" variable in dataset A already (which is the only one I want to show in the final table) and a warning appears on my log saying that it's already in the table, so that's why the preference would be to simply not select the two variables ("RECORD_NUMBER" and "MONTH"), but not have to list out every variable I want in dataset B (as there are quite a lot).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
proc sql;
create table Final_Information as
select A.*
,B.* /*How to keep all of B's variables, except "B.RECORD_NUMBER" and "B.MONTH"?*/
from AC.TEST as a
left join INITIAL_DATA (drop=month) as b
on a.RECORD_NUMBER=b.RECORD_NUMBER
;
quit;
I think this would work but may still give you a note in the log.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
No need to drop b.RECORD_NUMBER because you already have a.RECORD_NUMBER and you cannot have two variables with the same name. To drop MONTH use DROP= dataset option.
INITIAL_DATA(drop=month) as b
If you want to eliminate messages about duplicate variable names then rename RECORD_NUMBER and then add DROP= dataset option to the output dataset.
create table Final_Information(drop=b_record_number) as
select A.*
,B.*
from AC.TEST as a
left join INITIAL_DATA (drop=month rename=(record_number=b_record_number)) as b
on a.RECORD_NUMBER=b.b_RECORD_NUMBER
;