Solved: Re: proc sql how to exclude variables from being selected

Justin9 · Posted 08-12-2020 01:01 PM

Hi,

I am creating a table using proc sql and was wondering if someone could help me to select all of the variables from dataset B in a left join, except for two variables: "RECORD_NUMBER" (I would need to keep RECORD_NUMBER as I will left join on a.RECORD_NUMBER=b.RECORD_NUMBER, but only dataset A's RECORD_NUMBER would be selected to be shown in the table) and "MONTH". The preferred option would be to just exclude these two variables from dataset B, as there are a fair amount of variables that would need to be listed.

proc sql;
   create table Final_Information as
   select A.*
         ,B.* /*How to keep all of B's variables, except "B.RECORD_NUMBER" and "B.MONTH"?*/
   from AC.TEST as a
   left join INITIAL_DATA as b
         on a.RECORD_NUMBER=b.RECORD_NUMBER
   ;
quit;

Tom · Posted 08-12-2020 01:22 PM

No need to drop b.RECORD_NUMBER because you already have a.RECORD_NUMBER and you cannot have two variables with the same name. To drop MONTH use DROP= dataset option.

INITIAL_DATA(drop=month) as b

If you want to eliminate messages about duplicate variable names then rename RECORD_NUMBER and then add DROP= dataset option to the output dataset.

create table Final_Information(drop=b_record_number) as
   select A.*
         ,B.*    
   from AC.TEST as a
   left join INITIAL_DATA (drop=month rename=(record_number=b_record_number)) as b
         on a.RECORD_NUMBER=b.b_RECORD_NUMBER
;

View solution in original post

ballardw · Posted 08-12-2020 01:16 PM

Formal syntax would be instead of ,b.* to list the variables

,b.variable1, b.variable2, b.variable3 <etc>

The quick and dirty easy data set option Drop on B won't work because one of the variables you want to drop is used in an expression. Do you also have a MONTH variable in the A data set?

Justin9 · Posted 08-12-2020 01:23 PM

Thanks for your reply! I would have typed out the usual b.[variable], b.[variable] etc. but there are a lot of variables in dataset B that I want to join on, so was wondering if there was any code to not select only the "RECORD_NUMBER" and "MONTH" (without dropping it from the INITIAL_DATA dataset). When I write A.*, there is the "MONTH" variable in dataset A already (which is the only one I want to show in the final table) and a warning appears on my log saying that it's already in the table, so that's why the preference would be to simply not select the two variables ("RECORD_NUMBER" and "MONTH"), but not have to list out every variable I want in dataset B (as there are quite a lot).

Reeza · Posted 08-12-2020 01:18 PM

proc sql;
   create table Final_Information as
   select A.*
         ,B.* /*How to keep all of B's variables, except "B.RECORD_NUMBER" and "B.MONTH"?*/
   from AC.TEST as a
   left join INITIAL_DATA (drop=month) as b
         on a.RECORD_NUMBER=b.RECORD_NUMBER
   ;
quit;

I think this would work but may still give you a note in the log.

Tom · Posted 08-12-2020 01:22 PM

No need to drop b.RECORD_NUMBER because you already have a.RECORD_NUMBER and you cannot have two variables with the same name. To drop MONTH use DROP= dataset option.

INITIAL_DATA(drop=month) as b

If you want to eliminate messages about duplicate variable names then rename RECORD_NUMBER and then add DROP= dataset option to the output dataset.

create table Final_Information(drop=b_record_number) as
   select A.*
         ,B.*    
   from AC.TEST as a
   left join INITIAL_DATA (drop=month rename=(record_number=b_record_number)) as b
         on a.RECORD_NUMBER=b.b_RECORD_NUMBER
;

proc sql how to exclude variables from being selected

Re: proc sql how to exclude variables from being selected

Re: proc sql how to exclude variables from being selected

Re: proc sql how to exclude variables from being selected

Re: proc sql how to exclude variables from being selected

Re: proc sql how to exclude variables from being selected

Click image to register for webinar

Classroom Training Available!