BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Sandeep77
Lapis Lazuli | Level 10

Hi Experts,

I have imported a file with account number and name. Now I want to add remaining information from an existing dataset. So I have created a libname where the dataset is stored but I am not sure how to join both the data.

I am doing as the code mentioned but it is only giving information about file 1 which just have the account number and name. Can you please suggest what changes do I need to make in my code. Thank you.

libname SS '\\local\shares\SASWork\SASUSERS\SSD';

Proc sql;
create table Adding_info as
select a.*
from work.file_1 as a
left join SSD.Main_Data as b on a.accountnumber=b.accountnumber;
quit;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You asked for ALL of the variables from FILE_1 (turns out there were only two variables).

You also asked to include ACCOUNTNUMBER from SSD.MAIN_DATA, but since that name was already used (obviously since otherwise your ON expression would be invalid) it is ignored.

 

What did you want it to do instead.

 

If you want all of the variables from SSD.MAIN_DATA and just the other variable from FILE_1 then specify that.  So if the second variable is named VAR2 then the change the variable list part to:

b.*,a.var2

View solution in original post

10 REPLIES 10
ballardw
Super User

The "Select a.*" means "write all the variables for the data set with the alias A". So your Select clause did not include any of the variables from the set with the alias of B. You would change your select to something like

 

select a.*, b.thisvar, b.thatvar, b.othervarname

to explicitly list the variables from B that you want. Since you did not provide any details obviously the names above are fictitious and only serve as a pattern to follow.

Sandeep77
Lapis Lazuli | Level 10

Thank you for the information. I did changed that and it gives me the same name and account number information. I feel that I am also doing something wrong with the inner join SS.Main_data as it is not joining. Can you please check the inner join statement and if possible let me know why it is not joining with imported file data.

Reeza
Super User
Your code says left join not inner join.

Sandeep77
Lapis Lazuli | Level 10
Actually, I tried with left join and inner join both to see if that works but none worked.
Reeza
Super User
Define not worked.
Reeza
Super User
Your code selects nothing from the b table for starters.
Sandeep77
Lapis Lazuli | Level 10
I did added account number from b table. e.g
Proc sql;
create table Adding_info as
select a.*,
b.accountnumber
from work.file_1 as a
left join SSD.Main_Data as b on a.accountnumber=b.accountnumber;
quit;
but it's giving me just the name and account number information and nothing from the main data.
Tom
Super User Tom
Super User

@Sandeep77 wrote:
Actually, I tried with left join and inner join both to see if that works but none worked.

Define it more detail what "none worked" means.

Did the code not run?  Show the log and the notes/error messages.

Did you just not find any matches when joining on (a.accountnumber=b.accountnumber)?  Then show example account numbers from both datasets that you think are the same but SAS does not.  

Is accountnumber a character variable (normally account numbers should be strings since you do not do arithmetic on them) or a numeric variable?

If character then make sure the letters are using the same case.  And there are no leading spaces.  Are there other invisible characters?  Like carriage return or line feed or tab or non-breaking space?  em-dash or en-dash instead of hyphen?

If numeric make sure the precision is the same.  You can only store about 15 decimal digits as a floating point number so any really large numbers need to be stored as strings and not numbers.

Sandeep77
Lapis Lazuli | Level 10

Yes, the code runs successfully but it shows just the name and account number. I want to add all the other data which is in the Main_data dataset. Both the account numbers are characters and I have used same case for both of them. Please check the log.

27         ;
NOTE: Writing HTML5(EGHTML) Body file: EGHTML
28         
29         proc sql;
30         create table Adding_info as
31         select
32         a.*,
33         b.accountnumber
34         
35         from work.file_1 as a
36         left join SSD.Main_Data as b on a.accountnumber=b.accountnumber;
WARNING: Variable accountnumber already exists on file WORK.File_1.
NOTE: Compressing data set WORK.File_1 increased size by 100.00 percent. 
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: Table WORK.File_1 created, with 1078 rows and 2 columns.

37         quit;
NOTE: PROCEDURE SQL used (Total process time):
Tom
Super User Tom
Super User

You asked for ALL of the variables from FILE_1 (turns out there were only two variables).

You also asked to include ACCOUNTNUMBER from SSD.MAIN_DATA, but since that name was already used (obviously since otherwise your ON expression would be invalid) it is ignored.

 

What did you want it to do instead.

 

If you want all of the variables from SSD.MAIN_DATA and just the other variable from FILE_1 then specify that.  So if the second variable is named VAR2 then the change the variable list part to:

b.*,a.var2

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 1511 views
  • 1 like
  • 4 in conversation