Hi Experts,
I have imported a file with account number and name. Now I want to add remaining information from an existing dataset. So I have created a libname where the dataset is stored but I am not sure how to join both the data.
I am doing as the code mentioned but it is only giving information about file 1 which just have the account number and name. Can you please suggest what changes do I need to make in my code. Thank you.
libname SS '\\local\shares\SASWork\SASUSERS\SSD';
Proc sql;
create table Adding_info as
select a.*
from work.file_1 as a
left join SSD.Main_Data as b on a.accountnumber=b.accountnumber;
quit;
You asked for ALL of the variables from FILE_1 (turns out there were only two variables).
You also asked to include ACCOUNTNUMBER from SSD.MAIN_DATA, but since that name was already used (obviously since otherwise your ON expression would be invalid) it is ignored.
What did you want it to do instead.
If you want all of the variables from SSD.MAIN_DATA and just the other variable from FILE_1 then specify that. So if the second variable is named VAR2 then the change the variable list part to:
b.*,a.var2
The "Select a.*" means "write all the variables for the data set with the alias A". So your Select clause did not include any of the variables from the set with the alias of B. You would change your select to something like
select a.*, b.thisvar, b.thatvar, b.othervarname
to explicitly list the variables from B that you want. Since you did not provide any details obviously the names above are fictitious and only serve as a pattern to follow.
Thank you for the information. I did changed that and it gives me the same name and account number information. I feel that I am also doing something wrong with the inner join SS.Main_data as it is not joining. Can you please check the inner join statement and if possible let me know why it is not joining with imported file data.
@Sandeep77 wrote:
Actually, I tried with left join and inner join both to see if that works but none worked.
Define it more detail what "none worked" means.
Did the code not run? Show the log and the notes/error messages.
Did you just not find any matches when joining on (a.accountnumber=b.accountnumber)? Then show example account numbers from both datasets that you think are the same but SAS does not.
Is accountnumber a character variable (normally account numbers should be strings since you do not do arithmetic on them) or a numeric variable?
If character then make sure the letters are using the same case. And there are no leading spaces. Are there other invisible characters? Like carriage return or line feed or tab or non-breaking space? em-dash or en-dash instead of hyphen?
If numeric make sure the precision is the same. You can only store about 15 decimal digits as a floating point number so any really large numbers need to be stored as strings and not numbers.
Yes, the code runs successfully but it shows just the name and account number. I want to add all the other data which is in the Main_data dataset. Both the account numbers are characters and I have used same case for both of them. Please check the log.
27 ;
NOTE: Writing HTML5(EGHTML) Body file: EGHTML
28
29 proc sql;
30 create table Adding_info as
31 select
32 a.*,
33 b.accountnumber
34
35 from work.file_1 as a
36 left join SSD.Main_Data as b on a.accountnumber=b.accountnumber;
WARNING: Variable accountnumber already exists on file WORK.File_1.
NOTE: Compressing data set WORK.File_1 increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: Table WORK.File_1 created, with 1078 rows and 2 columns.
37 quit;
NOTE: PROCEDURE SQL used (Total process time):
You asked for ALL of the variables from FILE_1 (turns out there were only two variables).
You also asked to include ACCOUNTNUMBER from SSD.MAIN_DATA, but since that name was already used (obviously since otherwise your ON expression would be invalid) it is ignored.
What did you want it to do instead.
If you want all of the variables from SSD.MAIN_DATA and just the other variable from FILE_1 then specify that. So if the second variable is named VAR2 then the change the variable list part to:
b.*,a.var2
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.