Hi, all
I have dataone with variable gvkey which I will use to extract data from another datatwo. Since datatwo is on server, so I plan to use
proc upload data=locallib.dataone out= dataone;run;
my question is how I should write proc sql to get datatwo based on dataone’s gvkey. Below is my code, I want to know
1. If I should use left join
2. If we do not use proc sql, any other better options?
proc sql;
create table WANT as select
A.*, B. gvkey, B.cusip, B.EPSPX
B.CEQ,B.XRD ,B.OANCF,B.FYEAR
From dataone as A left join datatwo as B
on A.gvkey=B.gvkey and A.year=B.year
order by gvkey, year;
quit;
thanks!
Lan
So this question is NOT about PROC UPLOAD?
About left join, how could we know? What is your requirement?
Is the PK gvkey together with year in dataone?
LanMin,
Whether you need to use a LEFT JOIN or not depends on your question ("requirement" as LinusH said). I'd suggest that you read the PROC SQL documentation about left vs inner join and if you still have questions, post with samples of A, B, and WANT for us to better help you.
Be careful with the "*" syntax. Since gvkey is in both tables, you will have two variables with the same name in WANT and SAS will give unexpected results.
It looks like you are using the LIBNAME approach to connecting to the database. The only way to guarantee that the query runs on the database (rather than the SAS server) is to use a pass thru query.
There are non-SQL ways to do this (with a DATA step and MERGE), but all the processing would be done in SAS, so there is no value to the UPLOAD.
Doc Muhlbaier
Duke
thanks to both of you!
proc upload is relevant, because datatwo is stored on the server, I upload dataone to server, in order to use its gvkey as the linking variable to extract data from datatwo and I only want those observations that have the same gvkey as dataone- left join will serve that purpose.
Doc, you are right about * syntax, since gvkey is in both tables, would you suggest I list all variables to be included in final output data?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.