BookmarkSubscribeRSS Feed
LanMin
Fluorite | Level 6

Hi, all

I have dataone with variable gvkey which I will use to extract data from another datatwo. Since datatwo is on server, so I plan to use

proc upload data=locallib.dataone out= dataone;run;

my question is how I should write proc sql to get datatwo based on  dataone’s gvkey. Below is my code, I want to know

1. If I should use left join

2. If we do not use proc sql, any other better options?

proc sql;

create table WANT as select

A.*, B. gvkey, B.cusip, B.EPSPX

B.CEQ,B.XRD ,B.OANCF,B.FYEAR

From dataone as A left join datatwo as B

on A.gvkey=B.gvkey and A.year=B.year

order by gvkey, year;

quit;

thanks!

Lan

3 REPLIES 3
LinusH
Tourmaline | Level 20

So this question is NOT about PROC UPLOAD?

About left join, how could we know? What is your requirement?

Is the PK gvkey together with year in dataone?

Data never sleeps
Doc_Duke
Rhodochrosite | Level 12

LanMin,

Whether you need to use a LEFT JOIN or not depends on your question ("requirement" as LinusH said).  I'd suggest that you read the PROC SQL documentation about left vs inner join and if you still have questions, post with samples of A, B, and WANT for us to better help you.

Be careful with the "*" syntax.  Since gvkey is in both tables, you will have two variables with the same name in WANT and SAS will give unexpected results.

It looks like you are using the LIBNAME approach to connecting to the database.  The only way to guarantee that the query runs on the database (rather than the SAS server) is to use a pass thru query.

There are non-SQL ways to do this (with a DATA step and MERGE), but all the processing would be done in SAS, so there is no value to the UPLOAD.

Doc Muhlbaier

Duke

LanMin
Fluorite | Level 6

thanks to both of you!

proc upload is relevant, because datatwo is stored on the server, I upload dataone to server, in order to use its gvkey as the linking variable to extract data from datatwo and I only want those observations that have the same gvkey as dataone- left join  will serve that purpose.

Doc, you are right about * syntax, since gvkey is in both tables, would you suggest I list all variables to be included in final output data?

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 863 views
  • 0 likes
  • 3 in conversation