BookmarkSubscribeRSS Feed
ericv51389
Calcite | Level 5

Hello everyone,

I am new to SAS for my doctorate program and using proc sql, I am needing to merge the following data sets:

  • ep_provider_paid_ehr
  • National_downloadable_file
  • zip_to_cbsa

All three are csv files that have been imported into SAS and the file names changed to the respective names above in my libraries -> work from import.

 

This is the code that has been given, but it keeps erroring out and I cannot figure out why after multiple attempts, google, youtube, etc:

 

proc sql;
SELECT DISTINCT

national_downloadable_file.group_practice_PAC_id,

national_downloadable_file.number_of_group_practice_members,

ep_provider_paid_ehr.provider_npi,

zip_to_cbsa.cbsa

FROM

national_downloadable_file

JOIN

provider_intersect ON

national_downloadable_file.national_downloadable_file_id =

provider_intersect.national_downloadable_file_id

LEFT JOIN

ep_provider_paid_ehr ON

provider_intersect.ep_provider_paid_ehr_id =

ep_provider_paid_ehr.ep_provider_paid_ehr_id

JOIN

zipcode_table ON

national_downloadable_file.zipcode =

zipcode_table.zipcode

LEFT JOIN

zip_to_cbsa ON

zipcode_table.zipcode = zip_to_cbsa.zipcode

WHERE national_downloadable_file.group_practice_PAC_id<>'NA';

 

Here is the error that keeps occurring:

ERROR: File WORK.PROVIDER_INTERSECT.DATA does not exist.
ERROR: File WORK.ZIPCODE_TABLE.DATA does not exist.
 
Any help would be greatly appreciated.
Thanks!
3 REPLIES 3
Tom
Super User Tom
Super User

It does not look like the names in your code match the names at the start of your posting.

 

For example provider_intersect is not in the list 

 

  • ep_provider_paid_ehr
  • National_downloadable_file
  • zip_to_cbsa

you provided.

 

Also if you imported the CSV files into WORK datasets and then ended that SAS session and started a new one the WORK datasets will be gone.  WORK files are removed when the SAS session ends.

 

You will need to re-import them again before you can use them in SQL code.

ericv51389
Calcite | Level 5

That is what I was thinking and we were not provided with another data base and there are no columns in the ones mentioned that are provider-intersect.

 

Will try playing around with the code some more and see if I can figure out what is going on with all information I was provided.

 

Have also reached out to the professor, but have not heard anything back yet.

ballardw
Super User

A couple of suggestions: first peculiar to this forum software, you should paste code, Log or text data examples, into a text box opened on the forum using the </> icon that appears above the message window. The forum main message windows reformat pasted text and may result in removing, replacing or inserting characters such that when pasted into a code editor will not run correctly. See the box below.

 

Second is an SQL "trick" to reduce typing AND make code a little more flexible: table alias. You can provide a much shorter name to reference your tables as shown below with the <dataset name> as <alias>. Then you don't have to type long data set names constantly. Also you can test/run code with a different data set and only have to change the name in one place where the Alias is set. I highlighted one example for the first data set From. The "as" is actually optional in many SQL implementations but I like to see it so it is clear what T1 (a semi-traditional alias for "Table One"). I could replace the "national_downloadable_file" with "testset1" for example and no other code would need to change. This is very useful for testing a small set of values (possibly of particular interest for some reason) or just to verify that syntax works on a small set before turning it loose on larger data.

proc sql;
   SELECT DISTINCT
      T1.group_practice_PAC_id,
      T1.number_of_group_practice_members,
      T3.provider_npi,
      T5.cbsa
   FROM
      national_downloadable_file as T1
   JOIN
      provider_intersect as T2 ON
      T1.national_downloadable_file_id =
      T2.national_downloadable_file_id
   LEFT JOIN
      ep_provider_paid_ehr as T3 ON
      T2.ep_provider_paid_ehr_id =
      T3.ep_provider_paid_ehr_id
   JOIN
      zipcode_table as T4 ON
      T1.zipcode =
      T4.zipcode
   LEFT JOIN
      zip_to_cbsa as T5 ON
      T4.zipcode = T5.zipcode
   WHERE T1.group_practice_PAC_id<>'NA'
   ;
quit;

As a side-effect of setting aliases in your code it become obvious that your code uses 5 separate tables and you apparently only have 3. So anything using those extra tables would have to "go away" OR provide the data sets.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 344 views
  • 0 likes
  • 3 in conversation