Hi SAS Community,
First off, my apologies if I do not include enough/the correct information - this is my first post on this site. Happy to receive feedback on how I can improve my "question asking".
I have >1million subject IDs - each subject ID is 20 characters - in a CSV file.
I have a proc sql block that pulls from a "master table", but I want to ONLY pull in the rows where the subject ID in the "master table" ALSO exists in the CSV file. It is a single column (see screenshot).
I'd like to use a WHERE statement in the proc sql block to accomplish this, but of course open to other methods.
I have tried a few different methods, such as saving the entire csv as a ".mac" file and assigning it to a macro, but I got an error indicating the macro couldn't hold all the info I was trying to shove into it.
Here is what I have most recently tried:
proc import
file = "blah blah file path"
out = master_patid
dbms = csv
replace;
run;
proc sql;
create table tablename as
select a.subid, a.diag, b.subid, b.drg, b.fst_dt
from source.&dataset. a left join source.&dataset. b
on a.subid=b.subid
where subid in (master_patid));
The error I get in my log file says "Syntax error, expecting one of the following: a quoted string, a numeric constant, a datetime constant,"
My hunch is that it is not seeing the csv file as a list - but I haven't been able to figure out how to do so.
I appreciate you taking the time to help me out!
First of all, you have one opening bracket, but two closing brackets.
Second, use a sub-select:
where subid in (select concat from master_patid)
First of all, you have one opening bracket, but two closing brackets.
Second, use a sub-select:
where subid in (select concat from master_patid)
Please show us the ENTIRE log for this PROC SQL.
In the future, whenever you have errors in the log, we need to see the ENTIRE log for the step (PROC or DATA step) that has the errors. Do not show us errors detached from the code as it appears in the log. Do not show us parts of the log of this step and chop out other parts of the log.
Please copy the log as text and paste it into the window that appears when you click on the </> icon.
No need for PROC IMPROT to read a simple text file, especially one with only one variable. Use FIRSTOBS=2 to skip the extra line at the top in your photograph of the data.
data master_pid;
infile "blah blah file path" dsd firstobs=2 truncover;
input patid :$20. ;
run;
Are the quotes actually in the file? Should the quotes actually be part of the values to be found? If not then using the DSD option on INFILE statement will remove the quotes.
Is the issue just to subset some existing dataset? So if there is an existing dataset named HAVE that also has a variable named PATID the you want to subset then just do an INNER JOIN to find the observations that are in both.
proc sql;
create table WANT as
select a.*
from HAVE a
inner join MASTER_PID b
on a.patid = b.patid
;
quit;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.