Hi SAS users,
I have a baseline SAS dataset ENC with a primary key ID. I have a second dataset DIAG with ID as a foreign key. For values of ID that do not link, I need to set ID to NULL in the DIAG dataset. Assume there is no way to fix the source data.
I tried about 20 variations of the proc sql step below but nothing works yet.
proc sql; create table DIAG_NEW as select field1 ,field2 ,case when ID in (select ID from ENC) then ID else ID is NULL end as ID from DIAG; quit;
Don't have any data to test this with but something like this should work. When there is no matching row in ENC ID_ENC will be blank.
proc sql; create table DIAG_NEW as select field1 ,field2 ,E.ID as ID_ENC from DIAG as D left join ENC as E on D.ID = E.ID; quit;
Saturday - last edited Saturday
To represent a missing numeric value use a period. For a missing character value then just use a string literal that only contains blanks.
proc sql ; select name ,age,case when name='Alice' then . else age end as age2 ,sex,case when name='Alfred' then ' ' else sex end as sex2 from sashelp.class where name in ('Alice','Alfred') ; quit;
It is much easier if you just use regular SAS syntax instead of trying to use PROC SQL. Then you can use the CALL MISSING() function to make the value missing without having to know whether the field is numeric or character.
data DIAG_NEW merge diag(in=in1) ENC(keep=id in=in2) ; by id; if in1; if not in2 then call missing(id); run;
It's a good habit to avoid embedded select clauses as they are inefficient (though not a performance killer when just building a list as you are here).
data DIAG ENC; do ID=1 to 2e7; output; end; run; proc sql; * 40 seconds; create table DIAG_NEW as select case when ID in (select ID from ENC) then ID else . end as ID from DIAG; quit; proc sql; * 16 seconds; create table DIAG_NEW as select E.ID as ID_ENC from DIAG as D left join ENC as E on D.ID = E.ID; quit;