Hi All,
I am trying to figure out the best way to fill in blanks/update data in one data table with data form a column in another data table. I have tried the hash/define method, but I couldn't get it to work. So now I'm trying this:
proc sql;
select coalesce (MOBILENO, MOBILE) as MOST_RECENT_MN
from AP;
quit;
I dont really know how to use this function and the doco is over my head.
Basically I would like to select the MOBILENO (when it exists), which is in the AP table and have that overwrite the MOBILE column which is in the M1 table. Where the MOBILENO does not exist then the MOBILE column is fine.
Greatly appreciate any help or direction.
@2222 wrote:
thank you, and does this work if the variable comes from the same table as the variable it is updating? then I drop the variable after i have used it to update the variable that had missing values?
In that case there is no need for SQL at all.
data want;
set AP;
most_recent_mn = coalesceC(MOBILENO, MOBILE);
drop MOBILENO;
run;
Unless you are doing something extremely complex with extremely large datasets it is much better to create a NEW dataset when making changes to the data with code instead of making changes to an existing dataset. That way you can always correct any problems and re-run the code without having to worry that the mistakes have destroyed the source data.
You are on the right track but you've only listed one table in your query. It should look more like this:
proc sql;
create table Want as
select coalesce (MOBILENO, MOBILE) as MOST_RECENT_MN
from AP
left join M1
on AP.join_column = M1.join_column
;
quit;
I don't know what the joining rules are for your tables so you need to supply that. Putting MOBILENO first in the COALESCE means what will be used if populated, otherwise MOBILE is used.
thank you for that, unfortunately I am getting an ambiguous reference error because MOBILE is in two datasets
In SQL, you need to be clear about where a variable is supposed to come from; define table aliases in the FROM clause and use them when addressing variables.
e.g.
proc sql;
create table want as
select
t1.id,
t1.var_a,
(t1.var_a - t2.var_b) as diff
from table_a t1 left join table_b t2
on t1.id = t2.id
;
quit;
@2222 wrote:
thank you, and does this work if the variable comes from the same table as the variable it is updating? then I drop the variable after i have used it to update the variable that had missing values?
In that case there is no need for SQL at all.
data want;
set AP;
most_recent_mn = coalesceC(MOBILENO, MOBILE);
drop MOBILENO;
run;
Unless you are doing something extremely complex with extremely large datasets it is much better to create a NEW dataset when making changes to the data with code instead of making changes to an existing dataset. That way you can always correct any problems and re-run the code without having to worry that the mistakes have destroyed the source data.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.