I have 2 datasets that have the SAME variables, but some of those variables may have values and some may not. The key variable between them is billable_id (it always has data in it). I had thought of doing something like:
PROC SQL;
CREATE TABLE EGTASK.TEST AS
SELECT
(case when t1.billable_id is missing then t2.billable_id else t1.billable_id end) AS billable_id,
(case when t1.ac_type is missing then t2.ac_type else t1.ac_type end) AS ac_type
FROM EGTASK.QUERY_FOR_Q_FOR_BILLING_DAT_000E t1
FULL JOIN EGTASK.QUERY_FOR_Q_FOR_BILLING_DAT_000F t2 ON (t1.billable_id = t2.billable_id);
QUIT;
But for many variables this case when approach will take time. Is there a way to tell SAS to do this for all variables as opposed to typing everything out and risking typos?
Thanks
You can use the COALESCE (numeric values) or COALESCE if you know which you prefer when both are present.
The Coalesce function selects the first non-missing value as read from a list of variables from left to right
If the billiable_id is character that would look like:
coalescec(t1.billable_id.t2.billable_id) AS billable_id,
if the t1 value is missing than the t2 value, if present, is used.
I assumed that your ID is character, if not use Coalesce.
Nice thing is this can use more than 2 variables if needed but applies the values in the order you prefer if multiple values are available.
I prefer what @Astounding proposes but just for "completeness":
You can use the coalesce() function in SQL for both numerical and character variables.
If you've got a lot of variables then you could generate the SELECT clause using the dictionary tables. It complicates the code so I'd only not just type but generate the select clause if it saves a lot of typing.
Below code demonstrates how code generation could look like. Below also works if not all variables exist in both source data sets.
/* create sample data */
data have1;
otherVar_h1_1=_n_;
set sashelp.class;
if _n_=2 then call missing(sex);
if _n_=3 then call missing(age, height);
otherVar_h1_2=_n_;
run;
data have2;
otherVar_h2_1=_n_;
set sashelp.class;
if _n_=2 then call missing(age, height);
if _n_=3 then call missing(sex);
otherVar_h2_2=_n_;
run;
/* genereate Select clause using coalesce() */
proc sql noprint;
select
case
when missing(h1.name) then h2.name
when missing(h2.name) then h1.name
else
catx( ' ',
cats('coalesce(',h1.memname,'.',h1.name,',',h2.memname,'.',h2.name,') as'),
h1.name,'length=',put(max(h1.length,h2.length),5.)
)
end
into :SelectClause separated by ','
from
(
select name, varnum, memname, length
from dictionary.columns
where libname='WORK' and memname='HAVE1'
) h1
full outer join
(
select name, varnum, memname, length
from dictionary.columns
where libname='WORK' and memname='HAVE2'
) h2
on upcase(h1.name)=upcase(h2.name)
order by coalesce(h1.varnum, h2.varnum)
;
quit;
/*%put &=SelectClause;*/
/* combine tables */
proc sql feedback;
select
&SelectClause
from
have1 t1 full outer join have2 t2
on t1.name=t2.name
;
quit;
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.