Hi all,
I am trying to merge three datasets. I have tried two different ways and they both are giving me incorrect numbers (because each file has around 1 million records, so total should be around 3 millions). One of the way is removing duplicate reference_number and giving me 1.7 million records and other is giving me 4.3 million records. Can you please suggest the correct way to merge these datasets?
/*Merge multiple datasets with reference_number as common value*/
proc sql;
create table Equifax_files as
select * from work.elnz_lowell_11oct
union all
select * from work.elnz_lowell_10feb
union all
select * from work.elnz_lowell_12sep
order by reference_number;
quit;
/* Method 2 */
Data Equifax_files ;
merge work.elnz_lowell_12sep work.elnz_lowell_11oct work.elnz_lowell_10feb;
by reference_number;
run;