Hello ,
I would appreciate if someone can advise the difference in the 2 merging techniques below:
I recieve different counts using the data step merge vs the proc sql and i need to understand why?
Technique 1 
proc sql;
create table NACRSFY19_int_AT as
select distinct *
from NACRSFY19_int as a inner join nacrs19.Anaesthetic_technique as b
on a.am_care_key=b.am_care_key;
quit;
/*5,829,296*/
Technique 2
proc sort data= NACRSFY19_int;
by am_care_key;
run;
proc sort data= nacrs19.Anaesthetic_technique out=AT_sorted;
by am_care_key;
run;
/*16,943,476*/
data nacrs_AT_19;
merge NACRSFY19_int (in=a) AT_sorted (in=b);
by am_care_key;
if a and b;
run;
/*3,237,261*/Hi @Baba9,
Have you checked the log to see if that gives any clues.
If there is anything in the log you need help to understand then please post its contents using the "</>" icon when posting.
Thanks & kind regards,
Amir.
 proc sql;
633   create table NACRSFY19_int_AT as
634   select distinct *
635   from NACRSFY19_int as a inner join nacrs19.Anaesthetic_technique as b
636   on a.am_care_key=b.am_care_key;
WARNING: Variable AM_CARE_KEY already exists on file WORK.NACRSFY19_INT_AT.
NOTE: SAS threaded sort was used.
NOTE: Table WORK.NACRSFY19_INT_AT created, with 5829296 rows and 10 columns.
637   quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           5.57 seconds
      user cpu time       4.96 seconds
      system cpu time     2.89 seconds
      memory              968258.75k
      OS Memory           999016.00k
      Timestamp           2021-07-05 12:33:11 PM
      Step Count                        102  Switch Count  0
638   data NACRSFY19_int2 ;
639   merge nacrs_AT_19 (in=a) Excl_combined (in=b);
640   by am_care_key;
641   if a and not b;
642   run;
INFO: The variable CCI_INTERV_CODE on data set WORK.NACRS_AT_19 will be overwritten by data set
      WORK.EXCL_COMBINED.
INFO: The variable INTERV_SEQ_ID on data set WORK.NACRS_AT_19 will be overwritten by data set
      WORK.EXCL_COMBINED.
INFO: The variable STATUS_ATTRIBUTE_CODE on data set WORK.NACRS_AT_19 will be overwritten by data
      set WORK.EXCL_COMBINED.
INFO: The variable INTERV_OOH_IND_CODE on data set WORK.NACRS_AT_19 will be overwritten by data
      set WORK.EXCL_COMBINED.
NOTE: There were 3237261 observations read from the data set WORK.NACRS_AT_19.
NOTE: There were 37200 observations read from the data set WORK.EXCL_COMBINED.
NOTE: The data set WORK.NACRSFY19_INT2 has 3205688 observations and 21 variables.
NOTE: DATA statement used (Total process time):
      real time           1.34 seconds
      user cpu time       0.92 seconds
      system cpu time     0.28 seconds
      memory              1080.90k
      OS Memory           33016.00k
      Timestamp           2021-07-05 12:36:14 PM
      Step Count                        103  Switch Count  0
Hi @ pasted the log 
ThankyouYour log confirms that you are doing a many-to-many merge which SQL can do but a DATA step merge cannot. Refer to @Reeza 's explanation.
These are two different joins.
NACRSFY19_int, nacrs19.Anaesthetic_technique
in the SQL,
nacrs_AT_19, Excl_combined
in the data step.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
