Data step merge has a similar functionality of an outer join, but the result depends on the data. If your relationship between the table keys are 1-M, you will see the same result. In your example, there is a M-M relationship.
This is due to the fact that SQL is acting on columns, and the data step has a more row oriented approach.
SQL tries to combine all id values between the two tables, and then select the matches (that meet the join criteria). The data step merge by will put the two tables beside each other. As long there is a match in the BY group, the data step will go to the next row in both tables until there is no match. So, if you have two rows with id=1 in table a, and three rows with id=1 in table two, you will end up with three rows in your output table. Row three will be data from row two in table a, and row three in table b.
data step merge does not correspond to a cartesian-type (m-n) join.
That type of join be achieved in a data step if that is your preference, by using two set statements and controlling the reads from each table, probably with key= indexed access on at least one of these tables.
I don't need that data step complexity because I think the sql join is easier to recognise, understand and therefore, maintain. Why would you seek it?
MERGE will happily provide 1-N and N-1 joins (but with unexpected effects, in the nature of a data step, when variables are common to both tables but not among the list of by- variables).
It looked to me like you wanted a M-N join, because you have repeats of the by-variable on both data sets.
When by-values repeat on more than one table in a MERGE, the effect is like two diaries side-by-side. Each time a MERGE statement is executed a row of data is taken from each table in the statement until there is no more data for that by-value in any table on the statement. I expect it is better described in the proper documentation.