I suspect that your Main Cell campaign with the Split has a diagram property that says "Use the Most current date when referenced by a Link."
I'm guessing your Main Cell campaign includes a communication that uses Data 5 as its marketing cell, and that the duplicates you're finding are across the two different campaigns. Assuming the Data 20 campaign executes after the Main Cell campaign and the Main Cell diagram has that property checked, the random split is run again when the Data 20 campaign references it through the Link. The new split assigns customers to cells differently, so the Data 20 cell can overlap with the Data 5 cell from the first Main Cell execution.
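To make the re-execution effect concrete, here is a rough Python sketch (not the tool's actual logic). It assumes, purely for illustration, that the split cuts the population into a 5% cell and a 20% cell; `random_split`, the seeds, and the population size are all made-up names and values. Within one execution the cells never overlap, but a cell from the first execution and a cell from a second execution usually do.

```python
import random

def random_split(customer_ids, seed):
    """Hypothetical stand-in for the Split node: shuffle, then cut two cells."""
    rng = random.Random(seed)           # a different seed models a fresh execution
    shuffled = list(customer_ids)
    rng.shuffle(shuffled)
    n5 = len(shuffled) * 5 // 100       # assumed size of the Data 5 cell
    n20 = len(shuffled) * 20 // 100     # assumed size of the Data 20 cell
    data5 = set(shuffled[:n5])
    data20 = set(shuffled[n5:n5 + n20])
    return data5, data20

customers = range(10_000)

# First execution: the Main Cell campaign runs and Data 5 is contacted.
first_data5, first_data20 = random_split(customers, seed=1)

# Second execution: the split is redone when the Data 20 campaign references it.
second_data5, second_data20 = random_split(customers, seed=2)

within_run_overlap = first_data5 & first_data20    # same execution: always empty
across_run_overlap = first_data5 & second_data20   # different executions: duplicates

print(len(within_run_overlap))
print(len(across_run_overlap))
```

The second number is the kind of cross-campaign duplicate count you would see if the split really is being re-run.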

Your Data 20 campaign shows no overlap between the membership of the two Link nodes at the time of execution: the AND node of Data 20 with everyone not in Data 5 (the exclusion link) returns the same count as the Data 20 Link node. (If there had been overlap between the two Link nodes when Data 20 was run, the exclusion link for Data 5 would have removed some Data 20 customers, and the AND node would have a lower count than Data 20.)
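The count comparison can be sketched with plain Python sets. This is only an illustration of the reasoning, with made-up customer IDs and a hypothetical helper name `overlap_by_count`; it is not how the tool computes its counts.

```python
def overlap_by_count(data20, data5, everyone):
    """Return how many Data 20 members the Data 5 exclusion removes."""
    not_in_data5 = everyone - data5       # exclusion link: everyone not in Data 5
    and_node = data20 & not_in_data5      # AND node: Data 20 AND (not in Data 5)
    return len(data20) - len(and_node)    # 0 means the two cells do not overlap

everyone = set(range(10))
data20 = {0, 1, 2, 3}

# Disjoint cells: the AND node keeps all four Data 20 members.
no_overlap = overlap_by_count(data20, {4, 5}, everyone)

# Customer 3 sits in both cells: the AND node drops to three members.
one_overlap = overlap_by_count(data20, {3, 4}, everyone)

print(no_overlap, one_overlap)  # prints: 0 1
```

Equal counts for the AND node and the Data 20 Link node correspond to the first case.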
Why don't you include the communications for both Data 20 and Data 5 in the same Main Cell campaign? That way both communications would come from a single execution of the split.
Hope this makes sense.