And as another update:
When I run the same script against source data with the same number of rows and variables but fewer duplicates, the elapsed run-time for deduplication.deduplicate remains the same, but the run-time for simple.groupBy deteriorates significantly.