I'm merging 2 datasets:
Dataset 1: count = 6,437,567; variables = 22; file size = 1.2 GB; page size = 65,536; # pages = 18,566
Dataset 2: count = 2,276,587; variables = 917; file size = 2.4 GB; page size = 131,072; # pages = 19,726
The resulting dataset has:
count = 6,437,567; variables = 924; file size = 21.8 GB; page size = 131,072; # pages = 170,379
I didn't expect to get 22 GB when merging a 1.2 GB dataset with a 2.4 GB dataset
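As a rough sanity check on those numbers, here is a small Python sketch (not SAS) that uses only the row counts, page sizes, and page counts listed above. It assumes the file size is roughly pages times page size and ignores header overhead. The "naive" figure is a back-of-the-envelope estimate, assuming every output row carries one row's worth of on-disk bytes from each input; it is not a statement about how SAS actually lays out the data.

```python
# Figures copied from the dataset properties listed above.
datasets = {
    "dataset 1": {"rows": 6_437_567, "page_size": 65_536, "pages": 18_566},
    "dataset 2": {"rows": 2_276_587, "page_size": 131_072, "pages": 19_726},
    "merged": {"rows": 6_437_567, "page_size": 131_072, "pages": 170_379},
}


def size_bytes(d):
    # Assumption: file size is approximately pages * page size.
    return d["pages"] * d["page_size"]


def bytes_per_row(d):
    # Average on-disk (compressed) bytes per observation.
    return size_bytes(d) / d["rows"]


for name, d in datasets.items():
    print(f"{name}: {size_bytes(d) / 1e9:.1f} GB, "
          f"{bytes_per_row(d):.0f} bytes per row on disk")

# Naive estimate (an assumption, see above): each merged row costs the
# per-row bytes of dataset 1 plus the per-row bytes of dataset 2.
naive_bytes = datasets["merged"]["rows"] * (
    bytes_per_row(datasets["dataset 1"]) + bytes_per_row(datasets["dataset 2"])
)
ratio = size_bytes(datasets["merged"]) / naive_bytes
print(f"naive estimate: {naive_bytes / 1e9:.1f} GB, "
      f"actual is {ratio:.1f}x that")
```

Even under that generous assumption the merged file comes out well above the estimate, which is the gap I'm asking about.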
This is a straight merge without any calculations (dropping/merging a few fields)
I get the same result with a DATA step MERGE as with a PROC SQL left join
None of the character fields are excessively long (the longest is 40 and only a few are that long)
All 3 datasets use COMPRESS=CHAR; no indexes yet
The dataset option REUSE=YES reduced the output from 23 GB to 22 GB
I'm using SAS 9.04.01M5 through Enterprise Guide on Linux
Does it really make sense for the output dataset to be 22 GB?