I am working with many large datasets. Each dataset has millions of observations, and they range in size from 1 GB to more than 100 GB. For each variable, I need to generate the number of levels, the number of missing levels, the number of non-missing levels, the number of observations, and the ratio of levels to observations. I am using the code below, but SAS frequently reports insufficient memory and stops processing. I would like to learn a more efficient way to produce these variable descriptions without running out of memory. Any suggestions would be appreciated.

proc freq nlevels data=mydata;
  ods output nlevels=nlevels;
  tables _all_ / noprint;
run;

data want;
  if 0 then set mydata (drop=_all_) nobs=nobs;
  set nlevels;
  total = nobs;
  unique_ratio = nlevels / total;
run;

proc print;
run;
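For reference, one lower-memory direction I could imagine is profiling a single variable per pass instead of requesting all tables at once. This is only a rough sketch and not a tested solution. The macro name profile_var, the output table var_profile, the column names, and some_variable are all hypothetical. It also assumes that PROC SQL can resolve COUNT(DISTINCT) without holding every level in memory, which I have not verified on data of this size. Unlike the NLEVELS output, it counts missing observations and treats all missing values as a single level.

```sas
/* Sketch only: profile one variable per pass.
   profile_var, var_profile, and the column names are hypothetical. */
%macro profile_var(data=, var=, out=var_profile);
  proc sql;
    create table &out as
    select "&var"                 as varname length=32,
           count(distinct &var)   as nonmiss_levels,  /* distinct non-missing values */
           sum(missing(&var))     as n_missing,       /* missing observations, not levels */
           count(*)               as total,           /* number of observations */
           /* all missing values are treated as one level (an assumption) */
           calculated nonmiss_levels + (calculated n_missing > 0) as nlevels,
           calculated nlevels / calculated total      as unique_ratio
    from &data;
  quit;
%mend profile_var;

/* Example call; some_variable is a placeholder for a real column name */
%profile_var(data=mydata, var=some_variable);
```

Each call reads the dataset once, so this trades run time for memory. The per-variable results would still need to be appended into one table.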