In Proc Summary, is there any way to set the length of variables in the output out statement? I'm trying to avoid having to write a separate data step to set the variable lengths.
Hi @johngds,
You can use the KEEPLEN option of the OUTPUT statement of PROC SUMMARY to let the statistics inherit the lengths of the analysis variables. If this is not applicable to your analysis dataset (because the variables in question have full length there), you can apply it to a DATA step view quickly created from the dataset using an appropriate LENGTH statement. (Edit: You may want to set the VARLENCHK= system option to NOWARN in this case.)
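A minimal sketch of the approach described above, not a definitive implementation. The dataset `have`, the class variable `group`, the analysis variable `count`, and the output names are all hypothetical. It assumes `count` is stored with length 8 in `have` and holds only integers small enough to be stored exactly in 4 bytes.

```sas
/* Hypothetical names throughout: HAVE, GROUP, COUNT, WANT. */

/* Suppress the "multiple lengths" message that shortening a
   variable ahead of the SET statement would otherwise trigger. */
options varlenchk=nowarn;

/* A DATA step view: no data is copied, the LENGTH statement
   only changes how COUNT is presented to the procedure. */
data have_v / view=have_v;
  length count 4;
  set have;
run;

/* KEEPLEN makes the output statistics inherit the length of
   the analysis variable (4 here) instead of the default 8. */
proc summary data=have_v nway;
  class group;
  var count;
  output out=want(drop=_type_ _freq_) max=count_max / keeplen;
run;

/* Restore the setting (WARN is assumed to be the prior value). */
options varlenchk=warn;
```

KEEPLEN applies to every statistic on that OUTPUT statement, so it is best suited to statistics that stay integer-valued and small, such as MIN or MAX. A SUM of many values can grow past what 4 bytes store exactly.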
What do you want to gain by it? It is best to let the numeric variables keep the default maximum length of 8 (because of precision).
Setting variable lengths will enable me to cut the size of the file down from 4.5 GB to about 2.5 GB. I'll still be able to maintain precision.
You might be better off implementing your stats in a data step in the first place. Could you show the code for the summary procedure?
@johngds wrote:
Setting variable lengths will enable me to cut the size of the file down from 4.5 GB to about 2.5 GB. I'll still be able to maintain precision.
Is that with or without compression?
Thanks to all who responded. I was not aware of the KEEPLEN option for the OUTPUT OUT= statement. Clean and easy, and it saves an additional data step. The variable whose length I am limiting to 4 is an integer that takes values between 0 and 20,000.
@johngds wrote:
In Proc Summary, is there any way to set the length of variables in the output out statement? I'm trying to avoid having to write a separate data step to set the variable lengths.
Which variables? What statistics are you generating? It might make sense to truncate the storage of integer values, but the mean, standard deviation, and variance of integers are going to be floating-point values. You won't want to truncate the storage of those variables.
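The difference between integer and floating-point statistics can be checked with a small experiment. This is only a sketch with hypothetical variable names. It stores the same values with lengths 4 and 8, then reads them back and compares them. The integer 20,000 should survive the shorter length unchanged, while a fractional value such as 1/3 should not.

```sas
/* Hypothetical variable names; values are written to disk
   with the declared lengths when the step ends. */
data check;
  length int4 frac4 4 int8 frac8 8;
  int4  = 20000;  int8  = 20000;
  frac4 = 1/3;    frac8 = 1/3;
run;

/* Read the stored values back and compare the short and
   full-length copies; 1 means equal, 0 means they differ. */
data _null_;
  set check;
  int_same  = (int4 = int8);
  frac_same = (frac4 = frac8);
  put int_same= frac_same=;
run;
```

If `frac_same` comes back as 0, that is the truncation being warned about here: a mean or standard deviation stored with length 4 would lose low-order digits in the same way.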