Hello. I have a broad question about working with large datasets in SAS. In the past, my work has used datasets that were relatively small. My preferred way of working with data is to create a temporary dataset in the WORK library, work through a session/program with multiple temporary datasets, and then save a new permanent dataset from the last temporary one at the end of the session/program. For example:

data tempwork;
    set permlib.originaldata;
run;

... SAS program code ...

data permlib.originaldata_v1_032919;
    set tempwork20;
run;

Each time a new work session starts, I make a temporary dataset from the most recent permanent one and begin the process over (e.g., data tempwork; set permlib.originaldata_v1_032919; run; ... and at the end, data permlib.originaldata_v2_04202019; set tempwork20; run;). This ends up creating multiple permanent datasets. Maybe this is not "correct" or efficient, but I like working through multiple temporary datasets while building my final dataset.

I am now starting a project with a dataset that totals about 600 GB. My former process is not possible, because keeping multiple copies of files that large is not an option. My idea was to overwrite the same permanent dataset repeatedly; for example, instead of creating a _v2, just write back over _v1 (see the sketch at the end of this post). This is not working, because it replaces the historical dataset, so I cannot rerun code without going all the way back to the first program and the original dataset.

I'm not sure how to proceed with this data using my preferred process. Any suggestions for how to keep (or revert to) historical versions of datasets without taking up too much additional hard drive space? Could PROC DATASETS be used here? Any advice, with code, would be much appreciated.
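To make the overwrite idea concrete, here is a minimal sketch of what I tried, reusing the dataset names from my example above (tempwork20 just stands for whatever the last temporary dataset in the program happens to be):

/* Start the session from the existing permanent version */
data tempwork;
    set permlib.originaldata_v1_032919;
run;

/* ... SAS program code building through multiple temporary datasets ... */

/* Write back over the same permanent dataset instead of creating a _v2 */
data permlib.originaldata_v1_032919;
    set tempwork20;
run;

This keeps disk usage down, but once the step runs, the prior version of permlib.originaldata_v1_032919 is gone, which is exactly the problem described above.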