I know that a LOT of folks do this and if you're a programmer who NEVER makes mistakes or logic errors, then this is an OK thing to do. But, I've been programming in SAS for over 20 years and I never do this. Here's why:
1) I used to work for lawyers, and one thing they never wanted was for their statistical expert to be on the stand and be asked why the same-named file was used for both INPUT and OUTPUT.
2) If there's any logic error in the code, you have just overwritten the original file with possibly erroneous data. Unless you have a backup of the original WORK.TEMP or INPUT_FILE, you will have trouble recreating the data or fixing the problem you just introduced.
3) It is harder to explain to beginners what is going on, and therefore only more experienced SAS programmers can maintain the program.
With code as simple as what you show, is it probably OK? Sure. But the minute you get tempted to add more logic or more data manipulation to the program, I'd do something different. In fact, there's NO reason why you couldn't set the FLAG variable when you read in the INPUT_FILE, like this:
if sal < 1000 then flag = 1;
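Spelled out as a complete one-step DATA step (a sketch, assuming the input dataset is WORK.INPUT_FILE and the result should be WORK.TEMP, as in the code under discussion):

```sas
data temp;                        /* write WORK.TEMP ...                    */
   set input_file;                /* ... while reading WORK.INPUT_FILE once */
   if sal < 1000 then flag = 1;   /* FLAG stays missing for sal >= 1000     */
run;
```

This reads INPUT_FILE exactly once, so there is no second pass over the data and no dataset is overwritten.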
Understanding a bit more about how SAS data steps operate and how to make your programs more efficient would be a good learning exercise. There are a LOT of user group papers and documentation tutorials on how SAS works.
This particular program is rather inefficient
data temp;    /* creates work.temp from work.input_file */
   set input_file;
run;

data temp;    /* REREADS work.temp a second time */
   set temp;
   if sal < 1000 then flag = 1;
run;
because you are creating WORK.TEMP in the first step and then REREADING WORK.TEMP in order to set the FLAG variable. That's probably not a big deal efficiency-wise if you only have a few hundred observations, but if you have hundreds of thousands or millions of obs, it's not a good idea.
I know that other folks have differing opinions about the construction you're using. And I'm sure you'll hear about them all!
Contrary to Cynthia, I routinely reuse dataset names in the WORK library. They are going to disappear at the end of the batch job or interactive session anyway, so I find the risk of data loss to be negligible.
We also work with datasets that have millions of rows in them, so it is easy to run out of disk space if one is not careful about managing them. Reusing the dataset name is generally easier than inserting a lot of PROC DATASETS steps to explicitly delete the unneeded files.
In agreement with Cynthia, I do NOT write over permanent library files unless I think about it and make it a conscious decision. Usually that occurs when "freshening" an analysis file from an operational store (e.g., Oracle).
I'll agree with Cynthia for two principal reasons. First, if you're importing large datasets (e.g., a file that takes two or more hours to import but only minutes to analyze), why risk having to repeat the initial, time-consuming part?
And second, with permanent files, some shops depend on the files' date/time stamps. Re-saving the files changes those dates and times and can cause everyone a lot of unnecessary aggravation.
As for Doc's concern, you can always delete files that are no longer needed.
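For instance, a sketch of deleting a WORK dataset explicitly once it is no longer needed (assuming the dataset is named TEMP, as in the code above):

```sas
proc datasets library=work nolist;   /* nolist suppresses the directory listing */
   delete temp;                      /* remove WORK.TEMP and free its disk space */
quit;
```

This frees the space mid-session instead of waiting for WORK to be cleaned up when the session ends.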