attjooo
Calcite | Level 5

In general, are there any dangers with expressions as below?

data tableA;
set tableA;
.
<code>
.
run;

I prefer to manipulate a data set in several steps, but I want to avoid creating unnecessary data sets in the work library. That is the background to my question.

Ksharp
Super User

No, there is no danger; it just overwrites the original data set. When you do this, SAS actually creates a temporary data set in the same folder, and at the end of the step SAS renames it to your original data set name.

Kurt_Bremser
Super User

In addition to Ksharp:

If SAS encounters an error (or already has an error condition set if you are in batch mode), the data set will not be replaced.

But:

If you test-run such code, you will be forced to rerun it from the start each time you need to check an intermediate stage of the data set. That's why I give all the data sets in my batch jobs unique names.

ballardw
Super User

Be very careful about recoding existing variables back into the same variable.

I inherited some code and a data set where code similar to this was used in this situation:

if var=2 then var=1;
else if var=1 then var=0;

Apparently the code was run on the same data set a couple of times, as the variable ended up with all 0 values wherever it was not missing.

If you recode into new variables then there isn't a problem. But reusing both the data set and the variables can be dangerous in terms of data content.
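
A minimal sketch of recoding into a new variable, so that running the step more than once gives the same result. The names tableA_recoded and var_recoded are made up for illustration, and copying all other values through unchanged is an assumption about what the original recode intended:

```sas
/* Hypothetical names: tableA_recoded and var_recoded are not from the thread. */
data tableA_recoded;
    set tableA;
    /* The source variable var is never modified, so a rerun starts */
    /* from the same values and produces the same var_recoded.      */
    if var = 2 then var_recoded = 1;
    else if var = 1 then var_recoded = 0;
    else var_recoded = var;   /* assumption: other values pass through unchanged */
run;
```

Because var keeps its original values, the 2 -> 1 -> 0 cascade described above cannot happen, even if the step is submitted several times.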

jakarman
Barite | Level 11

Why avoid the SAS work library? It is meant to be used for temporary data.

If you want it cleaned up at regular points, do a clean-up with "proc datasets".

There are possibly a lot of data sets there of the type #utl-, created by e.g. Sort, SQL and possibly more.

A well-designed SAS installation can deal with this. The saswork location should get attention so that it is very responsive.

You are turning an issue you may be having with saswork into an issue with permanent storage.

If you just store and do not replace, you must count on 1 times the needed size; replacing requires at least 2 times the needed size.

Remember:

- SQL does not support this way of creating a table that you are also using as input.

  With the introduction of multithreading (SAS 9) this has become logically impossible.

- Checkpoint restart is a more advanced approach to being able to do restarts. See SAS(R) 9.3 System Options: Reference, Second Edition (STEPCHKPTLIB= System Option).
  This is a more advanced, automated way of restarting processes, more commonly known from big systems.

---->-- ja karman --<-----
MumSquared
Calcite | Level 5

I agree with Jaap Karman too.

Reusing the same data set name throughout your program can also make it harder for anyone else to pick up your code and understand the data flows and what the program is trying to achieve.

Reeza
Super User

I agree with .

I prefer to use different data sets to allow for easier debugging, but then I clean up my process at the end, especially for big jobs.

proc datasets, proc delete and proc sql can all delete tables.

In addition, I use a naming convention for temporary data sets, such as temp_1 and temp_2. You can then refer to them as temp_: in proc datasets to delete all temp_ data sets.
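
A rough sketch of that convention, assuming the intermediate data sets live in the work library. The data set names tableA and result and the placeholder comments are illustrative only:

```sas
/* Each step writes a uniquely named intermediate data set, */
/* so any stage can be inspected or rerun on its own.       */
data temp_1;
    set tableA;
    /* first manipulation goes here */
run;

data temp_2;
    set temp_1;
    /* second manipulation goes here */
run;

data result;   /* hypothetical name for the final output */
    set temp_2;
run;

/* Clean up: the colon makes temp_: match every member   */
/* of the work library whose name starts with temp_.     */
proc datasets library=work nolist;
    delete temp_: ;
quit;
```

The clean-up step can be left out while debugging and switched on once the job runs cleanly.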

