Help using Base SAS procedures

Is there a proc to add new observations sorted immediately?

Reply
Occasional Contributor
Posts: 5

Is there a proc to add new observations sorted immediately?

Dear community,

I'm looking for a proc to add new observations (from a dataset) into a sorted existing one, such that the proc immediately sorts.

Of course it can be done with a proc append followed by a proc sort or in a data step as follows:

data main_dataset;

     set main_dataset

           new_dataset;

      by sort_variable1 sort_variable2 etc;

run;

But this reads through the entire main set and I'm dealing with rather large files. I'd prefer a proc that uses an index to place the new observations sorted into the main set, so that it is faster.

Is there any, and if yes, how does it work?

Regards,

Warri

(Note that the 'sql' in the tags is there because proc sql may solve my problem)

PROC Star
Posts: 7,363

Is there a proc to add new observations sorted immediately?

I think you are looking for proc datasets using the append statement.  Take a look at: http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a000235213.htm

Occasional Contributor
Posts: 5

Is there a proc to add new observations sorted immediately?

I have tried this and it seems that proc append or proc datasets using the append statement does not sort for you. So I interpret the documentation where it says that the append maintains the sort-indicator that this only happends if the sticking the new dataset under the main one makes it sorted.

Frequent Contributor
Posts: 104

Is there a proc to add new observations sorted immediately?

Just a thought, if the large base dataset is indexed (create via PROC datasets), then proc datasets append statement may be able to append and update the index without physically sorting the very large tables.  The link Art provided describes appending to an indexed dataset.

Regular Contributor
Posts: 184

Is there a proc to add new observations sorted immediately?

Indexes are the feature intended to fit this requirement. The insertions can also be done with PROC SQL or with a DATA step (using the MODIFY statement).

Whether this approach is advantageous vis a vis appending and sorting will depend on the particulars.

DLing wrote:

Just a thought, if the large base dataset is indexed (create via PROC datasets), then proc datasets append statement may be able to append and update the index without physically sorting the very large tables.  The link Art provided describes appending to an indexed dataset.

Super User
Posts: 5,257

Is there a proc to add new observations sorted immediately?

No, there is no proc or other mechanism that appends/inserts data to a table, and then automatically keep the target tables physically sort order.

The only situation I can think of if you have a base table of transactions sorted by date/datetime, and add new records in chronological order.

This due to the fact that a SAS table is a sequential file. It does not (generally) keep space for later inserts. So, to interleaving records, you do need to reload the table. Interleaving can be done using append/sort as you mentioned, or by using data - set - by, which probably should be more efficient in your case, since only the new transactions has to be sorted (the target table are already sorted.

Keeping a table sorted is usually good for performance especially when the sort order is often used as search criteria, and has an index defined on it. But you need to try to calculate so that the resources used to keep a table sorted is less what you gain when querying the table.

/Linus

Data never sleeps
Occasional Contributor
Posts: 5

Is there a proc to add new observations sorted immediately?

Not what I hoped for, but I cannot find something on the web that contradicts your answer. So thanks and I suppose we just have to increase server capacity Smiley Wink.

Super User
Posts: 9,681

Is there a proc to add new observations sorted immediately?

Agree with LinusH. As far as I know there is not such proc to match your demand.

I think using proc append and proc sort is the best way.

If you create an index, It will also waste a lot of time for large table( almost like proc sort ' time).

Maybe SAS will develop some technology to solve this problem for large table in the future. Just a think.

Ksharp

Ask a Question
Discussion stats
  • 7 replies
  • 270 views
  • 6 likes
  • 6 in conversation