DATA Step, Macro, Functions and more

Clone a dataset ***including indexes***

Accepted Solution Solved
Reply
Super Contributor
Posts: 387
Accepted Solution

Clone a dataset ***including indexes***

Hi,

Is this the best approach to clone a dataset including indexes?  The end result should be an empty dataset with all indexes in place from the source dataset.

options obs=max;

data class;

  set sashelp.class;

run;

data zipcode;

  set sashelp.zipcode;

run;

proc datasets lib=work nowarn nolist;

  modify class;

  index create name;

  index create sex;

  index create age;

  index create primary=(height weight);

  run;

  modify zipcode;

  index create statecode;

  index create statename;

  run;

quit;

libname temp spde "%sysfunc(pathname(work))" temp=yes;

options obs=0;

proc copy in=work out=temp clone;

  select class zipcode;

run;

options obs=max;

proc datasets lib=temp nowarn nolist;

  change class=class2;

  change zipcode=zipcode2;

quit;

proc copy in=temp out=work;

  select class2 zipcode2;

run;

libname temp;

I like the fact that the SPDE engine allows me to create a temporary library whose directory will be deleted by SAS when the library is freed.  Which is why I'm using that approach.

One thing to note is, on my system (Windows), the cloned dataset is "large", even though it's empty.  This is due to the Data Set Page Size being different due to the two different engines being used.  I read the doc on proc copy; to me it seemed that the clone option would create a dataset with the same internal structure as the source dataset (even across the two engines).  My "real" code is SPDE to SPDE, so this won't be an issue.  My real code also needs to clone two datasets.

Now if only SAS supported:

proc sql;

  create table work.class2 like sashelp.class including indexes;

quit;

or something like that.  If the above code IS the current best approach, perhaps SAS could make this a bit easier in a future release?

Regards,

Scott


Accepted Solutions
Solution
‎07-28-2015 02:02 AM
Super User
Posts: 5,441

Re: Clone a dataset ***including indexes***

Posted in reply to ScottBass

My standard way of doing this is:

Proc append base=want data=have (obs=0);

Run;

This will preserve all metadata from the template table.

Data never sleeps

View solution in original post


All Replies
Solution
‎07-28-2015 02:02 AM
Super User
Posts: 5,441

Re: Clone a dataset ***including indexes***

Posted in reply to ScottBass

My standard way of doing this is:

Proc append base=want data=have (obs=0);

Run;

This will preserve all metadata from the template table.

Data never sleeps
Super Contributor
Posts: 387

Re: Clone a dataset ***including indexes***

Thanks Linus, that helped.

Perhaps it's better to see the code in context, so the full picture is evident.  FWIW this approach is based on work from the programmer before me, although I've modified it a bit:

*** Load the data *** ;

%* The desired output is to: ;

%*   1) A backup of yesterdays data named "_old" ;

%*   2) Load todays data with the correct output name ;

%* And to do so with optimal performance and recoverability ;

%* The "flow" is: ;

%*   1) Create an empty skeleton dataset (including indexes) based on the current "fact" dataset ;

%*   2) Delete the "_old" dataset (day before yesterdays data) ;

%*   3) Rename the current "fact" dataset to "_old" (now it is yesterdays data) ;

%*   4) Load todays data into the "_tmp" dataset ;

%*   5) Rename the "_tmp" dataset to the current "fact" dataset ;

%* We could reverse #3 and #4 (rename then load directly) ;

%* but that is more confusing if an error occurs during the load ;

%* and we have to manually recover ;

%macro load_data(type,lib,target);

  %let type=%lowcase(&type);

  %if (&type eq hospital) %then %let source=workspde.claimfact&type._epi_xfr;

  %else

  %if (&type eq medical)  %then %let source=workspde.claimfact&type._epi;

  %* save current syscc value and reset to 0 ;

  %let syscc_temp=&syscc;

  %let syscc=0;

  proc datasets lib=&lib nolist nowarn;

  %if (&syscc eq 0) %then %do

    * Create skeleton dataset including indexes for later use ;

    delete &target._tmp;  %* In case it exists ;

    run;

    append base=&target._tmp data=&target (obs=0);

    run;

  %end;

  %if (&syscc eq 0) %then %do;

    * Delete the "_old" dataset (now the day before yesterdays data) ;

    delete &target._old;

    run;

  %end;

  %if (&syscc eq 0) %then %do;

    * Rename the current "fact" dataset to "_old" (now yesterdays data) ;

    change &target=&target._old;

    run;

  %end;

  %if (&syscc eq 0) %then %do;

    * Load todays data into the "_tmp" dataset ;

    append base=&target._tmp data=&source;

    run;

  %end;

  %if (&syscc eq 0) %then %do;

    * Rename the "_tmp" dataset to the current "fact" dataset (now todays data) ;

    change &target._tmp=⌖

    run;

  %end;

  quit;

  %let syscc=&syscc_temp;

  %bench(elapsed,data=elapsed,message=Load: Load &type data)

%mend;

Super Contributor
Posts: 387

Re: Clone a dataset ***including indexes***

Posted in reply to ScottBass

One last comment, perhaps as much to myself as anyone...but if you want to chime in please do so Smiley Happy

Hmmm...in thinking about this further, I wonder if generation data sets would work well here?  It's been a billion years since I've used them.  Time to revisit the doc!

Edit:  I do wish proc append would roll over the generation group.  If we have a 10M row base table and are appending 1K rows from today's transactions, I don't want to run a full data step just to roll over the generation group.  If there IS a way to roll over the generation group using proc append, let me know.  Otherwise, the "manual" approach to creating a backup from the above post may be the way to go.

proc datasets lib=work nowarn nolist kill;

quit;

* only need to specify the genmax option the first time ;

data class (genmax=3 label="First");

  set sashelp.class;

run;

* another data step ;

data class (label="Second");

  set class;

run;

* proc append (bummer...no work.Class#002) ;

proc append base=class (label="Third") data=sashelp.class;

run;

* one more iteration, the first one will be deleted ;

data class (label="Fourth");

  set sashelp.class;

run;

Added edit about proc append and generation groups

Frequent Contributor
Posts: 102

Re: Clone a dataset ***including indexes***

Posted in reply to ScottBass

What is lacking in PROC COPY INDEX=YES ?

Super Contributor
Posts: 387

Re: Clone a dataset ***including indexes***

Posted in reply to JackHamilton

Thanks Jack.  The target cloned dataset needs to have a different name in the same library as the source dataset.

Frequent Contributor
Posts: 102

Re: Clone a dataset ***including indexes***

Posted in reply to ScottBass

There have been several occasions over the years when I have wanted a built-in utility that would copy and rename at the same time.

In this case, you could just copy the original to the temporary library, rename it there, and then copy it back to the original library.  But the APPEND solution seems easier.

I tried to find the documentation that says APPEND will create indexes for a non-existent BASE when DATA has indexes.  Couldn't find it.  It does says that constraints will be copied, but a non-unique index is not a constraint.  Is this just folk wisdom passed down from programmers of yore?

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 1361 views
  • 3 likes
  • 3 in conversation