Posted 09-14-2008 12:59 PM
Hello,
I want to make a backup of a dataset (lib.dataset1).
I make a copy of the dataset and name it lib.dataset1_bck.
This is very simple; I just use these statements: data lib.dataset1_bck; set lib.dataset1; run;
The problem is the index. When I make backups this way, my index just disappears. If I use the INDEX= data set option, the index is created, but that only rebuilds an index that already existed, and building an index takes time.
Can anyone tell me if there is a method of copying the index as well, or any simple method of duplicating a dataset?
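For reference, the INDEX= data set option mentioned above can be written on the output data set, so the copy and the index build happen in one DATA step. This is only a sketch: the variable name keyvar is a placeholder, not something taken from the real table, and the index is still rebuilt from scratch rather than copied.

```sas
/* Sketch: copy the table and rebuild a simple index in one step.  */
/* keyvar is a hypothetical variable; use the real index variable. */
data lib.dataset1_bck(index=(keyvar));
  set lib.dataset1;
run;
```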
15 REPLIES 15
You can try using PROC COPY with the INDEX=YES option to copy a dataset together with its index.
/*Sample code*/
PROC COPY OUT=work IN=mylib
MEMTYPE=data
INDEX=YES;
SELECT mytabel;
run;
/*Sample code ends*/
Hope this helps.
Thanks for the advice, but it does not work when I make the backup in the same library.
I get this warning message, because IN= and OUT= are the same:
WARNING: IN= and OUT= are the same. Files will not be copied into themselves.
PROC COPY cannot copy to the same directory, because it has no rename functionality. To make a copy in the same library, use a DATA step together with the INDEX= data set option. One drawback of this method is that you have to re-specify your index definition.
A slightly more "crazy" solution is to use concatenated librefs (for your ordinary use), while your copy job uses them as two separate librefs:
libname LibA './DirA';
libname LibB './DirB';
data Liba.dsa(index=(vara));
VarA=5;
run;
proc datasets lib=LibB nolist;
delete dsb;
quit;
proc datasets lib=LibA nolist;
copy out=LibB;
select dsa;
quit;
proc datasets lib=LibB nolist;
change dsa=dsb;
quit;
libname LibAB ('./DirA' './DirB');
/* List contents of concatenated library */
proc datasets lib=LibAB;
quit;
/Linus
Data never sleeps
Thanks for the advice, but it's still not what I need.
I'll try to explain with an example.
My process has to make a backup of the table etl.dataset.
Suppose I make the copy in another library, as backup.dataset.
If my process crashes, I run a recovery script, which does these steps:
deletes etl.dataset
copies backup.dataset into etl.dataset.
These steps are quite cumbersome, and I can't be sure they will complete without errors (for example, for lack of space).
On the other hand, if I make the backup copy in the same library (etl.dataset_bck), the recovery steps are simpler and do not use extra space:
deletes etl.dataset
renames etl.dataset_bck to etl.dataset.
So the question is whether there is any simple method of making a duplicate in the same library, one that does not drop the indexes.
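The same-library recovery described above (delete, then rename) could look roughly like the sketch below. It assumes the libref etl is already assigned and that etl.dataset_bck exists; as far as I know, the CHANGE statement renames the index file along with the data file, but that is worth checking with PROC CONTENTS afterwards.

```sas
/* Sketch of the same-library recovery: drop the damaged table, */
/* then rename the backup into its place.                       */
proc datasets lib=etl nolist;
  delete dataset;
  run;                          /* finish the delete before renaming */
  change dataset_bck=dataset;
  run;
quit;
```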
PROC COPY is meant for this purpose: specifically, to make a backup of the entire SAS data library, typically to be used for recovery as needed. Other approaches, such as using a DATA step to "duplicate" your production SAS database table(s), are somewhat short-sighted and may not satisfy a complete and effective recovery scenario.
Also, consider that your D/R (backup and recovery) approach is going to be influenced by the availability of technical (data storage, scheduling, staffing) resources, as well as your operating environment (platform). This topic must be considered from a data management level, rather than at the lowest (SAS data table/index) level, frankly, if it is to be considered seriously for an enterprise data warehouse / repository initiative.
Scott Barry
SBBWorks, Inc.
SAS.COM support site http://support.sas.com - technical paper references on this topic:
http://support.sas.com/resources/papers/sgf2008/recovery.pdf
http://support.sas.com/rnd/papers/sgf07/sgf2007-iosubsystem.pdf
http://support.sas.com/documentation/onlinedoc/spds/admin443.pdf
Hello sbb,
OK, forget what I said about backups; that is not the question. Frankly, I'm using this method of making backups and a recovery process, and it works. The main point is that my process updates only some datasets in the library, so I need to focus only on the few datasets that are being updated.
But forget about backup processes, because the question is about making a duplicate in the same library.
As I see it, there are two ways.
First way:
first step: make a copy of the dataset: data a_bck; set a; run;
second step: create the index.
We can tell the program to create the index while making the table, but it still takes two steps.
Second way:
first step: make a copy to another library with PROC COPY and the option INDEX=YES.
second step: change the name.
third step: copy it back with the options INDEX=YES and MOVE.
Both ways are quite long, because each of them uses more than one operation.
So the question is whether there is a convenient way to make a duplicate dataset in the same library.
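For illustration, the first way written out as two explicit steps might look like this. It is a sketch only: keyvar is a hypothetical index variable, and the one-level names are assumed to resolve to the WORK library.

```sas
/* Step 1: plain copy; the index is not carried over. */
data a_bck;
  set a;
run;

/* Step 2: rebuild a simple index on the copy.  */
/* keyvar is a hypothetical index variable.     */
proc datasets lib=work nolist;
  modify a_bck;
  index create keyvar;
quit;
```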
I can't really figure out the deal with the index. How often do you have to do a recovery? Not too often, I hope. In that case, the time spent copying the index each night (?) might be far more than the time you would spend recreating the index as part of a recovery process.
/Linus
Either use a DATA step within the same LIBREF to make a duplicate copy, or use PROC COPY to copy to another LIBREF, use CHANGE to change the name, and then use PROC COPY to copy the member(s) back to the original LIBREF. No question that PROC COPY is faster than a DATA step, in most cases.
Scott Barry
SBBWorks, Inc.
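The PROC COPY route described above (copy out, rename, copy back) could be sketched as follows. The librefs etl and stage and the member name dataset1 are placeholders chosen for illustration; stage is assumed to point to a different directory than etl.

```sas
/* 1. Copy the table, with its index, to a second library. */
proc copy in=etl out=stage memtype=data index=yes;
  select dataset1;
run;

/* 2. Rename the copy in the second library. */
proc datasets lib=stage nolist;
  change dataset1=dataset1_bck;
quit;

/* 3. Move the renamed copy back into the original library. */
proc copy in=stage out=etl memtype=data index=yes move;
  select dataset1_bck;
run;
```

This needs enough free space in both locations for a full copy, which is one of the drawbacks raised earlier in the thread.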
Thank you for your answers, sbb and Linus H.
I think there is a way to copy not the dataset but the physical files, using CALL SYSTEM. SAS indexes are stored in a file with the extension sas7bndx. I'll write later if I succeed.
This should work if you move them to another directory and don't rename them. Caution: this probably won't work if you choose to use SPDE in the future.
/Linus
On Unix you can make a duplicate of a file in the same directory.
And yes, I'm not talking about SPDS libraries.
Yes, it worked.
If you want to make a copy of a dataset in the same library, you can use CALL SYSTEM and copy both the SAS dataset file and the index file.
For instance, like this:
data _null_;
call system("cp /dir1/test1.sas7bdat /dir1/test1_bck.sas7bdat");
call system("cp /dir1/test1.sas7bndx /dir1/test1_bck.sas7bndx");
run;
I think some system option has to be set to allow the CALL SYSTEM routine.
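The option in question is most likely XCMD: when SAS runs with NOXCMD, host commands issued through the X statement or CALL SYSTEM are not executed. XCMD is normally set at SAS startup, so it may have to be enabled in the configuration or on the command line. A quick way to check the current setting:

```sas
/* Show whether host commands are allowed in this session. */
proc options option=xcmd;
run;
```

The log then reports either XCMD or NOXCMD.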
To create a copy of a data set, including its index, in a SAS library, you can use PROC APPEND, like this:
[pre]proc datasets library= mylib nolist ;
delete mycopyDS ;
run ;
quit ;
proc append base= mylib.mycopyDS data= mylib.originalDS ;
run ;[/pre]
It could probably be done entirely in PROC DATASETS.
Hope it helps.
PeterC
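A version that stays entirely inside PROC DATASETS might look like the sketch below, using the APPEND statement in place of PROC APPEND. It reuses the names from the post above (mylib, mycopyDS, originalDS) and has not been verified on a real library; the PROC CONTENTS step is only there to confirm that the index arrived on the copy.

```sas
proc datasets library=mylib nolist;
  delete mycopyDS;              /* remove any previous copy */
  run;
  append base=mylib.mycopyDS data=mylib.originalDS;
  run;
quit;

/* Check the result: PROC CONTENTS lists any indexes on the copy. */
proc contents data=mylib.mycopyDS;
run;
```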