dm2018
Calcite | Level 5

Hi,

 

Relatively new SAS user here. I have a number of datasets that may be recreated over time (potentially) using updated code. I want to incorporate some version control using dataset labels. Is there any feature or function that will allow me to automatically add a dataset label showing the filename and path of the project file used to create the dataset? Or will it be easier to create a user defined macro variable to create this label? The preference would be for this to be automated but I can't seem to find a way to do this in the SAS documentation. Wondering if anyone has an answer, or any other advice or guidance please.

 

As I've said, I'm relatively new to SAS so apologies in advance if this is an obvious question. Thanks for your help.

 

Regards,


Dave

 

 

RW9
Diamond | Level 26

There isn't an automatic way.  There is the audit functionality:

http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a001124621.htm

 

However, if this is really needed, I would suggest looking into some sort of software or framework to do this type of versioning for you.  For instance, there are tools like SVN or Git if you're using network drives.  Some SAS packages come with version control embedded in them (our LSAF install, for instance, has this).  You could also speak with your IT team, as they may have something already available.  It's quite hard to say without doing an assessment of needs versus requirements.  I wouldn't rely on something you build yourself; you're just opening yourself up to a world of pain when (not if) it goes wrong.  If the needs are simple, then keeping data in different folders might be a simple, if not great, solution.
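
For reference, here is a minimal sketch of the audit functionality mentioned above, using the AUDIT statement in PROC DATASETS. The table `work.class` is only an illustrative copy of `sashelp.class`. Note that an audit trail records changes made in place (updates, adds, deletes); as far as I know it is not preserved when a dataset is replaced outright, so it may not help if the datasets are rebuilt from scratch each time.

```sas
/* Work on a copy so the shipped sample table is untouched */
data work.class;
  set sashelp.class;
run;

/* Start an audit trail for WORK.CLASS */
proc datasets library=work nolist;
  audit class;
  initiate;
quit;

/* An update in place is recorded in the audit file */
proc sql;
  update work.class
    set height = height + 1
    where name = 'Alfred';
quit;

/* The audit file is read with the TYPE=AUDIT data set option */
proc print data=work.class (type=audit);
run;
```

The audit file carries extra columns such as `_ATDATETIME_`, `_ATUSERID_` and `_ATOPCODE_`, which show when, by whom and how each row was changed.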

dm2018
Calcite | Level 5

Thanks, that's great. Thought it might be the case that an additional tool was needed. In the interim, a simple folder-based solution should work. Didn't know about the AUDIT functionality either. I'll take a look.

 

Thanks again,


Dave

VDD
Ammonite | Level 13

@RW9 has suggested a solution for version control that uses different folders, which has been my company's practice for over 20 years.

Today, however, not all of the newer auditors understand structured directories, and many auditors will not accept users saving data unless a standard version control software package is doing the versioning.

 

Welcome to the new world of "that is just not good enough".

Reeza
Super User

@VDD wrote:

@RW9 has suggested a solution for version control that uses different folders, which has been my company's practice for over 20 years.

Today, however, not all of the newer auditors understand structured directories, and many auditors will not accept users saving data unless a standard version control software package is doing the versioning.

 

Welcome to the new world of "that is just not good enough".


I'd be in the camp that strongly doesn't support that. 

It uses more space than is required, and there's no guarantee that a file hasn't been changed or moved, which is something you can verify in a version control system.

 

Take a look at the concept of Generations and AGE within PROC DATASETS. I don't know if you need a full server installation for that.
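
A rough sketch of what generations look like in practice, assuming a Base SAS library; `work.scores` is a made-up dataset name. The GENMAX= data set option asks SAS to keep historical versions when the dataset is replaced, and GENNUM= reads one of them back.

```sas
/* First build: keep up to 3 versions (the current one plus 2 historical) */
data work.scores (genmax=3);
  set sashelp.class;
run;

/* Rebuild with changed logic: the previous version is kept as a generation.
   GENMAX= is repeated here only to make the intent explicit. */
data work.scores (genmax=3);
  set sashelp.class;
  where age > 12;
run;

/* Current version */
proc print data=work.scores;
run;

/* Previous version (relative generation number -1) */
proc print data=work.scores (gennum=-1);
run;
```

The AGE statement in PROC DATASETS is a separate mechanism: `age scores scores_1 scores_2;` renames `scores` to `scores_1`, `scores_1` to `scores_2`, and deletes the old `scores_2`, so it rolls a fixed set of member names rather than using generation numbers.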

dm2018
Calcite | Level 5

Thanks, I like the concept of generations allowing for retention of historical versions of a dataset. I'll definitely look into this in more detail. A server reinstall isn't possible so that'll ultimately be the decider.

 

I'm specifically interested in linking a dataset to the version-controlled code used to create it. I still can't find an automated method for this, so I'm leaning towards using label= to add a version-controlled filename (for the SAS code being run) to the descriptor portion of the dataset. This could be used alongside generations and other dataset options.
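
A minimal sketch of the label= idea, under one loud assumption: the program is run in batch (for example `sas -sysin myprog.sas`), so the SYSIN system option holds the path of the submitted program. In an interactive session SYSIN is empty and the value would have to be set by hand. The macro variable `src_program` and the dataset `work.class_v` are illustrative names only.

```sas
/* Assumption: batch mode, so SYSIN holds the path of this program.
   In an interactive session this resolves to blank and must be set manually. */
%let src_program = %sysfunc(getoption(sysin));

/* Write the source program and run date into the dataset label
   (dataset labels are limited to 256 characters) */
data work.class_v (label="Created by &src_program on &sysdate9");
  set sashelp.class;
run;

/* The label can also be added or changed later without rebuilding the data */
proc datasets library=work nolist;
  modify class_v (label="Created by &src_program on &sysdate9");
quit;

/* The label appears in the descriptor portion of the dataset */
proc contents data=work.class_v;
run;
```

If the code is run from SAS Enterprise Guide, I believe the client defines macro variables such as `_CLIENTPROJECTPATH` and `_SASPROGRAMFILE` that could stand in for the SYSIN lookup; their values may arrive wrapped in quotes, so check them with `%put` before using them in a label.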

 

Thanks again,


Dave

RW9
Diamond | Level 26

Just to highlight here that my first choice would not be separate folders; my words seem to have been taken out of context.

 

Secondly, to what purpose?  An output is only a combination of the input and the processor: in your case, the source data and the source code that generate the output.  Therefore the key part would be versioning those.

 

This really isn't a question we can accurately answer.  This is the job of an analyst: to come in and document what is already present, what is available, what software is being used, what storage facilities there are, what backup systems exist, etc., and then to suggest a path by identifying key inputs, outputs, processing sections, moves/copies/branches, etc.  For instance, if you want to link code to an output, you could create a new folder and put it all in there, or you could use SVN to create a branch with the code and output on it and then delete the branch; you can always extract that branch again from the system.  Maybe a tagging system on the output file would work.  If the output is not a dataset (we don't really know what the output is or where it's going), then maybe the file could contain information about the run; for instance, an XML file can contain lots of self-documenting information.  The possibilities are endless.
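
As one illustration of the "file containing information about the run" idea, here is a small sketch that writes a sidecar XML file next to an output. Everything named here is hypothetical: the file name `scores_runinfo.xml`, the element names, and the use of the WORK directory as the target. It also assumes batch mode so that the SYSIN option holds the program path, and it does not escape special characters such as `&` or `<` in the values.

```sas
/* Hypothetical sidecar file, written to the WORK directory for illustration */
filename runinfo "%sysfunc(pathname(work))/scores_runinfo.xml";

data _null_;
  file runinfo;
  put '<run>';
  /* Assumption: batch mode, so SYSIN holds the path of the submitted program */
  put "  <program>%sysfunc(getoption(sysin))</program>";
  put "  <user>&sysuserid</user>";
  put "  <started>&sysdate9 &systime</started>";
  put "  <sasversion>&sysvlong</sasversion>";
  put '</run>';
run;

filename runinfo clear;
```

`&sysdate9` and `&systime` reflect when the SAS session started rather than when this step ran, which is usually close enough for a batch job.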

dm2018
Calcite | Level 5

Thanks RW9. Appreciate there are numerous possibilities. Most likely looking for a solution without SVN or any third-party tools. Got some great ideas and lots to look into.

 

Thanks everyone for all your input!

 

Cheers,

Dave
