
You can currently use the dataset option RENAME= to change the NAME used for a variable.

It would increase flexibility if you could also change other attributes of the variables, such as the FORMAT, INFORMAT or LABEL, using a similar syntax.  To avoid conflict with the existing LABEL= option for setting the member label, and to follow a naming convention similar to the RENAME= option, we might call these options REFORMAT=, REINFORMAT= and RELABEL=.  It should be possible to use them on both input and output dataset references.

 

Example:

data want;
  set sashelp.class (reformat=(age=5.2));
run;

Here is the problem that made me think of this.  I was able to attach formats to variables being created with PROC TRANSPOSE by adding a FORMAT statement, but it generated WARNINGs in the log, because the variables do not exist in the input dataset.

88    %let fmtlist=x xfmt. z zfmt.;
88    proc transpose data=have out=want(drop=_name_) ;
89      by id formset ;
90      id var_name;
91      idlabel var_label;
92      var var_value;
93      format &fmtlist;
WARNING: Variable X not found in data set WORK.HAVE.
WARNING: Variable Z not found in data set WORK.HAVE.
94    run;

NOTE: There were 12 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 4 observations and 5 variables.
7 Comments
snoopy369
Barite | Level 11

Might simply add ATTRIB to the data set options available?

 

data want;
  set have(attrib=(x=(format=6.2) y=(format=$12. length=$12 )));
run;

PaigeMiller
Diamond | Level 26

PROC DATASETS allows this right now, without having to read the entire data set. It only changes the metadata. So I'm not seeing the benefit of this particular SASware ballot request.
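
For readers unfamiliar with that approach, a minimal sketch of it follows, applied to the WORK.WANT dataset from the original post. It assumes the user-defined formats XFMT. and ZFMT. already exist and that PROC TRANSPOSE has already created the variables X and Z.

```sas
/* Attach formats to WORK.WANT in place. Only the descriptor (metadata) */
/* portion of the dataset is updated; the observations are not re-read. */
/* XFMT. and ZFMT. are the formats named in the original post and are   */
/* assumed to be defined already.                                       */
proc datasets library=work nolist;
  modify want;
    format x xfmt. z zfmt.;
  run;
quit;
```

Because the FORMAT statement runs against WORK.WANT after the transpose, the variables exist by then and the "not found" warnings from the original log should not appear.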

snoopy369
Barite | Level 11

Two things.

 

First, I think the idea is to simplify it - PROC DATASETS is a bunch of code to do one small thing that's preferable to do inline.  One of the big downsides to SAS is that often it takes a lot of code to do something a line would do fine in Python or most other languages, for no particular reason other than the language didn't have the simpler method added to it.  When SAS has things available inline, it's amazing what you can do in a small amount of code.

 

Second, PROC DATASETS would modify the original dataset.  Just like the RENAME option in the dataset options, it is preferable not to modify the original dataset when it can be avoided when you're just making a change on input.
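
As an illustration of that point, the existing RENAME= option already behaves this way on input. The sketch below uses SASHELP.CLASS from the original post; the new name AGE_YEARS is made up for the example.

```sas
/* RENAME= on the input reference changes the name only as this step */
/* sees it. SASHELP.CLASS itself is not modified.                    */
data want;
  set sashelp.class(rename=(age=age_years));
run;
```

After this step WORK.WANT contains AGE_YEARS, while SASHELP.CLASS still contains AGE. The proposed REFORMAT=, REINFORMAT= and RELABEL= options would presumably follow the same pattern for the other attributes.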

PaigeMiller
Diamond | Level 26

I'm trading off the need to read every single row of data and then output it (which is what a DATA step does) versus a method that maybe is 2 or 3 more lines of code that operates quickly (which is PROC DATASETS). Since I regularly work with data sets that have millions of observations, using a DATA step to change a format or change a variable name is a very slow and inefficient method. Even if your data sets are maybe 100 records, I'd prefer to see the two different purposes (DATA step and PROC DATASETS) separate and distinct, simply as a way to encourage good practice.

PaigeMiller
Diamond | Level 26

Adding ... if SAS were to add this feature, I wouldn't lose the ability to use PROC DATASETS, so from that point of view, it really makes little difference to me personally. However, I do spend a fair amount of time in my workplace, and also here in the SAS Communities, encouraging what I think are good practices and discouraging what I think are poor practices. So from that point of view, I am opposed to using a DATA step simply to rename or re-format variables. Also, I think SAS has more important things to spend its time on; that's my opinion.

snoopy369
Barite | Level 11

@PaigeMiller I think you misunderstand.  Tom and I use data steps to show the functionality, but that’s not the actual usage - Tom shows the actual usage effectively in the PROC TRANSPOSE above. He’s suggesting being able to use (REFORMAT=) on the OUT dataset there. No extra work or extra pass over the data - exactly the opposite.

DmytroYermak
Lapis Lazuli | Level 10

I would like to support the suggestion, as I faced a situation where I needed to understand what was generated after proc mixed: https://communities.sas.com/t5/Statistical-Procedures/Proc-mixed-formats-of-the-output-dataset/m-p/6...