Changing variables' order and format

Contributor
Posts: 38

Hi,

I'm just curious why the third data step ("solution_new") in the example below works. Is listing a variable twice in the FORMAT statement a proper way to change its format, and does it work in every situation?

Thanks!

Data have;
    Format
        Type $6.
        Date date11.;
    Input
        Type :$6.
        Date :date11.;
Datalines;
TypeA1 23-JUL-2013
TypeB2 23-JUL-2013
TypeA1 23-JUL-2013
Run;

Data solution_old;
    * format-statement BEFORE set-statement to get variables in the wanted order ;
    Format
        Date date11.
        Type $6.;
    Set have;
    * format-statement AFTER set-statement to change the format of some variables ;
    Format Type $2.;
    If Type="TypeA1" Then Do;
        Type="A1";
        Output;
    End;
Run;

Data solution_new;
    * this time I'm also changing the variables formats here ;
    Format
        Date date11.
        Type $6.
        Type $2.;
    SET have;
    If Type="TypeA1" Then Do;
        Type="A1";
        Output;
    End;
Run;


Accepted Solutions
Solution
‎08-16-2013 10:42 AM
Super User
Posts: 11,343

Re: Changing variables' order and format

Posted in reply to Georg_UPB

FORMAT controls the DISPLAY of values. It does not change the content of a variable, and it cannot change the length of a variable that has already been defined. Type got its length of 6 in your first data step, where it was first defined with a width of 6 (the $6. in the FORMAT statement and the $6. informat in the INPUT statement). INFORMATs are different from formats: they are used to READ data. If no earlier statement has defined the variable, SAS uses the width of the informat as the default length.

To control the length of the result explicitly, declare the variable with a LENGTH statement before assigning a value to it:

Length newvar $ 4;

Replace the 4 with whatever length you want for the new variable.
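
A minimal sketch of that advice, assuming the HAVE dataset from the question. The dataset name WANT and the variable NEWVAR are placeholders for illustration, and the SUBSTR call assumes every Type value has at least 4 characters:

```sas
Data want;
    * LENGTH comes before SET so NEWVAR is defined with 4 bytes of storage ;
    Length newvar $ 4;
    Set have;
    * keep the last 4 characters of Type ;
    newvar = Substr(Type, Length(Type) - 3);
Run;

* list the stored length and format of every variable in WANT ;
Proc Contents Data=want;
Run;
```

PROC CONTENTS should then list NEWVAR with a length of 4, independent of any format attached to it.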




All Replies
Trusted Advisor
Posts: 1,137

Re: Changing variables' order and format

Posted in reply to Georg_UPB

I don't see any difference between the old and new datasets.

Both approaches produce the same results, and the format assigned to the variable Type is also the same.

Could you please let me know whether there is a specific difference you are referring to?

Thanks,

Jagadish

Super User
Posts: 11,343

Re: Changing variables' order and format

Posted in reply to Georg_UPB

Georg_UPB wrote:

Hi,

I'm just curious why the third data step ("solution_new") in the example below works. Is having a variable twice in the format-statement a proper way to change its format and does it work in any situation?

The last FORMAT specification encountered in a data step sets the format for a variable, so when you list TYPE in the FORMAT statement twice, only the last occurrence is used. It doesn't matter whether the statement comes before or after the SET statement.

FORMAT is also a "universal" statement. In a data step it makes the specified format the default whenever that variable is referenced. Most procedures also accept a FORMAT statement, which changes the displayed format temporarily, for that procedure call only.

An example with a procedure:

Proc print data=solution_new;
    format date mmddyy10. type $1.;
run;
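
To check the "last occurrence wins" behaviour directly, something like the following sketch could be run against the HAVE dataset from the question. The dataset name CHECK is a placeholder for illustration:

```sas
Data check;
    * TYPE is listed twice on purpose, and the second format should be kept ;
    Format
        Type $6.
        Type $2.;
    Set have;
Run;

* shows the format and the stored length of TYPE ;
Proc Contents Data=check;
Run;
```

PROC CONTENTS reports the format and the length as separate attributes, so it shows which format ended up attached to TYPE and that the stored length is a different matter.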

Super User
Posts: 5,504

Re: Changing variables' order and format

Posted in reply to Georg_UPB

One variation that would be a mistake is to use only this FORMAT statement before the SET statement:

format date date11.
       type $2.;

The reason you have to mention TYPE twice is that the first mention sets the length of TYPE as $6 (as well as assigning the format), and the second changes the format but cannot change the length.

Also note that you cannot tell whether the FORMAT statement is working, or whether the SET statement changes the format for TYPE.  Since all your remaining values are only 2 characters long (after applying the IF THEN statements), you can't tell by printing what the actual format is.  Run a PROC CONTENTS to see which format is in effect.
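
A small sketch of the mistaken variation, assuming the HAVE dataset from the question (the dataset name MISTAKE is a placeholder). Because TYPE is first mentioned with $2., it should be created with a length of 2, so the values read from HAVE would be truncated and the log may warn about conflicting lengths:

```sas
Data mistake;
    * first mention of TYPE uses $2. and so its length is fixed at 2 ;
    Format
        Date date11.
        Type $2.;
    Set have;
Run;

* compare the length and format of TYPE with those in HAVE ;
Proc Contents Data=mistake;
Run;

* inspect the stored values ;
Proc Print Data=mistake;
Run;
```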

Good luck.

Contributor
Posts: 38

Re: Changing variables' order and format

Posted in reply to Georg_UPB

Thank you all for your helpful answers!

I think it's best if I first state my problem: I have datasets with several variables that are up to 50 characters long, from which I only need the last 4 characters. First, I shorten them, which is easy (here: Type="TypeA1" => Type="A1"). Second, I also want to change their format from $50. to $4. (here: $6. => $2.), which is not so easy. How do I accomplish that in a proper way (and without getting any warnings)?

My best solution would be the introduction of a new variable:

Data solution(Drop=Type_old);
    Format
        Date date11.
        Type_new $2.;
    Set have(Rename=(Type=Type_old));
    If Substr(Type_old,Length(Type_old)-1)="A1" Then Do;
        Type_new=Substr(Type_old,Length(Type_old)-1);
        Output;
    End;
Run;

Proc Contents
    Data=solution;
Run;

@Jagadishkatam

You're right, they yield the same result, but I don't think that variables are supposed to be mentioned twice in format statements. Also, according to proc contents the format of Type is $2., but its length is still 6 (for both solutions).

@ballardw

Thank you for your insights on the format statement. I'm not quite sure what the procedure does, because the output still says: length=6, format=$2.

@Astounding

Thank you. The procedure says: length=6, format=$2.



Super User
Posts: 7,043

Re: Changing variables' order and format

Posted in reply to Georg_UPB

If you want to define your variables, use a LENGTH or ATTRIB statement. Most of the time you do NOT want formats attached to character variables, because the format width can differ from the stored length and that leads to confusion.

data solution ;
  length date 8 type_new $2 ;
  set have(rename=(type=type_old));
  type_new=substr(type_old,length(type_old)-1);
  if type_new="A1" ;
  format date date11. ;
  drop type_old;
run;
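
The ATTRIB alternative mentioned above might look like the following sketch, which sets length and format in one statement. The dataset name SOLUTION_ATTRIB is a placeholder for illustration, and the rest mirrors the LENGTH version:

```sas
data solution_attrib ;
  * ATTRIB defines length and format together and also fixes variable order ;
  attrib date     length=8 format=date11.
         type_new length=$2 ;
  set have(rename=(type=type_old));
  * keep the last 2 characters of the original value ;
  type_new=substr(type_old,length(type_old)-1);
  if type_new="A1" ;
  drop type_old;
run;
```

Whether to use LENGTH plus FORMAT or a single ATTRIB statement is mostly a matter of taste; ATTRIB keeps all attributes of a variable in one place.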

Contributor
Posts: 38

Re: Changing variables' order and format

Thank you all very much!

From now on, I'll introduce new variables and define their length by using a LENGTH statement before assigning any values.
