BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

Hi,

 

The goal is to change the question answer numbers such that 1 - 5, 2 = 4, 4=2, and 5 =1.

When done with character variables, this works fine. But the same code does not work with numerical variables. With the numerical variables, the output of the array file looks the same as the original file.  Why doesn't this work the same?

---------------------

Here is the code with (Ques1-Ques5) being Character Variables:

Libname Learn'/folders/myfolders/Learn' ;

Data Survey1 ;
    Set Learn.Survey1 ;
    Array Ques(*) $ Q1-Q5 ;
    do i = 1 to dim(Ques) ;
        Ques{i} = translate(Ques{i},'54321','12345') ;
    end ;
    drop i;
run ;

-----------------

Here is a similar code, in which Ques1-Ques5 are numerical variables.

 

Libname Learn'/folders/myfolders/Learn' ;

Data Survey2 ;
    Set Learn.Survey2 ;
    Array QuesNum(*)  $ Q1-Q5 ;
    do i = 1 to dim(QuesNum) ;
        QuesNum{i} = translate(QuesNum{i},'54321','12345') ;
    end ;
    drop i;
run ;

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You cannot insert other steps (like a PROC FORMAT step) into the middle of a DATA step.

 

PROC FORMAT is only used to create user defined formats. If you merely wish to attach a format to a variable just use the FORMAT statement. 

 

 

The PUT() statement needs a format specification. The format type needs match the value/variable type.  You can tell that $8. is a character format since it starts with a $. So you can only apply that format to character values/variables. If your variable NumQues1 is a number then you need to use a numeric format and not a character format.  If your values are really just single digits then you can use the format 1. which can also be specified as the F1. format.

 

It kind of looks like you are reading in numeric variables, converting them to character strings, translating the character strings, and then converting them back to numeric values.

 

So here is code you could try.  The array OLD will have the names of the variables read from the dataset and the array NEW will have the names of the new variables that are to be created.  The STR array will have the one character strings used to convert.  It is not really needed since you could just call the PUT() function in the next line.

 

The PUT() function will convert the numbers to character strings. It uses the F1. numeric format.

The INPUT() function will convert the translated strings to numers. It uses the F1. numeric informat.

 

data Survey2 ;
  set Learn.Survey2 (Rename = 
(Ques1= NumQues1
 Ques2= NumQues2
 Ques3= NumQues3
 Ques4= NumQues4
 Ques5= NumQues5))
  ;
  array old (5) 8 NumQues1-NumQues5 ;
  array str (5) $1 Ques1-Ques5 ;
  array new (5) 8 QuesNum1-QuesNum5 ;
  do i = 1 to dim(old);
   str(i) = put(old(i),F1.) 
   new(i) = input( translate( str(i), '54321', '12345') , F1. ) ;
  end ;
  drop =  i NumQues1-NumQues5;
run ;

 

View solution in original post

10 REPLIES 10
Kurt_Bremser
Super User

Translate() is a character function, and for it to work correctly with numerical variables, you have to make sure it works on correctly formatted strings. Do not rely on SAS's automatic conversion mechanics. I suggest you use put() and strip() for num to char, and input() for conversion back to num().

 

 

ManitobaMoose
Quartz | Level 8

Great. I have tried that but am having problems, getting errors with converting numerical to characters using the put function.  Could you suggest how to do that?

 

I tried the following below, but it will not let me simply format to $8. It wants a specific format, such as Dollar10. etc, which is not appropriate here. I am still learning SAS, and am relatively new, so bare with me a bit. Thanks.

 

Data Survey2 ;
    Set Learn.Survey2 (Rename = (Ques1= NumQues1
                                Ques2= NumQues2
                                Ques3= NumQues3
                                Ques4= NumQues4
                                Ques5= NumQues5)) ;
    Proc Format ;
        Value QuesFmt $ 8. ;  /*THIS GENERATES AN ERROR*/
    run ;
    Ques1 =  put(NumQues1, QuesFmt) ;
    Ques2 =  put(NumQues2, QuesFmt) ;
    Ques3 =  put(NumQues3, QuesFmt) ;
    Ques4 =  put(NumQues4, QuesFmt) ;
    Ques5 =  put(NumQues5, QuesFmt) ;
    Array QuesNum(5)  $ (Ques1-Ques5) ;
    do i = 1 to 5(QuesNum) ;
        QuesNum{i} = translate(QuesNum{i},'54321','12345') ;
    end ;
    drop =  i NumQues1 NumQues2 NumQues3 NumQues4 NumQues5;
run ;

Tom
Super User Tom
Super User

You cannot insert other steps (like a PROC FORMAT step) into the middle of a DATA step.

 

PROC FORMAT is only used to create user defined formats. If you merely wish to attach a format to a variable just use the FORMAT statement. 

 

 

The PUT() statement needs a format specification. The format type needs match the value/variable type.  You can tell that $8. is a character format since it starts with a $. So you can only apply that format to character values/variables. If your variable NumQues1 is a number then you need to use a numeric format and not a character format.  If your values are really just single digits then you can use the format 1. which can also be specified as the F1. format.

 

It kind of looks like you are reading in numeric variables, converting them to character strings, translating the character strings, and then converting them back to numeric values.

 

So here is code you could try.  The array OLD will have the names of the variables read from the dataset and the array NEW will have the names of the new variables that are to be created.  The STR array will have the one character strings used to convert.  It is not really needed since you could just call the PUT() function in the next line.

 

The PUT() function will convert the numbers to character strings. It uses the F1. numeric format.

The INPUT() function will convert the translated strings to numers. It uses the F1. numeric informat.

 

data Survey2 ;
  set Learn.Survey2 (Rename = 
(Ques1= NumQues1
 Ques2= NumQues2
 Ques3= NumQues3
 Ques4= NumQues4
 Ques5= NumQues5))
  ;
  array old (5) 8 NumQues1-NumQues5 ;
  array str (5) $1 Ques1-Ques5 ;
  array new (5) 8 QuesNum1-QuesNum5 ;
  do i = 1 to dim(old);
   str(i) = put(old(i),F1.) 
   new(i) = input( translate( str(i), '54321', '12345') , F1. ) ;
  end ;
  drop =  i NumQues1-NumQues5;
run ;

 

ManitobaMoose
Quartz | Level 8

Thanks for everypne's help!

 

I finally settled on the code below, which works.

-------------------------

Libname Learn'/folders/myfolders/Learn' ;

Data Survey2 ;
    Set Learn.Survey2 (Rename = (Ques1= NumQues1
                                Ques2= NumQues2
                                Ques3= NumQues3
                                Ques4= NumQues4
                                Ques5= NumQues5)) ;
    Ques1 =  put(NumQues1, $8.) ;
    Ques2 =  put(NumQues2, $8.) ;
    Ques3 =  put(NumQues3, $8.) ;
    Ques4 =  put(NumQues4, $8.) ;
    Ques5 =  put(NumQues5, $8.) ;
    Array QuesNum{5} Ques1-Ques5 ; /* DO NOT PUT FORMAT OF QUES1=QUES5 AFTER ARRAY STATEMENT*/
    do i = 1 to 5 ;
        QuesNum{i} = translate(QuesNum{i},'54321','12345') ;
    end ;
    FinalNumQues1 = input(Ques1, 8.) ;
    FinalNumQues2 = input(Ques2, 8.) ;
    FinalNumQues3 = input(Ques3, 8.) ;
    FinalNumQues4 = input(Ques4, 8.) ;
    FinalNumQues5 = input(Ques5, 8.) ;
    
    drop i NumQues1 NumQues2 NumQues3 NumQues4 NumQues5;
run ;

Kurt_Bremser
Super User

You have not followed my advice to use the functions I named. Just expand your use of translate():

QuesNum{i} = input(translate(strip(put(QuesNum{i},best.)),'54321','12345'),best.);

The translate() function now gets a string without blanks (the put() will create leading blanks), and the result will be converted back to numeric.

ManitobaMoose
Quartz | Level 8

Thanks. That is useful. I will try that.

Astounding
PROC Star

It's complicated, using a character function on numeric variables.  Just take the simple route to make this an easy problem:

 

data want;

set have;

array q {5};

do i=1 to 5;

   select (q{i});

      when (1) q{i}=5;

      when (2) q{i}=4;

      when (4) q{i}=2;

      when (5) q{i}=1;

      otherwise;

   end;

end;

drop i;

run;

Tom
Super User Tom
Super User

Your TRANSLATE() trick works well for single character codes.

For numeric codes you can just use subtraction. That will work even for codes with more than 10 values.

data have ;
  input q1-q5 ;
cards;
1 2 3 4 5
3 4 3 2 1
. 4 3 2 1
;

proc print data=have;
run;

data want ;
  set have ;
  array q (5) ;
  do i=1 to dim(q);
    q(i)=6-q(i);
  end;
  drop i;
run;
proc print data=want;
run;
Obs    q1    q2    q3    q4    q5

 1      1     2     3     4     5
 2      3     4     3     2     1
 3      .     4     3     2     1

PGStats
Opal | Level 21

So many ways to do the same thing. I would tend to choose something flexible as :

 

data have ;
  input q1-q5 ;
cards;
1 2 3 4 5
3 4 3 2 1
. 4 3 2 1
;

data want;
array nq{0:5} _temporary_ (. 5 4 3 2 1);
array q q1-q5;
set have;
do i = 1 to dim(q);
    q{i} = nq{coalesce(q{i},0)};
    end;
drop i;
run;

proc print noobs; run;
PG
ballardw
Super User

Personally if I need to recode variables I prefer to recode question variables into analysis variables as new variables not replacing the old. Just in case there is an error or unexpected occurrence because of the values recorded in the original variable I can then look at an entire record and see where the complication occurred.

 

For most things I would start with formats if the variable is only to be used as a categorical analysis variable and only worry about recoding to numeric if I need to treat the variable as interval or ratio level. Formats are a quick way to change grouping without adding variables for categorical analysis. Frequently you may want to collapse some questions from 5 to 4,3 or even two categories for based on sample size and analysis. Running a procedure with a different format for the categories is much more efficient in terms of time, code and data maintenance than adding separate sets of variables to the analysis with different groupings.

 

Another option is reading the data with a custom informat to begin with if it was determine the original question really should have been coded as you indicate.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 5436 views
  • 3 likes
  • 6 in conversation