BookmarkSubscribeRSS Feed
MisterJenn
Fluorite | Level 6

I am trying to turn the variable 'number' into a character variable then rename it to 'subj_id'. However, it keeps telling me that that subj_id is both character  and numeric. I don’t understand how that is possible if I changed it the variable to character. 

 

Ultimately this is what I am trying to do: interleave the data sets sex_f and sex_m from library hw6 such that the data set pbc is sorted by subject ID (number). Then we wish to create the new variable logalbu, which should be log transformed albumin, and an indicator variable for male sex.

 

libname hw6 '\\apporto.com\dfs\GWU\Users\kennedyhinnant_gwu\Desktop\Assignment 6';
data hw6.sex_m_temp;
set hw6.sex_m (rename=(number=subj_id));
number= input(number,8.);
run; 
proc sort data=hw6.sex_f; by subj_id; 
proc sort data=hw6.sex_m_temp; by subj_id;
data pbc;
set hw6.sex_f hw6.sex_m_temp; 
by subj_id;
logalb=log(alb);
if male = 1 then sex = 1;
if female = 1 then sex=0;
run;
40   libname hw6 '\\apporto.com\dfs\GWU\Users\kennedyhinnant_gwu\Desktop\Assignment 6';
NOTE: Libref HW6 was successfully assigned as follows:
      Engine:        V9
      Physical Name: \\apporto.com\dfs\GWU\Users\kennedyhinnant_gwu\Desktop\Assignment 6
41   data hw6.sex_m_temp;
42   set hw6.sex_m (rename=(number=subj_id));
43   number= input(number,8.);
44   run;

NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
      43:15
NOTE: There were 276 observations read from the data set HW6.SEX_M.
NOTE: The data set HW6.SEX_M_TEMP has 276 observations and 26 variables.
NOTE: DATA statement used (Total process time):
      real time           0.21 seconds
      cpu time            0.03 seconds


45
46
47
48   proc sort data=hw6.sex_f; by subj_id;

NOTE: Input data set is already sorted, no sorting done.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.03 seconds
      cpu time            0.00 seconds


49   proc sort data=hw6.sex_m_temp; by subj_id;

NOTE: There were 276 observations read from the data set HW6.SEX_M_TEMP.
NOTE: The data set HW6.SEX_M_TEMP has 276 observations and 26 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.18 seconds
      cpu time            0.00 seconds


50   data pbc;
51   set hw6.sex_f hw6.sex_m_temp;
ERROR: Variable subj_id has been defined as both character and numeric.
52   by subj_id;
53   logalb=log(alb);
54   if male = 1 then sex = 1;
55   if female = 1 then sex=0;
56   run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.PBC may be incomplete.  When this step was stopped there were 0
         observations and 31 variables.
WARNING: Data set WORK.PBC was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.00 seconds
2 REPLIES 2
Reeza
Super User
data hw6.sex_m_temp;
set hw6.sex_m;
subj_id= put(number, 8. -l);
drop number;
run; 

You cannot change the type of a variable, with the same name. 

The code above should work, if your goals are:

I am trying to turn the variable 'number' into a character variable then rename it to 'subj_id'. However, it keeps telling me that that subj_id is both character  and numeric. I don’t understand how that is possible if I changed it the variable to character. 

But, SET with a BY is not what you require I suspect. 

 

Do you need to add rows (SET) or columns (MERGE) to your data?

 

Spoiler

@MisterJenn wrote:

I am trying to turn the variable 'number' into a character variable then rename it to 'subj_id'. However, it keeps telling me that that subj_id is both character  and numeric. I don’t understand how that is possible if I changed it the variable to character. 

 

Ultimately this is what I am trying to do: interleave the data sets sex_f and sex_m from library hw6 such that the data set pbc is sorted by subject ID (number). Then we wish to create the new variable logalbu, which should be log transformed albumin, and an indicator variable for male sex.

 

libname hw6 '\\apporto.com\dfs\GWU\Users\kennedyhinnant_gwu\Desktop\Assignment 6';
data hw6.sex_m_temp;
set hw6.sex_m (rename=(number=subj_id));
number= input(number,8.);
run; 
proc sort data=hw6.sex_f; by subj_id; 
proc sort data=hw6.sex_m_temp; by subj_id;
data pbc;
set hw6.sex_f hw6.sex_m_temp; 
by subj_id;
logalb=log(alb);
if male = 1 then sex = 1;
if female = 1 then sex=0;
run;
40   libname hw6 '\\apporto.com\dfs\GWU\Users\kennedyhinnant_gwu\Desktop\Assignment 6';
NOTE: Libref HW6 was successfully assigned as follows:
      Engine:        V9
      Physical Name: \\apporto.com\dfs\GWU\Users\kennedyhinnant_gwu\Desktop\Assignment 6
41   data hw6.sex_m_temp;
42   set hw6.sex_m (rename=(number=subj_id));
43   number= input(number,8.);
44   run;

NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
      43:15
NOTE: There were 276 observations read from the data set HW6.SEX_M.
NOTE: The data set HW6.SEX_M_TEMP has 276 observations and 26 variables.
NOTE: DATA statement used (Total process time):
      real time           0.21 seconds
      cpu time            0.03 seconds


45
46
47
48   proc sort data=hw6.sex_f; by subj_id;

NOTE: Input data set is already sorted, no sorting done.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.03 seconds
      cpu time            0.00 seconds


49   proc sort data=hw6.sex_m_temp; by subj_id;

NOTE: There were 276 observations read from the data set HW6.SEX_M_TEMP.
NOTE: The data set HW6.SEX_M_TEMP has 276 observations and 26 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.18 seconds
      cpu time            0.00 seconds


50   data pbc;
51   set hw6.sex_f hw6.sex_m_temp;
ERROR: Variable subj_id has been defined as both character and numeric.
52   by subj_id;
53   logalb=log(alb);
54   if male = 1 then sex = 1;
55   if female = 1 then sex=0;
56   run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.PBC may be incomplete.  When this step was stopped there were 0
         observations and 31 variables.
WARNING: Data set WORK.PBC was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.00 seconds

 

Tom
Super User Tom
Super User

You renamed the variable NUMBER to SUBJ_ID, but then did not use the new name.

Instead you made a new variable named NUMBER that SAS decided should be numeric since it is the result of the INPUT() function call using a numeric informat.  You then tried to populate this empty variable by transforming the empty value using the INPUT() function.  

 

41   data hw6.sex_m_temp;
42   set hw6.sex_m (rename=(number=subj_id));
43   number= input(number,8.);
44   run;

NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).
      43:15

So you inefficiently just did this:

 

 

data hw6.sex_m_temp;
  set hw6.sex_m (rename=(number=subj_id));
  number=.;
run;

If you do not already have a variable named SUBJ_ID then there is no need to rename anything.

 

data hw6.sex_m_temp;
  set hw6.sex_m ;
  number=input(subj_id,32.);
run;

NOTE: The INPUT() function does not care if you use a width on the INFORMAT that is larger than the length of the string being read.  So just use 32. (the maximum width of the numeric informat) as the informat in the INPUT() function call.

 

Now if you want the use the name SUBJ_ID for the new numeric variable then you will need to do some renaming.

data hw6.sex_m_temp;
  set hw6.sex_m ;
  number=input(subj_id,32.);
  rename subj_id=subj_id_char number=subj_id;
run;

 

If the goal is to convert a NUMBER into a character string then you need to use PUT() instead of INPUT().  But the same number can be represented as a character string any many different ways (that is the whole point of the BEST format!).  So you need to be clear about the type of strings you have in the other dataset.  What is the range of values? Are they all integers?  What are the smallest and largest possible value?  How many characters will it take to represent the largest value?  That will determine the minimum LENGTH of the new character variable.  And if the number of characters needed to represent the smallest is less then how do you want to fill the other characters?

 

In general the best way is to use leading zeros so that all of the characters strings are the same length.

data hw6.sex_m_temp;
  set hw6.sex_m ;
  string=put(subj_id,Z8.);
  rename subj_id=subj_id_num  string=subj_id;
run;

If instead you use the 8. (or whatever width) format then you probably want to left align the value so that numbers like 10 and 100 don't have a lot of leading spaces.  You could either use the LEFT() function or add the -L format modifier to the end of the format used in the PUT() function call.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 1240 views
  • 0 likes
  • 3 in conversation