shimengj
Fluorite | Level 6

The Taster variable is made up of a number (XX) and a gender (M/F) with no blank between them, for example 32F and 31M. Here is my code. It runs without error, but the new variables, subj_gender and subj_age, are empty. My aim is to split the Taster variable into separate gender and age variables.


data wineA2;
 set wineA;
  subj_char = Taster;
  subj_gender = subject;
  subj_age = subject+0;
  drop Taster;
  run;
proc print data=wineA2;
run;

1 ACCEPTED SOLUTION

ballardw
Super User

Actual values in the form of data step code are the best way to provide example data.

The following creates a small data set using a data step to create the two values you mention and then separate them.

data have;
  input taster $;
datalines;
32F
31M
;

data want;
   set have;
   sub_age = input(compress(taster,,'DK'),f5.);
   sub_gender=compress(taster,,'D');
run;

The COMPRESS function removes (or, with the K modifier, keeps) the characters you specify. In this case the second argument, the explicit list of characters, is left empty because the modifier 'D' in the third argument stands for all digit characters. In the first assignment, K coupled with D keeps only the digits, and the INPUT function then converts that text to a numeric value. The second assignment uses 'D' alone to drop all digit characters, leaving the gender.

 

Note: although it sometimes works, using +0 to make a numeric value relies on implicit character-to-numeric conversion. In this case the conversion fails because you used the whole value, and the F or M cannot be turned into a number under the simple text-to-number conversion rules.
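To make the difference concrete, here is a minimal sketch that applies both approaches to a single hard-coded value. The variable names digits, letters, age, and implied are made up for illustration and are not part of the original data.

```sas
data _null_;
  taster  = '32F';
  digits  = compress(taster, , 'kd');  /* K + D: keep only the digits, giving '32' */
  letters = compress(taster, , 'd');   /* D alone: remove the digits, giving 'F' */
  age     = input(digits, 32.);        /* explicit character-to-numeric conversion */
  implied = taster + 0;                /* implicit conversion of the whole value '32F':
                                          expected to be missing, with an
                                          "Invalid numeric data" note in the log */
  put digits= letters= age= implied=;
run;
```

Running this writes the four values to the log, so the explicit and implicit conversions can be compared side by side.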


4 REPLIES
shimengj
Fluorite | Level 6

I noticed that "subject" should be "Taster". I ran the corrected code: subj_age is still empty, and subj_gender now has data. However, subj_gender still contains both the character and the numeric parts of the value.

mkeintz
PROC Star

Are you saying that "32F" represents a 32-year-old female?

 

If so, then (untested due to absence of a working data set):

 

data wineA2;
  set wineA;
  length subj_gender $1;
  subj_gender=char(taster,length(taster));
  subj_age=input(translate(taster,'',subj_gender),best3.);
run;

The TRANSLATE function converts any character equal to SUBJ_GENDER to a blank.   The INPUT function uses the BEST3. informat to allow for 100-year-old tasters.  Otherwise BEST2. would do.
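As a rough illustration of the intermediate values, the sketch below runs the same steps on one hard-coded value instead of the wineA data set. The variable blanked is a hypothetical helper added only to show what TRANSLATE returns.

```sas
data _null_;
  taster = '32F';
  length subj_gender $1;
  subj_gender = char(taster, length(taster));        /* last non-blank character: 'F' */
  blanked     = translate(taster, ' ', subj_gender); /* the 'F' becomes a blank: '32 ' */
  subj_age    = input(blanked, best3.);              /* the remaining digits are read as a number */
  put subj_gender= blanked= subj_age=;
run;
```

Note that this approach assumes the gender code is always the last character and never appears elsewhere in the value.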

Tom
Super User

The INPUT() function does not care if you use a width on the INFORMAT that is longer than the length of the string you are reading.  The maximum width for the normal numeric informat is 32.  So just use an informat specification of:  32.

 

Note: BEST is the name of a FORMAT.  There is no BEST informat. If you do use that name as an informat then SAS will assume you meant to use the normal numeric informat.
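A small sketch of both points, using a hard-coded two-character string. The variable names are assumptions chosen for illustration.

```sas
data _null_;
  digits    = '32';
  age_wide  = input(digits, 32.);     /* informat width (32) is longer than the string (2) */
  age_exact = input(digits, 2.);      /* width matches the string length */
  age_best  = input(digits, best32.); /* BEST used as an informat name; per the note above,
                                         SAS treats it as the normal numeric informat */
  put age_wide= age_exact= age_best=;
run;
```

All three assignments should produce the same numeric value, which is why a fixed width of 32 is a convenient default when the string length varies.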
