BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Hello_there
Lapis Lazuli | Level 10
data have1;
infile datalines dsd dlm=",";
*length value $200;
	input layer $ sort value $ 20.;
	value_n=input(substr(value, 1, 3), best.);
datalines;
001, 1, 20 (13.00)
002, 5, 35 (12.08)
003, 1, 267 (21.00)
004, 2, 28 (72.00)
005, 2, 86 (2.23)
006, 3, 24 (13.69)
007, 4, 21 (29.14)
008, 5, 7 (9.02)
009, 5, 56 (17.00)
010, 2, 21 (56.21)
;
run;

Hi,

Value is a character variable that is in the form xx (xx.xx). I'm trying to turn the xx into a new numerical variable called value_n so that i can rank it.

I tried using the substr function, but as you can see, there is one 1 digit number, and 1 3 digit number that causes the problem when i extend the parameters to the first 3. When this happens the row w/ a 7 gets a missing value.

 

Is there another way to do this?

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

Don't use substr, because it retrieves a fixed count of characters, which creates the problem you have when you really need to accommodate a variable number of characters.  Use the SCAN function which allows you to retrieve the n'th "word" from a string - in your case you want the 1st word, where words are delimited by blanks.

 

I.e. change

	value_n=input(substr(value, 1, 3), best.);

to

    value_n=input(scan(value,1,' '),best.);

Change the second argument of SCAN to 2 to get the second "word", 3 for the 3rd, etc.

 

But if you always want the last word and you have a variable number of words in the character variable, use a negative offset, as in:

    value_last=input(scan(value,-1,' '),best.);

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

2 REPLIES 2
mkeintz
PROC Star

Don't use substr, because it retrieves a fixed count of characters, which creates the problem you have when you really need to accommodate a variable number of characters.  Use the SCAN function which allows you to retrieve the n'th "word" from a string - in your case you want the 1st word, where words are delimited by blanks.

 

I.e. change

	value_n=input(substr(value, 1, 3), best.);

to

    value_n=input(scan(value,1,' '),best.);

Change the second argument of SCAN to 2 to get the second "word", 3 for the 3rd, etc.

 

But if you always want the last word and you have a variable number of words in the character variable, use a negative offset, as in:

    value_last=input(scan(value,-1,' '),best.);

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Hello_there
Lapis Lazuli | Level 10
Thanks, worked like a charm!

Catch up on SAS Innovate 2026

Nearly 200 sessions are now available on demand in the Innovate Hub.

Watch Now →
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 3358 views
  • 1 like
  • 2 in conversation