Newbie needs help: How to retrieve the first few digits from a numeric variable in numeric format?
For example, I have a numeric variable containing the number 123456789. I want to retrieve the first 4 digits, 1234, into a new numeric variable. Of course I can convert the numeric variable to character, use substr, and then convert the result back to numeric.
I wonder if there is an easier, more straightforward method to do it. Thanks.
Editor's note: Thanks to RW9 and all the others who provided different ways to retrieve the first N characters from a value. RW9 is correct that we must convert to character to achieve this goal. For example:
data one;
val=1234567;
val=substr(val,1,4);
run;
This will keep the original variable as numeric but an implied conversion from numeric to character must take place.
Nope, numbers are not strings, so you cannot apply string functions to them without converting to character (even if the conversion is implicit). You can use maths to get the same effect, but the underlying storage of a string (which is an array of characters) is very different from that of a number.
If it is that simple, then use basic maths: divide by 10 raised to the number of digits you want to remove, then int the result:
data want;
a=123456789;
b=int(a/100000);
run;
Thanks for your help. But my point actually is that for character variables it is very simple to retrieve the first 4 characters using substr. What I am wondering is whether there is a simple method that works in a similar way for numeric variables, without having to convert numeric to char first, then use substr, then convert back to numeric again.
Thanks a lot. Makes sense...
I don't think this is the correct answer. That will not extract the first 4 digits; rather, it will give you a blank value.
Here is one way I can think of:
data one;
val=1234567;
val1=substr(left(val),1,4);
run;
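To see why the plain substr call comes back blank, here is a minimal sketch. It assumes the implicit numeric-to-character conversion uses the BEST12. format, which right-aligns the digits. The variable names as_char, raw, aligned and raw_len are made up for illustration.

```sas
data _null_;
   val = 1234567;
   /* Explicit version of the implicit conversion: 5 leading blanks + 7 digits */
   as_char = put(val, best12.);
   /* The first 4 characters of the right-aligned value are all blanks */
   raw = substr(as_char, 1, 4);
   /* Left-aligning first puts the digits at position 1 */
   aligned = substr(left(as_char), 1, 4);
   /* LENGTHN returns 0 for an all-blank string */
   raw_len = lengthn(raw);
   put raw_len= aligned=;
run;
```

With this input the log should show raw_len=0 and aligned=1234, which is why left() (or strip()) is needed before substr.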
Assuming they are all positive values:
data _null_;
x=123456789;
y=int( divide(x,10**(int(log10(x))-3)) );
put x= y= ;
run;
Xia Keshan
Newvar = input(substr(left(put(var,best12.)),1,4),best12.);
DATA HAVE;
INPUT NUM;
DATALINES;
1234567
2345
456676
234335
;
RUN;
DATA WANT;
SET HAVE;
FIN=SUBSTRN(NUM,1,4);
RUN;
Still an implicit conversion from numeric to character done in that function.
SAS(R) 9.2 Language Reference: Dictionary, Fourth Edition
Got it Thank you.
Just an edit to what I posted,
DATA HAVE;
INPUT NUM;
DATALINES;
1234567
2345
456676
234335
;
RUN;
DATA WANT;
SET HAVE;
FIN=SUBSTRN(NUM,1,4)*1;
RUN;
PROC CONTENTS DATA=WANT;
RUN;
Wow!!
Actually two implicit conversions there. The substrn internally converts numbers to character:
---
string
specifies a character or numeric constant, variable, or expression.
If string is numeric, then it is converted to a character value that uses the BEST32. format. Leading and trailing blanks are removed, and no message is sent to the SAS log.
---
Then, by invoking the * mathematical operator, it attempts to convert the characters back to a number and multiply the result. It may work, but you will get issues if the character value starts to contain odd things (say -12345.67, for example). Also note the note put into the log:
---
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
---
Would be unacceptable to my log checking.
It really depends on what you want the "first 4 digits" for. I would recommend explicit casting of the data: it may be a step or two more, but it is far more readable.
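As a rough sketch of what explicit conversion could look like, using put() and input() so that no conversion notes are written to the log: the sample data, the helper variable num_c and the result variable first4 are assumptions for illustration, and the step assumes positive integers only.

```sas
/* Hypothetical sample data; assumes positive integers only */
data have;
   input num;
   datalines;
1234567
2345
456676
;
run;

data want;
   set have;
   length num_c $32;
   /* Explicit numeric-to-character conversion, with blanks stripped */
   num_c = strip(put(num, best32.));
   /* Explicit character-to-numeric conversion of the first 4 characters */
   first4 = input(substr(num_c, 1, 4), 4.);
   drop num_c;
run;
```

Negative or fractional values would bring a minus sign or decimal point into the first 4 characters, so they would need extra handling.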
Here is a solution that avoids the number-to-character conversion and back again, and also deals with fractional and negative values:
int(abs(num)/10**(floor(log10(abs(num)))-3))
It works by dividing the number by the requisite power of 10 (a negative power for small fractional values) and truncating the decimal portion. Taking the floor of the logarithm makes the divisor an exact power of 10.
Richard
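A minimal sketch to try the arithmetic approach on a few values, assuming the floor of the base-10 logarithm is used to locate the leading digit. The test values and the names shift and first4 are made up for illustration, and zero or missing input is not handled (log10 is undefined there).

```sas
data _null_;
   /* Hypothetical test values: an integer, a small fraction, a negative number */
   do num = 123456789, 0.00123456, -98765.4321;
      /* floor(log10(|num|)) is the power of 10 of the leading digit */
      shift = floor(log10(abs(num))) - 3;
      /* Scale so 4 digits sit left of the decimal point, then truncate */
      first4 = int(abs(num) / 10**shift);
      put num= first4=;
   end;
run;
```

For these inputs the log should show first4 as 1234, 1234 and 9876. The sign is dropped by abs(), so multiply by sign(num) if it needs to be kept.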