Hi all,
I have a character variable ID with values as follows:
ID
1002-004567-UC
1002-000062-UC
1002-239874-UC
I would like to transform the ID variable so that it looks like this:
ID
4567
62
239874
What would be the appropriate syntax to use?
Thank you.
data have;
input ID $20.;
datalines;
1002-004567-UC
1002-000062-UC
1002-239874-UC
;
data want(drop=id rename=(id1=id));
set have;
ID1=input(scan(ID,2),8.);
run;
Do you want the variable to be character or numeric?
The solution above generates a numeric ID. In general, IDs should be character to avoid accidental arithmetic and precision issues when merging.
SCAN() isolates the middle term.
INPUT() converts it to a number, which removes the leading zeros.
PUT() converts it back to character; the -L modifier left-aligns the result.
ID_VAR = put(input(scan(var, 2, "-"), 8.), 8. -l);
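For context, here is a minimal sketch of that one-liner inside a full DATA step. It assumes the data set have and the variable ID from the earlier reply; the names want_char and id_char are made up for illustration.

```sas
data want_char;
   set have;
   /* 8 characters matches the width of the 8. format used in PUT() */
   length id_char $8;
   /* middle term -> number (drops leading zeros) -> left-aligned text */
   id_char = put(input(scan(ID, 2, "-"), 8.), 8. -l);
run;

proc print data=want_char noobs; run;
```

The original ID is kept here so the two columns can be compared side by side; drop or rename it once the result looks right.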
You could also use pattern matching
data ids;
input idStr :$16.;
datalines;
1002-004567-UC
1002-000062-UC
1002-239874-UC
;
data want;
set ids;
id = prxchange("s/.*-0*([1-9]\d*)-.*/\1/o", 1, idStr);
run;
proc print data=want noobs; run;
The pattern reads: Find any string, followed by a dash, followed by any number of zeros, ( followed by a digit between 1 and 9, followed by any number of digits ), followed by a dash, followed by anything. Keep the string part matched within the parentheses.
Replace the pattern with "s/.*-0*([1-9]\d*)\D.*/\1/o".
It would also work for your original question. It says that the number field ends when a non-digit character is found.
I guess you could just as well use "s/.*-0*([1-9]\d*).*/\1/o", which would work even if the number runs to the end of the string.
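If you want to see how the three patterns behave, a rough sketch like the one below runs them side by side. The second data line (an ID with no trailing "-UC") is a made-up test value, and the data set and variable names are only illustrative.

```sas
data compare;
   input idStr :$16.;
   length with_dash non_digit open_end $16;
   /* number must be followed by a dash */
   with_dash = prxchange("s/.*-0*([1-9]\d*)-.*/\1/o", 1, idStr);
   /* number must be followed by any non-digit character */
   non_digit = prxchange("s/.*-0*([1-9]\d*)\D.*/\1/o", 1, idStr);
   /* nothing is required after the number */
   open_end  = prxchange("s/.*-0*([1-9]\d*).*/\1/o", 1, idStr);
datalines;
1002-004567-UC
1002-000062
;

proc print data=compare noobs; run;
```

Keep in mind that a character variable is padded with trailing blanks when it is passed to PRXCHANGE, and a blank counts as a non-digit, so check the printed results against your real data before settling on a pattern.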