Hello,
This is my fist post, so please excuse any lack of etiquette. My question is how to remove the trailing zeros on a string. The following is my fictional dataset and below that is how I wish it would look after a datastep. Ignore the decimal point part of the code, since I need that for a subsequent step not related to dropping the trailing zeros. Also, the final dataset should be formatted as characters.
data icd9;
input id diag$;
length newdiag $6;
newdiag = substr(diag,1,3)||"."||
substr(diag,4,2);
datalines;
1 80500
2 80555
3 80550
4 e8500
5 e8550
6 e8555
7 v8500
8 v8550
9 v8555
10 08500
11 08550
12 08555
13 0414
14 041
15 04144
;
run;
Desired final product:
1 | 805 |
2 | 805.55 |
3 | 805.5 |
4 | e85 |
5 | e85.5 |
6 | e85.55 |
7 | v85 |
8 | v85.5 |
9 | v85.55 |
10 | 085 |
11 | 085.5 |
12 | 085.55 |
13 | 041.4 |
14 | 041 |
15 | 041.44 |
This look about right.
Message was edited by: data _null_
Thank you for the prompt post. Yes, this does seem to be what I was trying to achieve. Much thanks, in that this would have taken me a long time to figure out myself.
My error, I thought that I included all of the scenarios but forgot about the following:
Hello,
This is my fist post, so please excuse any lack of etiquette. My question is how to remove the trailing zeros on a string. The following is my fictional dataset and below that is how I wish it would look after a datastep. Ignore the decimal point part of the code, since I need that for a subsequent step not related to dropping the trailing zeros. Also, the final dataset should be formatted as characters.
data icd9;
input id diag$;
length newdiag $6;
newdiag = substr(diag,1,3)||"."||
substr(diag,4,2);
datalines;
1 80500
2 80555
3 80550
4 e8500
5 e8550
6 e8555
7 v8500
8 v8550
9 v8555
10 08500
11 08550
12 08555
13 0414
14 041
15 04144
16 80000
17 85000
18 04000
;
run;
Desired final product:
1 | 805 |
2 | 805.55 |
3 | 805.5 |
4 | e85 |
5 | e85.5 |
6 | e85.55 |
7 | v85 |
8 | v85.5 |
9 | v85.55 |
10 | 085 |
11 | 085.5 |
12 | 085.55 |
13 | 041.4 |
14 | 041 |
15 | 041.44 |
16 | 800 |
17 | 850 |
18 | 040 |
Another approach using Prxchange:
data icd9;
input id diag$;
length _new $6;
_new=catx('.',substr(diag,1,3),prxchange('s/([^0]?)(0+$)/$1/io',-1,compress(substr(diag,4))));
datalines;
1 80500
2 80000
3 80550
4 e8500
5 e8550
6 e8555
7 v8500
8 v8550
9 v40
10 08500
11 08550
12 08555
13 0414
14 041
15 04144
;
run;
Haikuo
I prefer data null's solution - but here another variant using a bit a simpler RegEx:
data icd9;
input id diag :$5.;
length newdiag $ 6;
newdiag=prxchange('s/0{1,2} *$//oi',1,diag);
if length(newdiag)>3 then
newdiag = substrn(newdiag,1,3)||"."||substrn(newdiag,4);
datalines;
1 80500
2 80555
3 80550
4 e8500
5 e8550
6 e8555
7 v8500
8 v8550
9 v8555
10 08500
11 08550
12 08555
13 0414
14 041
15 04144
;
run;
What does the 'TBK' do,data_null_?
thanks
Thank you everyone. All three options worked great and are very helpful.
I'm glad you found some solutions, H, and your etiquette is right on. Thanks for marking "Correct" and "Helpful" answers. Those are most useful for others who may be searching for answers to a similar problem.
Anna
Well since I received such expeditious and straightforward help, I am going to add hopefully one last twist. Since this seems to remain in the same content area, I won't opt to make this as a new thread. My question is as follows:
How would I best repeat this for for multiple variables/columns. I have diag1-diag60, and I would like all 60 variables transformed as discussed above? I could obviously do each one independently or in one big swoop. Though, I have not quite figured out how to do it with the multiple statements and variable names. Any help would be greatly appreciated.
Using an array should help, something like the following:
data icd9;
input id (diag1-diag4) (:$);
array diag diag1-diag4;
array _new $8 _new1-_new4;
do over _new;
_new=catx('.',substr(diag,1,3),prxchange('s/([^0]?)(0+$)/$1/io',-1,compress(substr(diag,4))));
end;
datalines;
1 80500 80500 80500 80500
2 80000 80000 80000 80000
3 80550 80550 80550 80550
4 e8500 e8500 e8500 e8500
5 e8550 e8550 e8550 e8550
6 e8555 e8555 e8555 e8555
7 v8500 v8500 v8500 v8500
8 v8550 v8550 v8550 v8550
9 v40 v40 v40 v40
10 08500 08500 08500 08500
11 08550 08550 08550 08550
12 08555 08555 08555 08555
13 0414 0414 0414 0414
14 041 041 041 041
15 04144 04144 04144 04144
;
run;
Haikuo
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.