Solved: Re: Removing trailing zeros in string

Report Inappropriate Content · Posted 11-08-2013 04:51 PM

Hello,

This is my fist post, so please excuse any lack of etiquette. My question is how to remove the trailing zeros on a string. The following is my fictional dataset and below that is how I wish it would look after a datastep. Ignore the decimal point part of the code, since I need that for a subsequent step not related to dropping the trailing zeros. Also, the final dataset should be formatted as characters.

data icd9;

input id diag$;

length newdiag $6;

newdiag = substr(diag,1,3)||"."||

substr(diag,4,2);

datalines;

1 80500

2 80555

3 80550

4 e8500

5 e8550

6 e8555

7 v8500

8 v8550

9 v8555

10 08500

11 08550

12 08555

13 0414

14 041

15 04144

;

run;

Desired final product:

1	805
2	805.55
3	805.5
4	e85
5	e85.5
6	e85.55
7	v85
8	v85.5
9	v85.55
10	085
11	085.5
12	085.55
13	041.4
14	041
15	041.44

data_null__ · Posted 11-08-2013 06:18 PM

newdiag = substrn(diag,1,max(3,f));

View solution in original post

data_null__ · Posted 11-08-2013 05:35 PM

This look about right.

data icd9;
   input id diag$;
   length newdiag $6; 
   f = findc(diag,'0','TBK');
   newdiag = substrn(diag,1,max(3,f));
   newdiag = catx('.',substrn(newdiag,1,3),substrn(newdiag,4));
   datalines; 
     1     80500
     2     80555
     3     80550
     4     e8500
     5     e8550
     6     e8555
     7     v8500
     8     v8550
     9     v8555
     10     08500
     11     08550
     12     08555
     13     0414
     14     041    
     15     04144
     ;
   run; 
proc print; 
   run;

Message was edited by: data _null_

Report Inappropriate Content · Posted 11-08-2013 05:44 PM

Thank you for the prompt post. Yes, this does seem to be what I was trying to achieve. Much thanks, in that this would have taken me a long time to figure out myself.

Report Inappropriate Content · Posted 11-08-2013 05:56 PM

My error, I thought that I included all of the scenarios but forgot about the following:

Hello,

This is my fist post, so please excuse any lack of etiquette. My question is how to remove the trailing zeros on a string. The following is my fictional dataset and below that is how I wish it would look after a datastep. Ignore the decimal point part of the code, since I need that for a subsequent step not related to dropping the trailing zeros. Also, the final dataset should be formatted as characters.

data icd9;

input id diag$;

length newdiag $6;

newdiag = substr(diag,1,3)||"."||

substr(diag,4,2);

datalines;

1 80500

2 80555

3 80550

4 e8500

5 e8550

6 e8555

7 v8500

8 v8550

9 v8555

10 08500

11 08550

12 08555

13 0414

14 041

15 04144

16 80000

17 85000

18 04000

;

run;

Desired final product:

1	805
2	805.55
3	805.5
4	e85
5	e85.5
6	e85.55
7	v85
8	v85.5
9	v85.55
10	085
11	085.5
12	085.55
13	041.4
14	041
15	041.44
16	800
17	850
18	040

data_null__ · Posted 11-08-2013 06:18 PM

newdiag = substrn(diag,1,max(3,f));

Haikuo · Posted 11-08-2013 08:49 PM

Another approach using Prxchange:

data icd9;

input id diag$;

length _new $6;

_new=catx('.',substr(diag,1,3),prxchange('s/([^0]?)(0+$)/$1/io',-1,compress(substr(diag,4))));

datalines;

1 80500

2 80000

3 80550

4 e8500

5 e8550

6 e8555

7 v8500

8 v8550

9 v40

10 08500

11 08550

12 08555

13 0414

14 041

15 04144

;

run;

Haikuo

Patrick · Posted 11-09-2013 12:37 AM

I prefer data null's solution - but here another variant using a bit a simpler RegEx:

data icd9;
input id diag :$5.;
length newdiag $ 6;
newdiag=prxchange('s/0{1,2} *$//oi',1,diag);
if length(newdiag)>3 then
newdiag = substrn(newdiag,1,3)||"."||substrn(newdiag,4);
datalines;
1 80500
2 80555
3 80550
4 e8500
5 e8550
6 e8555
7 v8500
8 v8550
9 v8555
10 08500
11 08550
12 08555
13 0414
14 041
15 04144
;
run;

Tal · Posted 11-10-2013 02:21 PM

What does the 'TBK' do,data_null_?

data_null__ · Posted 11-10-2013 03:31 PM

SAS(R) 9.3 Functions and CALL Routines: Reference

Tal · Posted 11-10-2013 04:42 PM

thanks

Report Inappropriate Content · Posted 11-11-2013 09:59 AM

Thank you everyone. All three options worked great and are very helpful.

AnnaBrown · Posted 11-11-2013 11:02 AM

I'm glad you found some solutions, H, and your etiquette is right on. Thanks for marking "Correct" and "Helpful" answers. Those are most useful for others who may be searching for answers to a similar problem.

Anna

Access SAS Innovate on-demand content now!

Report Inappropriate Content · Posted 11-11-2013 04:22 PM

Well since I received such expeditious and straightforward help, I am going to add hopefully one last twist. Since this seems to remain in the same content area, I won't opt to make this as a new thread. My question is as follows:

How would I best repeat this for for multiple variables/columns. I have diag1-diag60, and I would like all 60 variables transformed as discussed above? I could obviously do each one independently or in one big swoop. Though, I have not quite figured out how to do it with the multiple statements and variable names. Any help would be greatly appreciated.

Haikuo · Posted 11-11-2013 04:48 PM

Using an array should help, something like the following:

data icd9;

input id (diag1-diag4) (:$);

array diag diag1-diag4;

array _new $8 _new1-_new4;

do over _new;

_new=catx('.',substr(diag,1,3),prxchange('s/([^0]?)(0+$)/$1/io',-1,compress(substr(diag,4))));

end;

datalines;

1 80500 80500 80500 80500

2 80000 80000 80000 80000

3 80550 80550 80550 80550

4 e8500 e8500 e8500 e8500

5 e8550 e8550 e8550 e8550

6 e8555 e8555 e8555 e8555

7 v8500 v8500 v8500 v8500

8 v8550 v8550 v8550 v8550

9 v40 v40 v40 v40

10 08500 08500 08500 08500

11 08550 08550 08550 08550

12 08555 08555 08555 08555

13 0414 0414 0414 0414

14 041 041 041 041

15 04144 04144 04144 04144

;

run;

Haikuo

Registration is open