Split a string with different size

Contributor
Posts: 36
Hi, 

 

I wonder if anyone can help me with this issue. I need code to split a string into two parts and then join them again. I am working with a Compustat variable that holds zip codes. However, many of the zip codes look incorrect (I guess): I want 5-digit zip codes (character), but some values have only 4 or 3 characters. My strategy is to split each value into two parts: the first part holds the left-most two digits, and the second part holds the rest. The second part then needs to be padded with leading zeros to three characters (add 2 zeros if it has only 1 character, and 1 zero if it has 2 characters). Finally, I need to concatenate the first and second parts to get the full zip code.

 

The code should be general, since most of the zip codes are already correct with 5 digits (of course, I want to keep those as they are).

 

Here is an example.

 

Have   Want

0157    01057

427     42007

 

 

Thank you very much!

Best regards,

Thierry


Accepted Solutions
Solution
‎05-08-2018 06:58 PM
PROC Star
Posts: 1,831

Re: Split a string with different size

[ Edited ]
Posted in reply to tritringuyen
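data have;
input var $;
cards;
0157
427
;

data want;
set have;
length need $5;
/* Short values: keep the first two characters, zero-fill the remainder to three digits */
if length(strip(var)) ne 5 then need=cats(substr(var,1,2),put(input(substr(var,3),8.),z3.));
else need=var; /* already five characters: keep as is */
run;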
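
For readers who want to see what the nested expression does, here is a minimal sketch that unpacks it into intermediate steps. The dataset name `want_steps` and the variables `first2`, `rest_num` and `rest_padded` are illustrative names, not part of the original answer, and the `< 5` test is an assumption about which values should be padded.

```sas
data have;
   input var $;
   datalines;
0157
427
12345
;

data want_steps;
   set have;
   length first2 $2 rest_padded $3 need $5;
   if length(var) < 5 then do;
      first2      = substr(var, 1, 2);          /* left-most two characters       */
      rest_num    = input(substr(var, 3), 8.);  /* remainder read as a number     */
      rest_padded = put(rest_num, z3.);         /* Z3. zero-fills to width three  */
      need        = cats(first2, rest_padded);  /* 0157 -> 01057, 427 -> 42007    */
   end;
   else need = var;  /* five or more characters: copied, cut to 5 by LENGTH */
run;
```

Converting the remainder to a number and writing it back with the `Z3.` format is what supplies the leading zeros, so no separate rule is needed for one-character and two-character remainders.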


All Replies
Contributor
Posts: 36

Re: Split a string with different size

Posted in reply to novinosrin

Thank you very much! It really works!

Super User
Posts: 13,583

Re: Split a string with different size

Posted in reply to tritringuyen

Here is an example that creates a new zip variable, so you can compare the result against your old zip and verify that it is what you need.

 

Note that the result for values with only two characters is going to be wrong, as you did not provide a rule for that case.

data dummy;
   input zip $;
   if length(zip) < 5 then newzip=cats(substr(zip,1,2),put(input(substr(left(zip),3),best.),z3.));
   else newzip=zip;
datalines;
0157
427
12345
;
run;

I used <5 in case your actual data has Zip+4 values in it.

 

If this works for your data, you could replace the Newzip variable in the IF statement with Zip, and then you would not need the ELSE.
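
A minimal sketch of that in-place variant follows. It also sets a flag for values shorter than three characters rather than padding them, since no rule was given for that case; the flag name `short_flag`, the dataset name `dummy_inplace` and the extra test value `42` are assumptions added for illustration.

```sas
data dummy_inplace;
   input zip $;
   /* Pad only values of three or four characters; overwrite zip directly */
   if 3 <= length(zip) < 5 then
      zip = cats(substr(zip, 1, 2), put(input(substr(zip, 3), 8.), z3.));
   /* Fewer than three characters: no rule was given, so only flag the row */
   else if length(zip) < 3 then short_flag = 1;
datalines;
0157
427
12345
42
;
run;
```

Because `zip` is overwritten, the original value is no longer available for comparison, so it may be worth running the Newzip version first and checking the results.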

Contributor
Posts: 36

Re: Split a string with different size

Thank you very much! Yes, the zip codes may have 4 extra digits at the end, but I don't need them. The code is helpful, though.
☑ This topic is solved.