Solved
Contributor
Posts: 36

# Split a string with different size

Hi,

I wonder if anyone can help me on this issue. I need a code to split a string into two different string and then merge them again. I am working on a Compustat variable which is about zipcode. However, a lot of zipcode is incorrect (I guess). I want 5 digit zipcode (character) while some values have only 4 or 3 characters. My strategy is to slip values into two part: first part has left-most two digits, and second part has the rest. For second part, I need to add missing "0" (add 2 zeros if second part has only 1 character; and add 1 zero if second part has 2 character). Then I need to add first part and second part together to have full zipcode.

The code should be generalised as most of zipcodes are correct with 5 digits (of course, I want to keep those zipcodes as they are).

Here is an example.

Have   Want

0157    01057

427     42007

Thank you very much!

Best regards,

Thierry

Accepted Solutions
Solution
‎05-08-2018 06:58 PM
PROC Star
Posts: 1,831

## Re: Split a string with different size

[ Edited ]
`````` data have;
input var \$;
cards;
0157
427
;

data want;
set have;
if length(strip(var)) ne 5  then need=cats(substr(var,1,2),put(input(substr(var,3),8.),z3.));
run;``````

All Replies
Solution
‎05-08-2018 06:58 PM
PROC Star
Posts: 1,831

## Re: Split a string with different size

[ Edited ]
`````` data have;
input var \$;
cards;
0157
427
;

data want;
set have;
if length(strip(var)) ne 5  then need=cats(substr(var,1,2),put(input(substr(var,3),8.),z3.));
run;``````
Contributor
Posts: 36

## Re: Split a string with different size

Thank you very much! It really works!

Super User
Posts: 13,583

## Re: Split a string with different size

An example creating a new zip variable so you can verify that the result is as needed looking at your old zip.

Note that the result for two characters is going to be wrong as you did not provide a rule for that case.

```data dummy;
input zip \$;
if length(zip) < 5 then newzip=cats(substr(zip,1,2),put(input(substr(left(zip),3),best.),z3.));
else newzip=zip;
datalines;
0157
427
12345
;
run;```

I used <5 in case your actual data has Zip+4 values in it.

If this works for your data you could replace the Newzip variable in the IF statement with Zip and not need the else.

Contributor
Posts: 36

## Re: Split a string with different size

Thank you very much! Yes, zipcodes may have 4 digits at the end, but I dont need it. But the code is helpful.
☑ This topic is solved.