Hello! I am trying to remove certain characters from unformated strings. What I mean by "unformated" is these variables have discrepancies so that a simple compress or substr won't work (or at least I don't think they would).
I have the following column from an xlsx file:
Here's what they look like ........ and here's what I'm interested in
1X4F--0754 ................................. 0754
1X4F-0016 .................................. 0016
1X4F-0023 .................................. 0023
1X4F-2008-01 ............................. 2008
1X4F-2008-02 ............................. ^
1X4F-330 .................................... 3300
Note that the first value had double dashes. This was entered manually and is a mistake. Also note that 2008-01 and 2008-02 are the same and should be one value. The last value is incomplete.
Is there a way for me to reshape this?
Thank you in advance,
Yawen
The double dashes can be dealt with via prxchange. The duplicate values could be dealt with with nodupkey sort, but it would depend on the rest of your data.
data have;
input str :$12.;
str2 = scan(prxchange('s/--/-/', -1, str), 2, '-');
datalines;
1X4F--0754
1X4F-0016
1X4F-0023
;
run;
The double dashes can be dealt with via prxchange. The duplicate values could be dealt with with nodupkey sort, but it would depend on the rest of your data.
data have;
input str :$12.;
str2 = scan(prxchange('s/--/-/', -1, str), 2, '-');
datalines;
1X4F--0754
1X4F-0016
1X4F-0023
;
run;
You can look at the SCAN() function as well. The modifiers will help.
This may get you started:
data example; infile datalines truncover; informat x $13.; input x; length y $ 4; y= scan(x,2,'-',); datalines; 1X4F--0754 1X4F-0016 1X4F-0023 1X4F-2008-01 1X4F-2008-02 1X4F-330 ; run;
If the "incomplete" is always handled the same way you could add a test for
if length(y) < 4 then y=catt(y,'0'); if the action needed is to always place a 0 at the end. But that was not specified by your request, you just said is was incomplete.
You guys were all correct to suggest the scan function! It totally fixed my problem. Thank you all!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.