BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi,
I have one variable (example listed below) needs to be parsed into multiple variables. For example, the first listed (F1) 172/233 needs to be separated as 3 variables: F1, 172 and 233. The last listed M(850)84 F(832)9/358 needs to be separated as 6 variables: M850, 84, F832, 9 and 358. The length is not fixed. Even I use index function to locate the ( ) it still hard to extract the characters within ( ). Any help will be highly appreciated. Thanks,


(F1) 172/233
(M20) 141
(M279) 256
(F125) 9/87
M(603)90 F(666)10/170
M(850)84 F(832)9/358
3 REPLIES 3
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
A combination of SCAN or INDEX (to determine length), along with the SUBSTR function is appropriate. Also, I see you will need to use COMPRESS to remove the parenthesis characters, possibly. Remember the SCAN function accepts one or more delimiter characters as the second argument; also scan accepts a negative value to work from the right-most string location - not well documented, I don't believe. And there are times where you may need to make multiple passes of the string/substring to extract the desired pieces.

Also, if you speak Perl, there are some supported Perl-like functions now supported with current SAS.

Scott Barry
SBBWorks, Inc.
Patrick
Opal | Level 21
Hi yy1234

Something like that:

data _null_;
infile datalines missover;
input String $50.;
put String=;
array StringPart {100} $100.;
i=1;
do while(scan(String,i,') /') ne '');
StringPart{i}=scan(String,i,') /');
StringPart{i}=compress(StringPart{i},'(');
put StringPart{i}=;
i+1;
end;
datalines;
(F1) 172/233
(M20) 141
(M279) 256
(F125) 9/87
M(603)90 F(666)10/170
M(850)84 F(832)9/358
;

You could also first pass the data set and determine the necessary number of array elements before reading the "words" into separate variables.
I've chosen 100 array elements.
What could work to do it more dynamic is: length() of string - length() of string after compression of all delimiters - and then keeping the max value of this difference (call symput).
Using this value (in macrovar) in array definition..... just an idea.

And yes: There would also be some Perl RegEx which could be helpful - but I think in your case they wouldn't provide a simpler solution.

Cheers
Patrick
deleted_user
Not applicable
Hi Scott & Patrick,
Thanks for all the suggestions. I tried the following. It works. gen_orig is the original variable I listed before.

gentry=tranwrd(gen_orig, "F(", "(F");
gentry2=tranwrd(gentry, "M(", "(M");
genvar1=scan(gentry2, 1 );
genvar2=scan(gentry2, 2);
genvar3=scan(gentry2, 3);
genvar4=scan(gentry2, 4);
genvar5=scan(gentry2, 5);

Best regards,
Ying

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 929 views
  • 0 likes
  • 3 in conversation