BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi,
I have one variable (example listed below) needs to be parsed into multiple variables. For example, the first listed (F1) 172/233 needs to be separated as 3 variables: F1, 172 and 233. The last listed M(850)84 F(832)9/358 needs to be separated as 6 variables: M850, 84, F832, 9 and 358. The length is not fixed. Even I use index function to locate the ( ) it still hard to extract the characters within ( ). Any help will be highly appreciated. Thanks,


(F1) 172/233
(M20) 141
(M279) 256
(F125) 9/87
M(603)90 F(666)10/170
M(850)84 F(832)9/358
3 REPLIES 3
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
A combination of SCAN or INDEX (to determine length), along with the SUBSTR function is appropriate. Also, I see you will need to use COMPRESS to remove the parenthesis characters, possibly. Remember the SCAN function accepts one or more delimiter characters as the second argument; also scan accepts a negative value to work from the right-most string location - not well documented, I don't believe. And there are times where you may need to make multiple passes of the string/substring to extract the desired pieces.

Also, if you speak Perl, there are some supported Perl-like functions now supported with current SAS.

Scott Barry
SBBWorks, Inc.
Patrick
Opal | Level 21
Hi yy1234

Something like that:

data _null_;
infile datalines missover;
input String $50.;
put String=;
array StringPart {100} $100.;
i=1;
do while(scan(String,i,') /') ne '');
StringPart{i}=scan(String,i,') /');
StringPart{i}=compress(StringPart{i},'(');
put StringPart{i}=;
i+1;
end;
datalines;
(F1) 172/233
(M20) 141
(M279) 256
(F125) 9/87
M(603)90 F(666)10/170
M(850)84 F(832)9/358
;

You could also first pass the data set and determine the necessary number of array elements before reading the "words" into separate variables.
I've chosen 100 array elements.
What could work to do it more dynamic is: length() of string - length() of string after compression of all delimiters - and then keeping the max value of this difference (call symput).
Using this value (in macrovar) in array definition..... just an idea.

And yes: There would also be some Perl RegEx which could be helpful - but I think in your case they wouldn't provide a simpler solution.

Cheers
Patrick
deleted_user
Not applicable
Hi Scott & Patrick,
Thanks for all the suggestions. I tried the following. It works. gen_orig is the original variable I listed before.

gentry=tranwrd(gen_orig, "F(", "(F");
gentry2=tranwrd(gentry, "M(", "(M");
genvar1=scan(gentry2, 1 );
genvar2=scan(gentry2, 2);
genvar3=scan(gentry2, 3);
genvar4=scan(gentry2, 4);
genvar5=scan(gentry2, 5);

Best regards,
Ying

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 956 views
  • 0 likes
  • 3 in conversation