Hello,
Can anyone help to achieve below :
I am getting input values as comma separated values and again comma in the values as nested comma. I want to make multiple rows based on comma separated and not on based nested comma values.
Input | Required |
Business Meeting - Involved, MTNG - Meals > 5,000XYZ, XYZ & ABC > 5,000XYZ, XYZ - Meal > 5,000XYZ | Business Meeting - Involved |
MTNG - Meals > 5,000XYZ | |
XYZ & ABC > 5,000XYZ | |
XYZ - Meal > 5,000XYZ |
Remove the comma's from the numbers is the easiest solution.
/* Sample input data */
data have;
input text $100.;
datalines;
Business Meeting - Involved, MTNG - Meals > 5,000XYZ, XYZ & ABC > 5,000XYZ, XYZ - Meal > 5,000XYZ
;
data want;
set have;
*remove commas in numbers;
new_text=prxchange('s/(\d+),(\d+)/$1$2/', -1, text);
*count number of terms;
nterms=countc(new_text, ",");
*expand out;
do i=1 to nterms+1;
term=scan(new_text, i, ",");
output;
end;
run;
/* View the output */
proc print data=want;
run;
The surest way is to fix the process that generated the values so that it makes something that can be scanned.
Then you can use the Q option on the SCAN() function to parse the string.
Example of the second option:
data have;
string='Business Meeting - Involved,"MTNG - Meals > 5,000XYZ","XYZ & ABC > 5,000XYZ","XYZ - Meal > 5,000XYZ"';
run;
data want;
set have;
do index=1 to countw(string,',','q');
length value $40 ;
value = dequote(scan(string,index,',','q'));
output;
end;
run;
Obs index value 1 1 Business Meeting - Involved 2 2 MTNG - Meals > 5,000XYZ 3 3 XYZ & ABC > 5,000XYZ 4 4 XYZ - Meal > 5,000XYZ
If looks like you are actually using comma and space as the delimiter. So perhaps you can split on those.
data have;
string='Business Meeting - Involved, MTNG - Meals > 5,000XYZ, XYZ & ABC > 5,000XYZ, XYZ - Meal > 5,000XYZ';
run;
data want;
set have;
loc=1;
do index=1 by 1 until(nextloc=0);
nextloc=find(string,', ',loc);
length value $40 ;
if nextloc then value=substrn(string,loc,nextloc-loc-1);
else value=substrn(string,loc);
output;
loc=nextloc+2;
end;
run;
proc print;
run;
Otherwise you might try to remove (or replace) the commas that appear between digits. Perhaps use regular expressions. And then you can scan using comma as the delimiter.
Remove the comma's from the numbers is the easiest solution.
/* Sample input data */
data have;
input text $100.;
datalines;
Business Meeting - Involved, MTNG - Meals > 5,000XYZ, XYZ & ABC > 5,000XYZ, XYZ - Meal > 5,000XYZ
;
data want;
set have;
*remove commas in numbers;
new_text=prxchange('s/(\d+),(\d+)/$1$2/', -1, text);
*count number of terms;
nterms=countc(new_text, ",");
*expand out;
do i=1 to nterms+1;
term=scan(new_text, i, ",");
output;
end;
run;
/* View the output */
proc print data=want;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.