Hello,
I have a variable in my dataset that I need to scrub. The data has special characters that need to be scrubbed out and a carriage return added to make it appear like a list in the variable field. This data will be pulled into a spreadsheet this way.
For example the values could look like this:
Old Data
Corporate (ABC);#1;#Little Rock Branch;#21 |
Iowa Office;#5;#Des Moines (FCO);#44 |
New Data
Corporate (ABC) Little Rock Branch |
Iowa Office Des Moines (FCO) |
I am hoping there might be a function that can handle this. Thanks!
Looks like the strings are using the two byte sequence ;# as the delimiter.
And your examples make it look like you just want to keep every other item from the delimited list.
data want;
set have;
length copy new_data $200 ;
copy=tranwrd(old_data,';#','01'x);
do index=1 by 2 to countw(copy,'01'x,'m');
new_data=catx('0d'x,new_data,scan(copy,index,'01'x,'m'));
end;
drop copy index ;
run;
But is not clear why you think inserting a carriage return character will make it look like multiple lines. How are you LOOKing at the values? If you are generating standard old text listing files then PROC REPORT will honor the SPLIT= option.
data have;
row+1;
length old_data $200;
input;
old_data=_infile_;
cards4;
Corporate (ABC);#1;#Little Rock Branch;#21
Iowa Office;#5;#Des Moines (FCO);#44
;;;;
data want;
set have;
length copy new_data $200 ;
copy=tranwrd(old_data,';#','01'x);
do index=1 by 2 to countw(copy,'01'x,'m');
new_data=catx('|',new_data,scan(copy,index,'01'x,'m'));
end;
drop copy index ;
run;
proc report data=want split='|';
column row new_data ;
define row / order ;
define new_data / width=100 flow ;
run;
Result:
row new_data 1 Corporate (ABC) Little Rock Branch 2 Iowa Office Des Moines (FCO)
But someone that is better at getting ODS to do anything useful will need explain how you could create print-outs using ODS that honor you line breaks.
Try this
data old;
input string $ 1 - 50;
datalines4;
Corporate (ABC);#1;#Little Rock Branch;#21
Iowa Office;#5;#Des Moines (FCO);#44
;;;;
data new(keep = newstring);
set old;
do i = 1 to countw(string, ';');
word = scan(string, i, ';');
if prxmatch('/#\d+/', word) = 0 then do;
newstring = compress(word, '#');
output;
end;
end;
run;
Which operating system is the target? The carriage return character is different in ASCII or EBCDIC text.
How will the data be "pulled into the spreadsheet"?
After a bit more research, I was able to get a nested TRANWRD function to work.
tranwrd(tranwrd(compress(olddata,'','d'),';#;#','0a'x),';#','') as newdata
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.