Hi,
I have a dataset where the only variable is a long string delimited by "|"
An example of a record would be:
A||032345|99|||
I've tried some solutions that I've encountered online, but the problem is when you encounter the delimiters without anything between them. They just end up being ignored. Here's an example of code I've used.
data want (drop=i); set have;
array parsed_vars $ new_var1-new_var%eval(&maxelements);
do i = 1 to &maxelements;
parsed_vars{i} = scan(orig_var,i,'.');
end;
run;
I'd like the parsing output to appear as
orig_var | new_var1 | new_var2 | new_var3 | new_var4 | new_var5 | new_var6 |
A||032345|99||| | A | . | 032345 | 99 | . | . |
However, the output appears as:
orig_var | new_var1 | new_var2 | new_var3 | new_var4 | new_var5 | new_var6 |
A||032345|99||| | A | 032345 | 99 | . | . | . |
I'm not sure how to fix it and would appreciate any help. If possible, I'd prefer to use proc sql if it can be done that way.
Thanks
Do please RTM. Scan() has lots of options.
data test;
ori='A||032345|99|||';
array var(6) $10;
do i=1 to 6;
var(i)=scan(ori,i,'|','m');
end;
run;
Do please RTM. Scan() has lots of options.
data test;
ori='A||032345|99|||';
array var(6) $10;
do i=1 to 6;
var(i)=scan(ori,i,'|','m');
end;
run;
I would likely go back to the original file, probably use Proc Import with delimiter='|' getnames=No and guessingrows=max.
The default out of import will be to treat two delimiters to indicate a missing value. I believe the variable names would be var1-varn
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.