Hi Everyone,
kindly help me to read the following raw data into sas. Thanks in advance.
AronM374856Texas
BrandonM847584California
JaneF856747Huston
ClarindaF748574Newyork
ArnoldM435867Huston
Hi:
What code have you tried? It is hard to figure out where the variables start and stop, I suppose you could read in each line and parse out the variables from the "back end". But what type of file is this? Is this a file without any delimiters? What is the front end process that creates this file? Is this a 1 time read process or do you have to read a file like this on a periodic basis?
cynthia
Hi Cynthia,
This is an unformatted data without delimeter And it is for one time process. I have tried all basic styles of reading taw data but that didnt work. Kindly solve this one if there are any advanced techniques.
STolin
Not pretty but you can do it using combinations of scan/compress/reverse/substr functions.
Regular Expressions would also work, but I avoid them like the plague.
data have;
informat text $256.;
input text $;
first_part=scan(text, 1, ,'d');
Name=substr(first_part, 1, length(first_part)-1);
Gender=substr(reverse(trim(first_part)),1,1);
RandomNumber=compress(text, , 'kd');
State=scan(text, 2, ,'d');
cards;
AronM374856Texas
BrandonM847584California
JaneF856747Huston
ClarindaF748574Newyork
ArnoldM435867Huston
;
run;
proc print data=have;
run;
Hi Reeza,
Thank you for your helpful code with which i got the solution. But still confused about how exactly does k & d delimiters works?
:
They are called function modifier you can find a list of them in the SAS function reference for each function. for example SCAN Function
This blog give good examples COMPRESS: SAS Function strips characters from the string of using them and combine them.
data have;
informat text $256.;
input text $;
cards;
AronM374856Texas
BrandonM847584California
JaneF856747Huston
ClarindaF748574Newyork
ArnoldM435867Huston
;
run;
data want;
set have;
length Name Gender RandomNumber State$ 100;
re=prxparse('/(\w+)([F|M])(\d+)(\w+)/o');
if prxmatch(re,text) then do;
Name=prxposn(re,1,text);
Gender=prxposn(re,2,text);
RandomNumber=prxposn(re,3,text);
State=prxposn(re,4,text);
end;
drop re;
run;
Xia Keshan
When I ran your code I could see the variable 're' returns the value '1' for all the records. May I request you to explain how your prxparse function works here?
It is a Perl Regular Expression, It is matched a pattern of your data ,Since all of obs matched such pattern , RE is naturally returning 1 all time. If you are hard to understand it ,then use Reeza's code , Hers is better understood .
Xia Keshan
I wish to understand your Perl Regular Expression, although it is difficult:)
OK. The good knowledge is SAS documentation . You'd refer to it if you want more . For your example :
(\w+)([F|M])(\d+)(\w+)
\w+ matched one or more words character(i.e. 0-9 a-z A-Z _ )
[F|M] matched F or M
\d+ matched one or more digits (i.e. 0-9 )
Therefore this Perl Regular Expression is trying to match your string ,in other words , your string have such pattern .
Xia Keshan
Whilst Xia Keshan has provided a great solution, I would ask why your data looks like that in the first place. You could face all kinds of issues if you leave it as such, for instance what will happen with missing data, say sex is missing, how will you handle:
Aron374856Texas?
I would suggest you need to return to the source of the data and fix that so the data is clear.
data foo;
infile cards length=len;
input @;
_1=findc(_infile_, 'MF', 2)-1;
_2=len-_1-6;
input name $varying32. _1 gender $1. num 6. state $varying32. _2;
cards;
AronM374856Texas
BrandonM847584California
JaneF856747Huston
ClarindaF748574Newyork
ArnoldM435867Huston
;
run;
makes a good point here, and you should consider his suggestion.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.