Hi,
I have a flat file with ';' as a delimiter. The file is for a company at employee id level (unique row for each employee). Couple of variable (like education history) may have multiple values for each employee, which are delimited by '!' in my file. below is one example.
EMP_ID AGE NAME EDUCATION OCCUPATION
12345 ; 45 ; john smith ; "Doctorate Degree (over 19 years),5/1/2014 ! Masters's Degree (±18 years),1/1/2013 ! Bachelor's Degree (±16 years),1/1/2011" ; manager
23456 ; 30 ; smith john ; "Bachelor's Degree (±16 years),1/1/2003" ; analyst
for education field (which has ';' in starting and end) values in paranthesis indicate years in education and date represents when the degree was completed, separated by a comma.
there are other fields as well in the data but i have mentioned only few for example. I need to read the data in sas but a bit confused how to treat multiple delimiters in this case.
Thanks!
SA
You need to manually deal with the second delimiter. Using functions. Here is an example:
data want;
infile 'C:\temp\file.txt' dsd dlm=';';
input EMP_ID:$10. AGE NAME:$20. EDUCATION:$100. OCCUPATION:$10.;
do i=1 to countw(EDUCATION,'!');
SingleEducation=scan(EDUCATION,i,'!');
Degree=scan(SingleEducation,1,'(),');
Years=scan(SingleEducation,2,'(),');
Date=scan(SingleEducation,3,'(),');
output;
end;
drop i EDUCATION;
run;
Additional excercise: convert Date to a real SAS date, set the length of Degree and Years columns.
Good luck!
Personally I would read it in as ; delimited, then post process the dataset (haven't tested, but should be near enough):
data have;
infile "xyz.txt" dlm=";" missover dsd lrecl=32767;
input ...;
run;
data want;
set have;
array education_split{10} $200.;
do I=1 to countc(education,"!")+1; /* Do at leats twice if ! occurs */
education_split=scan(education,i,"!");
end;
run;
Thanks Gergely and RW9..will try this..
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Use this tutorial as a handy guide to weigh the pros and cons of these commonly used machine learning algorithms.
Find more tutorials on the SAS Users YouTube channel.