BookmarkSubscribeRSS Feed
Saurabh_Amar
Calcite | Level 5

Hi,

I have a flat file with ';' as a delimiter. The file is for a company at employee id level (unique row for each employee). Couple of variable (like education history) may have multiple values for each employee, which are delimited by '!' in my file. below is one example.

EMP_ID AGE NAME EDUCATION OCCUPATION

12345 ; 45 ; john smith ; "Doctorate Degree (over 19 years),5/1/2014 ! Masters's Degree (±18 years),1/1/2013 ! Bachelor's Degree (±16 years),1/1/2011" ; manager

23456 ; 30 ; smith john ; "Bachelor's Degree (±16 years),1/1/2003" ; analyst

for education field (which has ';' in starting and end) values in paranthesis indicate years in education and date represents when the degree was completed, separated by a comma.

there are other fields as well in the data but i have mentioned only few for example. I need to read the data in sas but a bit confused how to treat multiple delimiters in this case.

Thanks!

SA

3 REPLIES 3
gergely_batho
SAS Employee

You need to manually deal with the second delimiter. Using functions. Here is an example:

data want;

infile 'C:\temp\file.txt' dsd dlm=';';

input EMP_ID:$10. AGE NAME:$20. EDUCATION:$100. OCCUPATION:$10.;

do i=1 to countw(EDUCATION,'!');

  SingleEducation=scan(EDUCATION,i,'!');

  Degree=scan(SingleEducation,1,'(),');

  Years=scan(SingleEducation,2,'(),');

  Date=scan(SingleEducation,3,'(),');

  output;

end;

drop i EDUCATION;

run;

Additional excercise: convert Date to a real SAS date, set the length of Degree and Years columns.

Good luck!

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Personally I would read it in as ; delimited, then post process the dataset (haven't tested, but should be near enough):

data have;

     infile "xyz.txt" dlm=";" missover dsd lrecl=32767;

     input ...;

run;

data want;

     set have;

     array education_split{10} $200.;

     do I=1 to countc(education,"!")+1; /* Do at leats twice if ! occurs */

          education_split=scan(education,i,"!");

     end;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to choose a machine learning algorithm

Use this tutorial as a handy guide to weigh the pros and cons of these commonly used machine learning algorithms.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 827 views
  • 0 likes
  • 3 in conversation