BookmarkSubscribeRSS Feed
vak1987
Calcite | Level 5

Can someone please help me on how to read the below text file format to SAS?

 

John
Sex=M
Age=31
Country=US
Allergy=False
Mark
Sex=M
Age=32
Country=US
Allergy=True
Allergy1
Allergy2
Allergy3
Chase
Sex=F
Age=34
Country=UK
Allergy=False

 

I tried simple infile statement and using / but that works if data is consistent. In my case the pattern changes slightly if Allergy field is true. Appreciate your help.

7 REPLIES 7
data_null__
Jade | Level 19
Are there any blank lines that you are not showing.
vak1987
Calcite | Level 5

Nope, text file does not have any blank lines.

data_null__
Jade | Level 19

You need a way to figure out how many allergies to read or to read until you find the next NAME.  This assumes that each allergy begins with ALLERGY.  Show data this is more realistic to your actual data.

 

filename FT15F001 temp;
data test;
   retain;
   infile FT15F001 length=l;
   length Name $32 Sex $1 Age 8 Country $8 Allergy $5;
   input name / sex= / age= / country= / allergy=;
   if first(allergy) eq 'T' then do;
      input atype:&$32. @; 
      do while(atype eq: 'Allergy');
         output;
         input / atype:&$32. @1 @@;
         end;
      call missing(atype);
      end;
   else output;
   return;
   parmcards;
John
Sex=M
Age=31
Country=US
Allergy=False
Mark
Sex=M
Age=32
Country=US
Allergy=True
Allergy1
Allergy2
Allergy3
Chase
Sex=F
Age=34
Country=UK
Allergy=False
;;;;
   run;

Capture.PNG

ballardw
Super User

Is there any rule when this file was built that determines that your (likely) name is separated from the last allergy?

Is there a limit on the number of allergies per person?

 

Please show what a SAS data set from this example should look like after it is read.

Ksharp
Super User
Plz post the output you are looking for .
Tom
Super User Tom
Super User

Assuming that you always have NAME/SEX/AGE/COUNTRY/ALLERGY for each subject here is one way.

Read the file once looking for lines that start with SEX= to find the number of subjects and where they start.

Then on reading the second time you can know how many lines to read for the allergans list.

filename text temp;
options parmcards=text;
parmcards;
John
Sex=M
Age=31
Country=US
Allergy=False
Mark
Sex=M
Age=32
Country=US
Allergy=True
Allergy1
Allergy2
Allergy3
Chase
Sex=F
Age=34
Country=UK
Allergy=False
;
data list ;
  if eof then call symputx('nlines',_n_);
  infile text end=eof;
  retain id 0 start_row -1;
  input;
  start_row+1;
  if _infile_=: 'Sex=' then do; id+1; output; end;
run;

data list;
  set list end=eof;
  if eof then last_row=&nlines;
  else set list(firstobs=2 keep=start_row rename=start_row=last_row);
  last_row=last_row-1;
run;

data want;
  set list ;
  infile text truncover ;
  length name $30 sex $1 age 8 country $20 Allergy $5 Allergy_list $200 ;
  input name / @5 sex / @5 age/ @9 country / @9 Allergy;
  do _n_=start_row+5 to last_row;
    input; allergy_list=catx('|',allergy_list,_infile_);
  end;
run;

proc print;
run;
             start_
Obs    id      row     last_row    name     sex    age    country    Allergy           Allergy_list

 1      1       1          5       John      M      31      US        False
 2      2       6         13       Mark      M      32      US        True      Allergy1|Allergy2|Allergy3
 3      3      14         18       Chase     F      34      UK        False
Ksharp
Super User
data test;
infile cards truncover;
input temp $100.;
cards;
John
Sex=M
Age=31
Country=US
Allergy=False
Mark
Sex=M
Age=32
Country=US
Allergy=True
Allergy1
Allergy2
Allergy3
Chase
Sex=F
Age=34
Country=UK
Allergy=False
;
data temp;
merge test test(firstobs=2 rename=(temp=_temp));
if _temp =: 'Sex=' then group+1;
run;
data temp;
 set temp;
 by group;
 length name $ 40;
 if first.group then do;n=0;name='Name';value=temp;end;
  else if findc(temp,'=') then do;name=scan(temp,1,'=');value=scan(temp,-1,'=');end;
   else do;n+1;name=cats('Allergy',n);value=temp;end;
keep name value group;
run;
proc transpose data=temp out=want(drop=_name_);
by group;
id name;
var value;
run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1071 views
  • 1 like
  • 5 in conversation