BookmarkSubscribeRSS Feed
eyb
Calcite | Level 5 eyb
Calcite | Level 5

Hi All, 

 

I'm a newcomer to SAS programming, so apologies in advance if I'm asking obvious questions.

 

I have a project where I receive a fixed width text file that has differeing "columns" by row. 

 

For example the layout of a file might be:

 

AAAFirstNameLastNameDob

BBBSomeDetails

BBBMoreDetails

BBBYetMoreDetails

ZZZDoneWithPerson

 

…and it can have multiple people captured in a single file. So multiple AAABBBZZZ sections repeat in the file.

 

 

Example data may be:

 

AAABobDobs2016-01-01

BBBBeer  Pizza

BBBCoke  Ham

BBBWater Chicken

ZZZBobDobs2016-01-01

AAASueDobs2016-02-01

BBBTea   Crackers

BBBAcid  Fingernails

BBBJenkemMudPies

ZZZSueDobs2016-02-01

 

 I want to write a script to turn this into data set like this, where I split out the BBB records into columns, and attach to each BBB row, extra columns with informatino from the preceding AAA record:

 

FNAME LNAME  DATE         DRINK    FOOD

Bob   Dobs   2016-01-01   Beer     Pizza

Bob   Dobs   2016-01-01   Coke     Ham

Bob   Dobs   2016-01-01   Water    Chicken

Sue   Dobs   2016-01-01   Tea      Crackers

Sue   Dobs   2016-01-01   Acid     Fingernails

Sue   Dobs   2016-01-01   Jenkem   Mudpies

 

 

I think I should be able to do this with the data step, but it's unclear to me how. So far I have the file imported with code like this:

 

DATA WORK.input;
LENGTH
F1 $ 512 ;
FORMAT
F1 $CHAR512. ;
INFORMAT
F1 $CHAR512. ;
INFILE '/path/file.txt'
LRECL=512
ENCODING="LATIN1"
TERMSTR=CRLF
TRUNCOVER ;
INPUT
@1 F1 $CHAR512. ;
RUN;

 

Can anyone suggest where I should go next to get this working?

 

Thanks!

5 REPLIES 5
LinusH
Tourmaline | Level 20

You need to parse the input string somehow.

On approach is to use conditional INPUT.

Or scan the whole record (in a SAS variable, or by using the automatic variable _INFILE_).

Either way, this calls for competence from an intermediate programmer, so perhaps you might want to look into some Base SAS programming training.

Data never sleeps
eyb
Calcite | Level 5 eyb
Calcite | Level 5

Thanks for the reply. I'm more of a learn by doing person, but I do plan on doing a course once my company has funds to pay for it.

 

 

 

 

ballardw
Super User

First a comment: I doubt that your data is actually fixed width unless every single name in your data is exactly 3 characters for first name and 4 characters for last name. 

Also post Code and data in code boxes opened in the forurm with the {i} icon. The question box strips formatting and does proportional font things that may actually change your data. The Code box will prevent that.

 

Some questions, or possibly attach a TXT file with some more example data:

Does the line with the name information ALWAYS start with the same characters . Your post implies that if the first 3 characters are AAA then the line will be name information. Is that correct? And ZZZ always ends the record for the person?

 

data want;
   length code Fname $ 3 Lname $ 4 drink $ 6 food $ 11;
   informat date yymmdd10.;
   format date yymmdd10.;
   Retain fname lname date;
   input code 1-3 @;
   if code = 'AAA' then do;
      input fname 4-6
            lname 7-10
            date 
      ;
   end;
   Else if code='BBB' then do;
      input drink 4-9
            food  10-20
      ;
      output;
   End;
   Else if code='ZZZ' then do;
      input;
   end;
   drop code;
datalines;
AAABobDobs2016-01-01 
BBBBeer  Pizza
BBBCoke  Ham
BBBWater Chicken
ZZZBobDobs2016-01-01
AAASueDobs2016-02-01
BBBTea   Crackers
BBBAcid  Fingernails
BBBJenkemMudPies
ZZZSueDobs2016-02-01
;

works for the example data.

 

eyb
Calcite | Level 5 eyb
Calcite | Level 5

@ballardw wrote:

First a comment: I doubt that your data is actually fixed width unless every single name in your data is exactly 3 characters for first name and 4 characters for last name. 

 


 

I was creating an example for purposes of this conversation. My actual data file is much more complicated, but if I knew how to do it for this, I could make what I need for the other one. Thanks for the info, still digesting the rest of your reply, but wanted to add that note to start with.

 

I'll be sure to use code tags / attachments for future examples.

 


Some questions, or possibly attach a TXT file with some more example data:

Does the line with the name information ALWAYS start with the same characters . Your post implies that if the first 3 characters are AAA then the line will be name information. Is that correct? And ZZZ always ends the record for the person?

 


 

The first 3 chars, are always the record type in my data. The actual content is a bit different, but it is fixed width. Just many more columns. 

 

BrunoMueller
SAS Super FREQ

Hi

 

Back in the days when SAS was founded, this was a common type of file that would use different record types to indicate what the line was about. So basically in the code you figure out the record type and then do what is necessary.

 

See sample code below, please note that the data has fixed length for each field, are there any persons that have no record type BBB? What should haben with them?

 

data want;
  infile cards;
  /* determine the record type */
  input
    @1 recordType $3.
    @
  ;

  retain fname lname someDate;

  /* do something based on the record type */
  if recordType = "AAA" then do;
    input
      @4 fname $6.
      @10 lname $10.
      @21 someDate anydtdte10.
    ;
  end;

  /* only record type BBB needs to be written to output data set */
  if recordType = "BBB" then do;
    input
      @4 drink $6.
      @10 food $10.
    ;
    output;
  end;

  /* we can ignone ZZZ */

  drop recordType;
  format someDate date9.;

cards;
AAABob   Dobs       2016-01-01
BBBBeer  Pizza
BBBCoke  Ham
BBBWater Chicken
ZZZBob   Dobs       2016-01-01
AAASue   Dobs       2016-02-01
BBBTea   Crackers
BBBAcid  Fingernails
BBBJenkemMudPies
ZZZSue   Dobs       2016-02-01
AAAAnna  Dobs       2016-02-01
BBBTea   Crackers
ZZZAnna  Dobs       2016-02-01
;

Bruno

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 841 views
  • 0 likes
  • 4 in conversation