DATA Step, Macro, Functions and more

How to read observations in multiple records conditionally

Reply
SAS Employee
Posts: 13

How to read observations in multiple records conditionally

Hi all,

I'm having problems to read data which has some records in several observations, usually for a variable like ADRESS, which has enters in some rows.

In example, this is a line I can read without problems:

113682566;ZARATE JORGE JUAN;23-13682566-9;;16009.65;4802.895;0;0;0;369;694.05;PJ0110150;PERSONAL OBRERO DE MAESTRANZA Y SERVICIO;XXXX 0 ;0;29030-AUXILIAR PRINCIPAL TECNICO;01/12/1979;;;;

But I have some rows like this one:

113682656;CANO JORGE;20-13682656-1;;5475.65;1642.695;0;0;0;187.38;0;EE0117056;C.E.N.M.A. SEDE JEFAS Y JEFES DE HOGAR - RIO CEBALLOS;     KENNEDY

486 ;0;13910-HORAS CATEDRAS;25/03/2006;;;S;

I'm using the FLOWOVER option in the INFILE statement but the value of ADRESS (which begins with "KENNEDY" in the wrong line) is being truncated, passing the value of the number (486) to another variable. I've tried to use the & and / modifiers but it's not working.

Any suggestions?

Thanks in advance!

Super User
Super User
Posts: 7,970

Re: How to read observations in multiple records conditionally

Post the code you are using.  Are you using missover on the infile?

SAS Employee
Posts: 13

Re: How to read observations in multiple records conditionally

The code to read values is the following one:

     DATA EXAMPLE;

           LABEL

                ID_LEGAJO        = "CUIT"

                APENOM           = "APELLIDO Y NOMBRES"

                CTA_BPC          = "CTA BPC"

                CTA_MUT          = "CTA MUT"

                AFECT_MUT        = "AFECT MUT"

                ID_EMPRESA       = "ID EMPRESA"

                FECHA_INGRESO    = "FECHA INGRESO";

           FORMAT

                ID_LEGAJO        $CHAR27.

                APENOM           $CHAR40.

                CUIL             $CHAR40.

                FILLER           $CHAR10.

                INGRESO          BEST10.

                RCI              BEST11.

                EMBARGO1         BEST8.

                EMBARGO2         BEST8.

                CTA_BPC          BEST8.

                CTA_MUT          BEST8.

                AFECT_MUT        BEST8.

                ID_EMPRESA       $CHAR20.

                EMPRESA          $CHAR95.

                DIRECCION        $CHAR43.

                FILLER2          $CHAR34.

                CARGO            $CHAR40.

                FECHA_INGRESO_ADMPUB    $CHAR10.

                FUNCIONARIO      $CHAR1.

                CONTRATADO       $CHAR1.

                SUPLENTE         $CHAR1.

           ;

           INFORMAT

                ID_LEGAJO        $CHAR27.

                APENOM           $CHAR40.

                CUIL             $CHAR40.

                FILLER           $CHAR10.

                INGRESO          BEST10.

                RCI              BEST11.

                EMBARGO1         BEST8.

                EMBARGO2         BEST8.

                CTA_BPC          BEST8.

                CTA_MUT          BEST8.

                AFECT_MUT        BEST8.

                ID_EMPRESA       $CHAR20.

                EMPRESA          $CHAR95.

                DIRECCION        $CHAR43.

                FILLER2          $CHAR34.

                CARGO            $CHAR40.

                FECHA_INGRESO_ADMPUB    $CHAR10.

                FUNCIONARIO      $CHAR1.

                CONTRATADO       $CHAR1.

                SUPLENTE         $CHAR1.;

           INFILE "C:\EXAMPLE.txt"

                LRECL=32767

                ENCODING="WLATIN1"

                TERMSTR=CRLF

                DLM=';'

                FLOWOVER

                DSD;

           INPUT

                ID_LEGAJO        : $CHAR27.

                APENOM           : $CHAR40.

                CUIL             : $CHAR40.

                FILLER           : $CHAR10.

                INGRESO          : ?? COMMA10.

                RCI              : ?? COMMA11.

                EMBARGO1         : ?? COMMA8.

                EMBARGO2         : ?? COMMA8.

                CTA_BPC          : ?? COMMA8.

                CTA_MUT          : ?? COMMA8.

                AFECT_MUT        : ?? COMMA8.

                ID_EMPRESA       : $CHAR20.

                EMPRESA          : $CHAR95.

                DIRECCION        :  $CHAR43.

                FILLER2          : $CHAR34.

                CARGO            : $CHAR40.

                FECHA_INGRESO_ADMPUB    : $CHAR10.

                FUNCIONARIO      : $CHAR1.

                CONTRATADO       : $CHAR1.

                SUPLENTE         : $CHAR1.;

     RUN;

Does not work the MISSOVER OPTION, just makes to put the Adress value "486" in a new line.

NOTE: Adress variable is "DIRECCION" (file is in spanish)

Super User
Super User
Posts: 7,970

Re: How to read observations in multiple records conditionally

Hi,

Sorry, only just seen this reply.  It looks like you may have special characters in your text file if it really does look like this:

113682566;ZARATE JORGE JUAN;23-13682566-9;;16009.65;4802.895;0;0;0;369;694.05;PJ0110150;PERSONAL OBRERO DE MAESTRANZA Y SERVICIO;XXXX 0 ;0;29030-AUXILIAR PRINCIPAL TECNICO;01/12/1979;;;;

113682656;CANO JORGE;20-13682656-1;;5475.65;1642.695;0;0;0;187.38;0;EE0117056;C.E.N.M.A. SEDE JEFAS Y JEFES DE HOGAR - RIO CEBALLOS;     KENNEDY

486 ;0;13910-HORAS CATEDRAS;25/03/2006;;;S;

When you open with notepad.  The space between KENNEDY and 486 would seem to be LineFeef/CarriageReturn - which is normally associated with the end of a row.  Personally I would send the file back and get them to fix it their end as delimited files shouldn't have them in, this is what I did.  However Tech support also provided the code in my post:

Which you could use to pre-process the file and remove these.  Though I would recommend getting the vendor to fix the file.

Contributor
Posts: 29

Re: How to read observations in multiple records conditionally

In my experience, this kind of problem usually requires stripping out carriage returns using a preprocessing step.

I use the following macro:

%macro replace_crlf(inf,repchar='.');

data _null_;

infile "&inf." recfm=n sharebuffers;

file "&inf." recfm=n;

input a $char1.;

retain open 0;

** This statement toggles the open flag. ;

if a='"' then open=not open;

if a in ('0A'x,'0D'x) and open then put &repchar.;

run;

%mend replace_crlf;

%replace_crlf(filename.csv,repchar=' ');

This will alter the file in place...so if you don't want to risk losing the original, take a copy of it first!

Good luck.

Super User
Posts: 10,035

Re: How to read observations in multiple records conditionally

Make your file as a stream file ,then you can input it .

data want;

infile 'c:\temp\x.txt' recfm=n dlm=';'  dsd ;

input a : $100.;

run;

Xia Keshan

Ask a Question
Discussion stats
  • 5 replies
  • 271 views
  • 0 likes
  • 4 in conversation