BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
acordes
Rhodochrosite | Level 12

It's trial and error to get the import acting like expected.

I use truncover, missover and dsd like it was a lotery ticket. But it's because it never does what I want it to do.

When I explicitly declare ";" as a delimiter, why the execution does not interpret them in that way?

Why it ignores blank columns instead of writing missing to the corresponding columns?

Why yymmdd10. does not work for the date columns?

 

The log says:

WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
21 2011;01;03;XXXXXXXXXXXX; ; ;XXXXXXXX ;2004-12-21;2006-09-14
101 ;R; ;N;

 

 

FILENAME REFFILE '/caslibs/risknb/IFRS9_CONTRATOS_202008.TXT' encoding=wlatin1;     


data public.test    ;
        %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
        infile REFFILE delimiter = ';' truncover MISSOVER  lrecl=32767 firstobs=2 n=1000;
input CODAAEST CODMMEST CODEMPRE CODOPERA $char12. temp1 temp2  CODIGIF $char10. FECFORMO $CHAR10.  FECVECI $CHAR20. 
TIPOCLIE $32. RATICLI  8 XEMPLEAD $32. CODPRODU $32. ESTVEHIC $32. PCTIEFEC  8 PCTOMI  8 IMPCAPE  8 
IMPSALVE  8 FECMATRI $32. FECREVEH $32. IMPVRESI  8 FECPRIAC $32. XIDPOCI $32. XIDFALL $32. XDEFAULT $32.
XIDSUBJ $32. FECDEFAU $32. IMPPDDEF  8 IMPVCDEF  8 XISTAGE  8 IMPDOTAC  8 QDIAATI  8 QDIASMAD  8 
QMESESDF  8 PCTPD  8 PCTLGD  8 REMTBOOK  8 TIMEHORI  8 QAAPDTDF  8 IMPEXPDF  8 IMPDOTCU  8 IMPDOTDU
  8 IMPDOTVR  8 FECDATOS $32. USOFUTUR  8 CODSEGPD  8 CODSELGD  8 ;
run;

for.png

 

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

Try running next code, derived from yours with next changes:

1) omit all informat length, just assign $ for char=type variables;
    one exception for dates to be read with :yymmdd10. informat;

2) Add a LENGTH statement to assign the max length of char=type variables;

3) %let is a declarative statement and not a datastep statemnt.
     better move it before the datastep.

4) add a FORMAT statement to dates;

 

FILENAME REFFILE '/caslibs/risknb/IFRS9_CONTRATOS_202008.TXT' encoding=wlatin1;     

%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
data public.test    ;
 LENGTH <all char-type variables and their max length> ;
 INFILE REFFILE delimiter = ';' truncover MISSOVER  lrecl=32767 firstobs=2 n=1000;
 INPUT CODAAEST CODMMEST CODEMPRE CODOPERA $ temp1 temp2  CODIGIF $ FECFORMO $
   FECVECI TIPOCLIE $ RATICLI  XEMPLEAD $ CODPRODU $ ESTVEHIC $ PCTIEFEC   PCTOMI 
   IMPCAPE IMPSALVE  FECMATRI $ FECREVEH $ IMPVRESI  FECPRIAC $ XIDPOCI $ XIDFALL $
   XDEFAULT $ XIDSUBJ $ FECDEFAU $ IMPPDDEF  IMPVCDEF  XISTAGE  IMPDOTAC  QDIAATI  
   QDIASMAD QMESESDF  PCTPD  PCTLGD  REMTBOOK TIMEHORI QAAPDTDF  MPEXPDF  IMPDOTCU 
   IMPDOTDU IMPDOTVR  FECDATOS $ USOFUTUR  CODSEGPD  CODSELGD  ;
run;

 

 

View solution in original post

13 REPLIES 13
ballardw
Super User

When you have informats as part of the input statement the way you have those then the number of characters in the informat is read. Period. Which means some of the delimiters end up in the data when the actual values are shorter or missing.

 

Try adding a : in front of all the $ formats.

Or have a separate INFORMAT statement with the character variables and remove the informats from the INPUT statement.

 

Is there some specific reason you want the date values to be character? Any manipulation of the dates will require creating a date value. You could likely save yourself some headaches later by reading the dates with yymmdd10. informat and then assign any desired format so people understand them.

acordes
Rhodochrosite | Level 12

Thank you ballardw.

But it's getting only slightly better, the proceeding : before the char format helps.

But when I try to define the informat everything turns chaotic again.

 

here are the last 2 versions.

data public.test    ;
        %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
        infile REFFILE delimiter = ';' truncover dsd  lrecl=32767 firstobs=2 n=1000;
input CODAAEST CODMMEST CODEMPRE CODOPERA :$char12. temp1 temp2  CODIGIF :$char10. FECFORMO  FECVECI 
TIPOCLIE :$32. RATICLI 12 XEMPLEAD :$32. CODPRODU :$32. ESTVEHIC :$32. PCTIEFEC  12 PCTOMI 12 IMPCAPE  12 
IMPSALVE  12 FECMATRI :$32. FECREVEH :$32. IMPVRESI  12 FECPRIAC :$32. XIDPOCI :$32. XIDFALL :$32. XDEFAULT :$32.
XIDSUBJ :$32. FECDEFAU :$32. IMPPDDEF  12 IMPVCDEF  12 XISTAGE  12 IMPDOTAC  12 QDIAATI  12 QDIASMAD  12 
QMESESDF  12 PCTPD  12 PCTLGD  12 REMTBOOK  12 TIMEHORI  12 QAAPDTDF  12 IMPEXPDF  12 IMPDOTCU  12 IMPDOTDU
  12 IMPDOTVR  12 FECDATOS  USOFUTUR  12 CODSEGPD  12 CODSELGD  12 ;
informat FECFORMO FECVECI fecdatos yymmdd10.;
run;

data public.test    ;
        %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
        infile REFFILE delimiter = ';' truncover dsd  lrecl=32767 firstobs=2 n=1000;
input CODAAEST CODAAEST CODMMEST CODEMPRE CODOPERA temp1 temp2 CODIGIF FECFORMO FECVECI TIPOCLIE RATICLI XEMPLEAD CODPRODU ESTVEHIC PCTIEFEC 
PCTOMI IMPCAPE IMPSALVE FECMATRI FECREVEH IMPVRESI FECPRIAC XIDPOCI XIDFALL XDEFAULT XIDSUBJ FECDEFAU IMPPDDEF IMPVCDEF XISTAGE 
IMPDOTAC QDIAATI QDIASMAD QMESESDF PCTPD PCTLGD REMTBOOK TIMEHORI QAAPDTDF IMPEXPDF IMPDOTCU IMPDOTDU IMPDOTVR FECDATOS USOFUTUR 
CODSEGPD CODSELGD;
informat CODOPERA $12. CODIGIF $10.  XIDPOCI XIDFALL XDEFAULT XIDSUBJ $1.;
run;
ballardw
Super User

@acordes wrote:

Thank you ballardw.

But it's getting only slightly better, the proceeding : before the char format helps.

But when I try to define the informat everything turns chaotic again.

 

here are the last 2 versions.

data public.test    ;
        %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
        infile REFFILE delimiter = ';' truncover dsd  lrecl=32767 firstobs=2 n=1000;
input CODAAEST CODMMEST CODEMPRE CODOPERA :$char12. temp1 temp2  CODIGIF :$char10. FECFORMO  FECVECI 
TIPOCLIE :$32. RATICLI 12 XEMPLEAD :$32. CODPRODU :$32. ESTVEHIC :$32. PCTIEFEC  12 PCTOMI 12 IMPCAPE  12 
IMPSALVE  12 FECMATRI :$32. FECREVEH :$32. IMPVRESI  12 FECPRIAC :$32. XIDPOCI :$32. XIDFALL :$32. XDEFAULT :$32.
XIDSUBJ :$32. FECDEFAU :$32. IMPPDDEF  12 IMPVCDEF  12 XISTAGE  12 IMPDOTAC  12 QDIAATI  12 QDIASMAD  12 
QMESESDF  12 PCTPD  12 PCTLGD  12 REMTBOOK  12 TIMEHORI  12 QAAPDTDF  12 IMPEXPDF  12 IMPDOTCU  12 IMPDOTDU
  12 IMPDOTVR  12 FECDATOS  USOFUTUR  12 CODSEGPD  12 CODSELGD  12 ;
informat FECFORMO FECVECI fecdatos yymmdd10.;
run;

data public.test    ;
        %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
        infile REFFILE delimiter = ';' truncover dsd  lrecl=32767 firstobs=2 n=1000;
input CODAAEST CODAAEST CODMMEST CODEMPRE CODOPERA temp1 temp2 CODIGIF FECFORMO FECVECI TIPOCLIE RATICLI XEMPLEAD CODPRODU ESTVEHIC PCTIEFEC 
PCTOMI IMPCAPE IMPSALVE FECMATRI FECREVEH IMPVRESI FECPRIAC XIDPOCI XIDFALL XDEFAULT XIDSUBJ FECDEFAU IMPPDDEF IMPVCDEF XISTAGE 
IMPDOTAC QDIAATI QDIASMAD QMESESDF PCTPD PCTLGD REMTBOOK TIMEHORI QAAPDTDF IMPEXPDF IMPDOTCU IMPDOTDU IMPDOTVR FECDATOS USOFUTUR 
CODSEGPD CODSELGD;
informat CODOPERA $12. CODIGIF $10.  XIDPOCI XIDFALL XDEFAULT XIDSUBJ $1.;
run;

Try placing the INFORMAT prior to the Input statement. Your second data step is missing a lot of variables from the first one, many of the character variables. So if these are supposed to read the same file then you are reading columns into the wrong variables.

Kurt_Bremser
Super User

See this example:

infile REFFILE
  delimiter = ';'
  dsd /* this option causes two delimiters being read as a missing value */
  truncover
  lrecl=32767
  firstobs=2
;
input
  CODAAEST :$7. /* read codes as character */
  CODMMEST :$2.
  CODEMPRE :$2.
  CODOPERA :$char12.
  temp1
  temp2
  CODIGIF :$char10.
  FECFORMO :yymmdd10.
  FECVECI :yymmdd10.
;
format
  FECFORMO yymmdd10.
  FECVECI yymmdd10.
;

add other variables in the same manner as required.

acordes
Rhodochrosite | Level 12

nearly...

 

works fine with this code, but commax format does not behave well for the second variable, the first is applied correctly.

and the informat for the date does nothing.

later I'll try your approach

 

data public.test    ;
        %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
        infile REFFILE delimiter = ';'  lrecl=32767 firstobs=2 n=1000;
input CODAAEST CODMMEST CODEMPRE CODOPERA :$char12. temp1 temp2  CODIGIF :$char10. FECFORMO  FECVECI 
TIPOCLIE :$1. RATICLI XEMPLEAD :$1. CODPRODU :$2. ESTVEHIC :$1. PCTIEFEC commax8.5  PCTOMI commax8.5   IMPCAPE  
IMPSALVE FECMATRI  FECREVEH  IMPVRESI  FECPRIAC  XIDPOCI :$1. XIDFALL :$1. XDEFAULT :$1.
XIDSUBJ :$1. FECDEFAU  IMPPDDEF  IMPVCDEF  XISTAGE   IMPDOTAC  QDIAATI   QDIASMAD  
QMESESDF PCTPD  PCTLGD  REMTBOOK  TIMEHORI  QAAPDTDF  IMPEXPDF  IMPDOTCU IMPDOTDU
IMPDOTVR  FECDATOS  USOFUTUR  CODSEGPD   CODSELGD   ;
informat FECFORMO FECVECI  FECMATRI FECREVEH FECPRIAC FECDEFAU fecdatos yymmdd10.;
run;

 

for2.png

Kurt_Bremser
Super User

I gave you an example for nicely structured, easily readable code. Please stop writing such ugly spaghetti code.

 

Don't use fractions in informats when commas are present to mark the decimal dot. Your informat will override the comma in the data.

acordes
Rhodochrosite | Level 12

Okay.

But the problem remains even with your nicely structured code.

What format or informat reads correctly those variables where I wrongly apply the commax format?

 

Kurt_Bremser
Super User

@acordes wrote:

Okay.

But the problem remains even with your nicely structured code.

What format or informat reads correctly those variables where I wrongly apply the commax format?

 


Would you please take the time and read my post in its entirety? I already gave you the answer.

Astounding
PROC Star

Let's focus on this part of the INPUT statement, where you are still not satisfied with the results:

 

PCTIEFEC commax8.5  PCTOMI commax8.5 

As was mentioned, you don't need to be using commax8.5.  Let the raw data indicate where the decimal point should go.  So the first change would be:

 

PCTIEFEC commax8.  PCTOMI commax8. 

Next, I would assume you are happy with the value assigned to PCTIEFEC, but need to fix the value assigned to PCTOMI.  To do that, recognize that the problem is that the first character read for PCTOMI is the semicolon between the variables.  That won't be properly interpreted (generating your log message about invalid data for PCTOMI), unless you add a colon.  Do it for both variables:

 

PCTIEFEC : commax8.  PCTOMI : commax8.

That change should take care of those two variables, and would give you a roadmap to follow if others are causing trouble.

Shmuel
Garnet | Level 18

Try running next code, derived from yours with next changes:

1) omit all informat length, just assign $ for char=type variables;
    one exception for dates to be read with :yymmdd10. informat;

2) Add a LENGTH statement to assign the max length of char=type variables;

3) %let is a declarative statement and not a datastep statemnt.
     better move it before the datastep.

4) add a FORMAT statement to dates;

 

FILENAME REFFILE '/caslibs/risknb/IFRS9_CONTRATOS_202008.TXT' encoding=wlatin1;     

%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
data public.test    ;
 LENGTH <all char-type variables and their max length> ;
 INFILE REFFILE delimiter = ';' truncover MISSOVER  lrecl=32767 firstobs=2 n=1000;
 INPUT CODAAEST CODMMEST CODEMPRE CODOPERA $ temp1 temp2  CODIGIF $ FECFORMO $
   FECVECI TIPOCLIE $ RATICLI  XEMPLEAD $ CODPRODU $ ESTVEHIC $ PCTIEFEC   PCTOMI 
   IMPCAPE IMPSALVE  FECMATRI $ FECREVEH $ IMPVRESI  FECPRIAC $ XIDPOCI $ XIDFALL $
   XDEFAULT $ XIDSUBJ $ FECDEFAU $ IMPPDDEF  IMPVCDEF  XISTAGE  IMPDOTAC  QDIAATI  
   QDIASMAD QMESESDF  PCTPD  PCTLGD  REMTBOOK TIMEHORI QAAPDTDF  MPEXPDF  IMPDOTCU 
   IMPDOTDU IMPDOTVR  FECDATOS $ USOFUTUR  CODSEGPD  CODSELGD  ;
run;

 

 

acordes
Rhodochrosite | Level 12

Thanks to @ballardw for the hint with the : and the informat.

Thanks to @Astounding for the commax explanation.

Thanks to @Shmuel for the final tips & tricks that let me approach the final solution.

and thanks to @Kurt_Bremser for his patience and for the incentive he gives me to keep learning more.

 

The final code is as follows, like an ensemble.

FILENAME REFFILE '/caslibs/risknb/IFRS9_CONTRATOS_202008.TXT' encoding=wlatin1;     

%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
data public.test    ;
LENGTH CODOPERA $12 CODIGIF $10 XEMPLEAD $1 CODPRODU $2 ESTVEHIC $1 XIDPOCI $1 XIDFALL $1 XDEFAULT $1
XIDSUBJ $1;

informat FECFORMO FECVECI FECMATRI FECREVEH FECPRIAC FECDEFAU fecdatos yymmdd10. ;
format FECFORMO FECVECI FECMATRI FECREVEH FECPRIAC FECDEFAU fecdatos date9. PCTIEFEC PCTOMI PCTPD PCTLGD percent9.2;

INFILE REFFILE delimiter = ';' truncover MISSOVER  lrecl=32767 firstobs=2 n=1000;
INPUT CODAAEST CODMMEST CODEMPRE CODOPERA $ temp1 temp2  CODIGIF $ FECFORMO $
   FECVECI TIPOCLIE $ RATICLI  XEMPLEAD $ CODPRODU $ ESTVEHIC $ PCTIEFEC : commax8.  PCTOMI : commax8. 
   IMPCAPE IMPSALVE  FECMATRI $ FECREVEH $ IMPVRESI  FECPRIAC $ XIDPOCI $ XIDFALL $
   XDEFAULT $ XIDSUBJ $ FECDEFAU $ IMPPDDEF  IMPVCDEF  XISTAGE  IMPDOTAC  QDIAATI  
   QDIASMAD QMESESDF  PCTPD : commax8.   PCTLGD : commax8.  REMTBOOK TIMEHORI QAAPDTDF  MPEXPDF  IMPDOTCU 
   IMPDOTDU IMPDOTVR  FECDATOS $ USOFUTUR  CODSEGPD  CODSELGD  ;

array div100(4) PCTIEFEC PCTOMI PCTPD PCTLGD;

do i=1 to dim(div100);
div100(i)=div100(i)/100;
end;

drop temp: i;

run;
Tom
Super User Tom
Super User

It looks a lot like you have mixed up the syntax of the LENGTH statement and the INPUT statement.

 

If this a LENGTH statement then just want to use $nnn for the character variables. No need for the period, lengths are always integer values and are not format specifications.  So you cannot use an informat like $CHAR in a LENGTH statement.

 

If it is an INPUT statement you do not want to read all of those numeric variables as the single digit in column 8.  You should just remove the 8 and not have an informat for those variables.  And for the variables where you do want to include an informat specification you want to add the colon modifier in front to prevent the informat from reading past the delimiters.

 

It is much clearer and easier if you first define the variables with a LENGTH statement. Then the INPUT statement just needs to list the variables. So you are not forcing the INPUT statement to do double duty.   If any of the variables need to be read with a special informat (and most numbers or character strings do NOT need special input instructions) then add an INFORMAT statement. If any of the variables need special instructions for how to display them (and most numbers and character strings do NOT need special output instructions) then add a FORMAT statement.

 

So perhaps your data step will look something like this. 

data public.test ;
  infile refile dsd dlm= ';' truncover firstobs=2 ;
  length CODAAEST CODMMEST CODEMPRE 8
         CODOPERA $12 
         temp1 temp2 CODIGIF FECFORMO $10  
         FECVECI $20 TIPOCLIE $32 RATICLI 8 
         XEMPLEAD CODPRODU ESTVEHIC $32 
         PCTIEFEC PCTOMI IMPCAPE IMPSALVE 8
         FECMATRI FECREVEH $32 
         IMPVRESI 8 
         FECPRIAC XIDPOCI XIDFALL XDEFAULT XIDSUBJ FECDEFAU $32 
         IMPPDDEF IMPVCDEF  XISTAGE IMPDOTAC QDIAATI QDIASMAD  8 
         QMESESDF PCTPD PCTLGD REMTBOOK TIMEHORI QAAPDTDF IMPEXPDF  8
         IMPDOTCU IMPDOTDU IMPDOTVR  8 
         FECDATOS $32 
         USOFUTUR CODSEGPD CODSELGD  8 
  ;
  input CODAAEST -- CODSELGD ;
run;

 

acordes
Rhodochrosite | Level 12
Tom, that's very didactic. Thank you

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 3895 views
  • 5 likes
  • 6 in conversation