DATA Step, Macro, Functions and more

Is there a better way to address missing date value issue?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 16
Accepted Solution

Is there a better way to address missing date value issue?

Hi everyone,

Good afternoon. I was trying to resolve the missing date issue as shown in the name "Masters T."  27. I tried several options including dlm dsd missover, and finally the if then statement worked. However, the log still shows:

 

NOTE: Invalid data for Date in line 98 14-20.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
98 "Masters,T." 27
Name=Masters,T. Date=. Amt=27 _ERROR_=1 _N_=5
NOTE: The data set WORK.AMOUNTS has 6 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.10 seconds
cpu time 0.06 seconds

 

Programmatically, what is the correct way to address the missing date value? Please advise. Thank you very much for your review and help.

 

data clients;

infile datalines dsd dlm=' ' MISSOVER;
length Name $15.;
input Name $ EmpID $;
datalines;
"Ankerton,L." 11123
"Davis,R." 22298
"Masters,T." 33351
"Womer,B." 44483
;
data Amounts;
infile datalines dsd dlm=' ' MISSOVER;
length Name $15.;
input Name $
Date date7.
Amt dollar3.;
if Date=' ' then Amt='27';
datalines;
"Ankerton,L." 08OCT96 92
"Ankerton,L." 15Oct96 43
"Davis,R." 04Oct96 16
"Masters,T." 13Oct96 18
"Masters,T."   27
"Thomas,A." 21Oct96 15
;
proc sort data=clients;
by Name;
run;
proc sort data=Amounts;
by Name;
run;
data clientsamtcomb;
merge clients Amounts;
BY Name;
run;
proc print data=clientsamtcomb;
format date date7. Amt dollar3.;
run;


Accepted Solutions
Solution
‎10-10-2017 04:13 PM
Respected Advisor
Posts: 4,925

Re: Is there a better way to address missing date value issue?

I suggest that you use:

 

data Amounts;
infile datalines dsd dlm=' ' TRUNCOVER;
length Name $15;
input 
    Name $ 
    Date ?? :date.
    Amt dollar10.; 
format date yymmdd.;
datalines;
"Ankerton,L." 08OCT96 92
"Ankerton,L." 15Oct96 43
"Davis,R." 04Oct96 16
"Masters,T." 13Oct96 18
"Masters,T."   27
"Thomas,A." 21Oct96 15
;
PG

View solution in original post


All Replies
Super User
Posts: 5,509

Re: Is there a better way to address missing date value issue?

The right way to refer to missing values:

 

if date = . then amount = 27;

 

On a side note, you may need to increase the format width for printing AMOUNT.  If AMOUNT if 110 (which could be read with the DOLLAR3. informat in your top DATA step), you would still need 4 characters (not 3) to print out the value as $110.  So the FORMAT statement in the final PROC PRINT should use:

 

format amount dollar4.;

 

To get back to your original question, the right way to address the data problem is to fix the data.  Try adding an extra blank in the data in place of the missing date.  Then the combination of DSD and DLM=' ' should be able to figure out what belongs where.  No guarantees on that, but worth a try.

Trusted Advisor
Posts: 1,022

Re: Is there a better way to address missing date value issue?

When the date is missing, you see that AMT is not read in.  In your example the number 27 is read with informat DATE7. and fails, and also AMT is set to missing.  So you put in this fix:

   if date=' ' then amt=27;

which has 2 problems

  1. The expression IF DATE=' ' compares the character value " " to the numeric variable DATE.  You should use:
        IF DATE=.
    which will eliminate the "Character values have been converted to numeric values" note

  2. More importantly you're assigning the value 27, which works great if DATE is missing only when the intended AMT is 27.  You need a more general correction:
data amounts;
  infile datalines dsd dlm=' ' MISSOVER;
  length Name $15.;
  input Name $    Date date7.    Amt dollar3.  ;
  format date date9.;
  if date=. and amt=. then amt=input(scan(_infile_,-1,' '),best32.);
datalines;
"Ankerton,L." 08OCT96 92
"Ankerton,L." 15Oct96 43
"Davis,R." 04Oct96 16
"Masters,T." 13Oct96 18
"Masters,T."   27
"Thomas,A." 21Oct96 15
run;

 

This assumes that AMT is always present and is always the last space-separated term in the input line (which is referred to via the automatic variable _INFILE.  The SCAN function gets the "-1" term (i.e. the last term - first from the end).

Solution
‎10-10-2017 04:13 PM
Respected Advisor
Posts: 4,925

Re: Is there a better way to address missing date value issue?

I suggest that you use:

 

data Amounts;
infile datalines dsd dlm=' ' TRUNCOVER;
length Name $15;
input 
    Name $ 
    Date ?? :date.
    Amt dollar10.; 
format date yymmdd.;
datalines;
"Ankerton,L." 08OCT96 92
"Ankerton,L." 15Oct96 43
"Davis,R." 04Oct96 16
"Masters,T." 13Oct96 18
"Masters,T."   27
"Thomas,A." 21Oct96 15
;
PG
Occasional Contributor
Posts: 16

Re: Is there a better way to address missing date value issue?

[ Edited ]

Hi PGStats,

 

Your solution worked really well as it removed the error that I had received when trying with the "IF ...THEN" statement. Thank you so much for your timely help. I appreciate it. The fact that the combination of DSD and MISSOVER or individually on their own should have worked and not sure why they did not work. 

 

Thanks again,

Murali

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 132 views
  • 4 likes
  • 4 in conversation