BookmarkSubscribeRSS Feed
scify
Obsidian | Level 7

I'm getting an odd error when running certain functions on date variables in a dataset.

 

I have the following code:

data example; set base.dataset;
	added_days=0;
	if months>0 then added_days=months*30;

	*Change the release date to account for the additional days;
	new_rel=intnx('days',reldate,added_days);
	format new_rel mmddyy8.;

	*Calculate age at release using YRDIF;
	if dob ne . and new_rel ne . then relage=round(yrdif(dob,new_rel,"AGE"));

	*Add a variable to conditionally exclude non-2005 releases;
	exclude=0;
	if year(new_rel) ne 2005 then exclude=1;
run;

At the intnx() and year() lines, SAS is throwing up a Missing Value warning for about 20% of my observations. I've checked the data, and reldate, new_rel, and added_days have no missing values that I can find. The warning doesn't show on the yrdif() line, presumably because of the "new_rel ne ." condition. The data seems fine when I run procedures on it afterwards.

 

Any suggestions for what's going on here?

11 REPLIES 11
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Can you post a couple of lines of test data which reflects the data you have.  This seems to run fine, so I assume your data is at fault somewhere:

 

data temp;
  reldate="01jan2104"d; dob="10oct2010"d; months=.; output;
  reldate="12jan2104"d; dob="10oct2010"d; months=2; output;
run;
data example; 
  set temp;
	added_days=0;
	if months>0 then added_days=months*30;

	*Change the release date to account for the additional days;
	new_rel=intnx('days',reldate,added_days);
	format new_rel mmddyy8.;

	*Calculate age at release using YRDIF;
	if dob ne . and new_rel ne . then relage=round(yrdif(dob,new_rel,"AGE"));

	*Add a variable to conditionally exclude non-2005 releases;
	exclude=0;
	if year(new_rel) ne 2005 then exclude=1;
run;
scify
Obsidian | Level 7
reldate    months       days     new_rel

16910         0           0      16910
16873         0           0      16873
16785         0           0      16785
16778         0           0      16778
16776         0           0      16776
16897         0           0      16897
17276         6           0      17276
16775         0           0      16775
16778         0           0      16778
16814         0           0      16814
16707         0           0      16707
16789         0           0      16789
16777         0           0      16777
16715         0           0      16715
16722         0           0      16722
16743         5         150      16893
16651         0           0      16651
16687         0           0      16687
16783         2          60      16843
17090         0           0      17090
16768         0           0      16768
16777         0           0      16777
16939         0           0      16939
16611         0           0      16611
16499         0           0      16499
16950         0           0      16950
16791         0           0      16791
16776         0           0      16776
16763         0           0      16763
17122         0           0      17122
16814         0           0      16814
16785         0           0      16785
16716         0           0      16716

The majority of the observations have a value of 0 for months, so added_days is generally 0. The number of observations that are throwing the Missing Value warning (5,541) is much larger than the number of observations with a value other than 0 for added_days (1,563). All of the values for added_days are a multiple of 30, ranging from 30 to 540.

FreelanceReinh
Jade | Level 19

If the data step does not stop prematurely, dataset EXAMPLE will have the same number of observations as BASE.DATASET. So, if NEW_REL has a missing value (generated by applying INTNX to a missing argument, as your description suggests) in the n-th observation of EXAMPLE, I would scrutinize the relevant variable values of n-th observation of BASE.DATASET. Maybe you will find a special missing value, which would not satisfy a condition like "reldate eq .".

scify
Obsidian | Level 7

I'm not getting specific information as to which observations are causing problems. All I see in the log is:

NOTE: Missing values were generated as a result of performing an operation on missing values.
      Each place is given by: (Number of times) at (Line):(Column).
      5541 at 834:13   5541 at 852:8
Reeza
Super User

You most likely have missing data. Does the following generate any missing data in the table? 

 

proc freq data=base.dataset;
table reldate added_dates new_rel/missing;
run;
scify
Obsidian | Level 7

Nope. All observations show as having a non-missing value for those variables.

 

Edit: Which is to say that neither reldate nor months show as missing in the base dataset. added_days and new_rel don't exist there; they're created in the data step I posted as part of the question (and they don't show any missing values, either).

Reeza
Super User

INTNX should have a parameter of DAY not DAYS

 

You can also explicitly output the errors when they occur using PUT statements.

 

data example; set base.dataset;
	added_days=0;
	if months>0 then added_days=months*30;

	*Change the release date to account for the additional days;
	new_rel=intnx('day',reldate,added_days);
	format new_rel mmddyy8.;

        if new_rel = . then do;
            put 'Reldate:'  reldate;
            put 'Added_Days:' added_days;
         end;

	*Calculate age at release using YRDIF;
	if dob ne . and new_rel ne . then relage=round(yrdif(dob,new_rel,"AGE"));

	*Add a variable to conditionally exclude non-2005 releases;
	exclude=0;
	if year(new_rel) ne 2005 then exclude=1;
run;

 

scify
Obsidian | Level 7
SAS doesn't seem to care either way about day vs days, but the PUT is showing that the problem observations are missing on reldate. Which is weird, because there aren't any missing values for that variable on either the input or output dataset.
FreelanceReinh
Jade | Level 19

Earlier you wrote that "All I see in the log is" the note "Missing values were generated ...". However, if INTNX is applied to a missing second (or third) argument, that note should be preceded by a note of the following form:

NOTE: Argument 2 to function INTNX('days',.,1) at line 323 column 10 is invalid.
reldate=. months=0 dob=-2356 added_days=1 new_rel=. relage=. exclude=1 _ERROR_=1 _N_=1234

In this example you would see that observation no. 1234 must contain the missing value. What do the corresponding two lines in your log look like? Actually there should be more than just two, given your 5541 "missing value" notes.

Reeza
Super User
Please post the output from the following, and the log.

proc meansdata=base.dataset N NMISS;
var reldate;
output out=check n=N nmiss=Missing;
run;
Peter_C
Rhodochrosite | Level 12
From the log msg shown here identify the lines in the code where the problem occurs.
insert there
If {var name becoming missing} =. Then do;
Mycount+1; if mycount<5 then put _all_; end;
That should dump all vars to the log and you might see what is going wrong at that point

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1557 views
  • 0 likes
  • 5 in conversation