Tips: Part 2, Identifying and Locating Missing Val...

by Jennifer_beeman_sas_com on 12-18-2014 10:40 AM, edited on 10-05-2015 02:45 PM by ShelleySessoms

Long time series are often filled with missing values and gaps in time, but determining whether your series has missing values, and locating them, isn’t always as easy as printing the data.

As a follow-up to last week’s post, I will now explain how to use PROC TIMEDATA to find gaps in a time series.

I realize that in my prior article I said I would be using PROC TIMESERIES. For this example, however, TIMESERIES and TIMEDATA are interchangeable: with the simple code shown here, you can replace the word TIMEDATA with TIMESERIES, keep the remaining syntax the same, and get the same results.
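To make the swap concrete, here is a minimal sketch of the PROC TIMESERIES form. The data set work.rain_demo and the variable rain are made up for illustration; they stand in for here.neah and varname in the author's call later in this article.

```sas
/* Hypothetical daily series: 03JAN2014 and 04JAN2014 are absent (a gap in time). */
data work.rain_demo;
  format date date9.;
  date = '01JAN2014'd; rain = 1.2; output;
  date = '02JAN2014'd; rain = 0.0; output;
  date = '05JAN2014'd; rain = 3.1; output;
  date = '06JAN2014'd; rain = 0.4; output;
run;

/* Same options and statements as the PROC TIMEDATA call;
   only the procedure name changes. */
proc timeseries data=work.rain_demo outsum=work.outsum2 out=work.out2;
  id date interval=day;   /* declare the series as daily */
  var rain;               /* series to summarize */
run;

proc print data=work.outsum2;
run;
```

With the default settings, the two absent days should appear in the OUT= data set as observations whose rain value is missing, which is what makes the later WHERE filter possible.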

- Using PROC TIMEID: take the spans component and print where spans > 1.
- Using PROC TIMEDATA

PROC TIMEDATA will tell you how many missing values you have, but will not tell you the number of gaps or where they are.

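/* Summarize the series; gaps in time show up as missing values in OUT=. */
proc timedata data=here.neah outsum=outsum1 out=out1;
  id date interval=day;
  var varname;   /* replace varname with your series variable */
run;

/* The OUTSUM= data set holds the summary, including the missing-value count. */
proc print data=outsum1;
run;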

From the OUTSUM data set, you will see this table:

From the OUT data set, you can find where the series is missing by using the following code (here the series variable is _East__mm_, the name used in place of varname above):

proc print data=out1;
  where _East__mm_ = .;
run;

If the data set already contains missing values of its own, possibly including special missing values, rather than just gaps in time, the following WHERE statement is more useful because it does not overlook any special-case missing values.

where nmiss(_East__mm_) > 0;   /* NMISS counts ordinary and special missing values */

Either WHERE statement produces a table of the observations where the variable is missing. The only difference is that special missing values (such as .A or ._) are not equal to the ordinary missing value (.), so if the data contain them as well as gaps, they will not surface with the first code.
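The difference between the two filters is easy to see on a tiny example. The sketch below uses a made-up data set, work.miss_demo, with a hypothetical variable rain that holds one ordinary missing value and one special missing value (.A).

```sas
/* Hypothetical data: one ordinary missing value (.) and one special missing value (.A). */
data work.miss_demo;
  format date date9.;
  date = '01JAN2014'd; rain = 1.2; output;
  date = '02JAN2014'd; rain = .;   output;
  date = '03JAN2014'd; rain = .A;  output;
  date = '04JAN2014'd; rain = 0.4; output;
run;

/* Equality with . matches only the ordinary missing value: 02JAN2014. */
data work.plain_missing;
  set work.miss_demo;
  where rain = .;
run;

/* NMISS counts every kind of missing value: 02JAN2014 and 03JAN2014. */
data work.any_missing;
  set work.miss_demo;
  where nmiss(rain) > 0;
run;

proc print data=work.plain_missing; run;
proc print data=work.any_missing;   run;
```

The MISSING function is another way to write the second filter: where missing(rain); should select the same two observations.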

This would be an interesting topic to talk about at another time (Special Case Missing Values).

Summary:

PROC TIMEID tells you the number of time gaps greater than one interval, and can locate them with a little extra code.

PROC TIMEDATA will tell you how many observations are actually missing.

Which one to use depends on what is more important to you: actual missing values or overall gaps in the data.

Additionally, TIMEID is able to determine the best interval if you are not sure of what it should be.

Both procedures are able to handle this task and will give you desirable results in a short amount of time.
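For readers who want to see the "span greater than one interval" idea in isolation, it can be approximated in a plain DATA step. This is not the PROC TIMEID syntax, just a swapped-in sketch using the LAG and INTCK functions on a made-up data set (work.rain_demo, variable rain); it assumes the data are already sorted by date.

```sas
/* Hypothetical daily series: 03JAN2014 and 04JAN2014 are absent (a gap in time). */
data work.rain_demo;
  format date date9.;
  date = '01JAN2014'd; rain = 1.2; output;
  date = '02JAN2014'd; rain = 0.0; output;
  date = '05JAN2014'd; rain = 3.1; output;
  date = '06JAN2014'd; rain = 0.4; output;
run;

/* Keep only the observations that follow a gap. Assumes date order. */
data work.gaps;
  set work.rain_demo;
  prev_date = lag(date);                                  /* date on the previous row */
  if _n_ > 1 then span = intck('day', prev_date, date);   /* days since previous row  */
  if span > 1;                                            /* span of 1 means no gap   */
  format prev_date date9.;
run;

proc print data=work.gaps;
run;
```

Here the only row kept is 05JAN2014, with prev_date 02JAN2014 and span 3, meaning two daily observations are absent between them.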
