deleting incomplete time series

Reply
New Contributor
Posts: 3

deleting incomplete time series

So I have a dataset imported. It has about twenty different time series with annual observations from 1950-2012. The time series start and end in different years for each variable. I want to write a macro to delete time series that have incomplete data in the middle. If the variable has missing observations from say 1950-1960 that's ok, as long as 1961-whenever is continuous (no holes). Basically I'm thinking for each variable I need to record when the time series starts, when it ends and check for missing values in between. I have no idea how to do this...

Super User
Posts: 10,046

Re: deleting incomplete time series

Your sample data ?

I think SQL can do that .

New Contributor
Posts: 3

Re: deleting incomplete time series

How exactly would you use SQL to do so?

Super User
Posts: 10,046

Re: deleting incomplete time series

I need to see your sample data . what does your data look like ?

New Contributor
Posts: 3

Re: deleting incomplete time series

My sample data's at work and I can't access it for now. Each variable is a time series and I have a "date" variable which each observation is a year. Date is like

1950

1951

...

2012

The other time series all start and end at different dates. I just need to get rid of the incomplete ones.

Super User
Posts: 10,046

Re: deleting incomplete time series

It is hard to give you some advice ,since you don't post the sample and the output you need.

Base on my understanding.

if dif(date) ne 1 then delete ;

SAS Employee
Posts: 416

Re: deleting incomplete time series

Hello -

As Ksharp mentioned without seeing the data it is tricky to give advice.

Assuming that you have access to SAS/ETS software - then PROC TIMESERIES might be beneficial.

Here is an example to illustrate what I have in mind.

Thanks,

Udo

*create some test series (6 in total - 3 problematic ones);

data test1;

set sashelp.air;

if date gt "01JAN59"d then delete;

series=1;

run;

data test2;

set sashelp.air;

if date lt "01JAN50"d then delete;

series=2;

run;

data test3;

set sashelp.air;

if date gt "01MAR53"d and date lt "01JUL53"d then delete;

series=3;

run;

data test4;

set sashelp.air;

if date gt "01AUG58"d and date lt "01DEC58"d then delete;

series=4;

run;

data test5;

set sashelp.air;

series=5;

run;

data test6;

set test2;

if date="01AUG53"d then delete;

series=6;

run;

data have;

set test1-test6;

by series;

run;

*I take it that you are interested in flagging series 3,4 and 6 - since they have gaps. All other series are complete - they might be shorter than others, though;

proc timeseries data=have out=_null_ outsum=want(where=(nmiss ne 0) keep=series start end nobs n nmiss);

id date interval=month zeromiss=both;

var air;

by series;

run;

/*

Want data set:

Series START END NOBS N NMISS

3 JAN1949 DEC1960 144 141 3

4 JAN1949 DEC1960 144 141 3

6 JAN1950 DEC1960 132 131 1

*/

Super User
Super User
Posts: 7,079

Re: deleting incomplete time series

I am picturing data with one row per year and 20 variables.

Something like

Year S1 S2 S3

1950 . 100 .

1951 . 102 203

1952 50 103 204

1953 51 . .

So for that data it looks like S1 is recorded from 1952 to 1953, S2 from 1950 to 1952 and S3 from 1951 to 1952.

Do you want to find those boundaries?

Are you worried about data with gaps? For example if S2 was missing in 1951 then you couldn't use 1950 to 1952 because of the gap?

Is this what you are talking about?

Ask a Question
Discussion stats
  • 7 replies
  • 286 views
  • 0 likes
  • 4 in conversation