turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- Identify successive observations

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

04-21-2009 12:20 PM

I want to use only collumns with at least five observations and a gap not greater than two years between observations, so I want to delete the first collumn and only keep the second and thirth.

datalines

1999 1 1 1

2000 1 1 1

2001 1 1 1

2002 . . 1

2003 . . 1

2004 . 1 1

2005 1 1 1

2006 1 . 1

Does annybody know how to do this?

datalines

1999 1 1 1

2000 1 1 1

2001 1 1 1

2002 . . 1

2003 . . 1

2004 . 1 1

2005 1 1 1

2006 1 . 1

Does annybody know how to do this?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

04-21-2009 04:04 PM

Sorry, but I didn't get it...

Being the above dataset the input data, could you please post the result dataset?

Greetings from Portugal.

Daniel Santos at www.cgd.pt.

Being the above dataset the input data, could you please post the result dataset?

Greetings from Portugal.

Daniel Santos at www.cgd.pt.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

04-21-2009 04:17 PM

Could this be helpful to you?

data T01_input;

infile cards;

input year col1 col2 col3;

cards;

1999 1 1 1

2000 1 1 1

2001 1 1 1

2002 . . 1

2003 . . 1

2004 . 1 1

2005 1 1 1

2006 1 . 1

;

run;

proc sort data=T01_input;

by year;

run;

%let variables_to_drop=;

data _NULL_;

set T01_input end=end_of_file;

by year;

array gaps {3} _TEMPORARY_;

array gaps_max {3} _TEMPORARY_;

array counts {3} _TEMPORARY_;

array cols {3} col1-col3;

do i=1 to 3;

if cols{i}=. then gaps{i}+1;

else gaps{i}=0;

if gaps{i}>gaps_max{i} then gaps_max{i}=gaps{i};

if cols{i}=1 then counts{i}+1;

end;

if end_of_file

then do i=1 to 3;

variable=vname(cols{i});

max=gaps_max{i};

count=counts{i};

put "Maximum number of consecutive gaps for" variable ": " max "and number of value: " count;

if max gt 2 and count ge 5 then call symput("variables_to_drop",symget('variables_to_drop')||" "||vname(cols{i}));

end;

run;

%put Variables to drop: &variables_to_drop;

proc sql;

create table T02_output as

select *

from T01_input(drop=&variables_to_Drop);

quit;

This will appear in the log:

Maximum number of consecutive gaps forcol1 : 3 and number of value: 5

Maximum number of consecutive gaps forcol2 : 2 and number of value: 5

Maximum number of consecutive gaps forcol3 : 0 and number of value: 8

Variables to drop: col1

The last step would discard col1 from T01_input.

Is it what you expect?

data T01_input;

infile cards;

input year col1 col2 col3;

cards;

1999 1 1 1

2000 1 1 1

2001 1 1 1

2002 . . 1

2003 . . 1

2004 . 1 1

2005 1 1 1

2006 1 . 1

;

run;

proc sort data=T01_input;

by year;

run;

%let variables_to_drop=;

data _NULL_;

set T01_input end=end_of_file;

by year;

array gaps {3} _TEMPORARY_;

array gaps_max {3} _TEMPORARY_;

array counts {3} _TEMPORARY_;

array cols {3} col1-col3;

do i=1 to 3;

if cols{i}=. then gaps{i}+1;

else gaps{i}=0;

if gaps{i}>gaps_max{i} then gaps_max{i}=gaps{i};

if cols{i}=1 then counts{i}+1;

end;

if end_of_file

then do i=1 to 3;

variable=vname(cols{i});

max=gaps_max{i};

count=counts{i};

put "Maximum number of consecutive gaps for" variable ": " max "and number of value: " count;

if max gt 2 and count ge 5 then call symput("variables_to_drop",symget('variables_to_drop')||" "||vname(cols{i}));

end;

run;

%put Variables to drop: &variables_to_drop;

proc sql;

create table T02_output as

select *

from T01_input(drop=&variables_to_Drop);

quit;

This will appear in the log:

Maximum number of consecutive gaps forcol1 : 3 and number of value: 5

Maximum number of consecutive gaps forcol2 : 2 and number of value: 5

Maximum number of consecutive gaps forcol3 : 0 and number of value: 8

Variables to drop: col1

The last step would discard col1 from T01_input.

Is it what you expect?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

04-22-2009 06:17 AM

Dear Yoba,

this is exactly what I expected, thanks for your help.

greetings from the Netherlands

this is exactly what I expected, thanks for your help.

greetings from the Netherlands

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

04-22-2009 09:36 AM

You're welcome,

Greetings from ... Belgium (Brussels) ;-)

Greetings from ... Belgium (Brussels) ;-)