Re: Dropping rolling fields off monthly dataset?

deleted_user · Posted 01-18-2008 01:26 PM

It has to be Friday before a three day weekend, because I am completely drawing a blank on this -- It should be relatively easy to do, but I'm not seeing it...

Say I have a dataset that looks like this last month:

DATA SUBSET;
.....
FX200601='xxxx'; FX200701='xxxx';
FX200602='xxxx'; FX200702='xxxx';
FX200603='xxxx'; FX200703='xxxx';
FX200604='xxxx'; FX200704='xxxx';
FX200605='xxxx'; FX200705='xxxx';
FX200606='xxxx'; FX200706='xxxx';
FX200608='xxxx'; FX200708='xxxx';
FX200609='xxxx'; FX200709='xxxx';
FX200610='xxxx'; FX200710='xxxx';
FX200611='xxxx'; FX200711='xxxx';
FX200612='xxxx'; FX200712='xxxx';

This month, new field FX200801='xxxx' will be added to the dataset, but I want to delete off the 25th prior field (FX200601).. In essence I only want the last 24 occurances.. The field names will always have the 4 digit year and month in the name.

Any ideas on how to do this the quickest & easiest way?

Thanks
-=James

Report Inappropriate Content · Posted 01-18-2008 05:05 PM

I don't know if this is the quickest or easiest, but I'll take a shot . . .

These look like variables (i.e. columns) in a dataset. I would consider using PROC CONTENTS to get a list of the variable names. Filter this list to get the variables that start with 'FX200%'. Then sort descending and put the first 24 occurrences into a macro variable. You can use this macro variable to reference the 24 most recent columns.

For example:
DATA SUBSET;
FX200601='xxxx'; FX200701='xxxx'; FX200801='xxxx';
FX200602='xxxx'; FX200702='xxxx';
FX200603='xxxx'; FX200703='xxxx';
FX200604='xxxx'; FX200704='xxxx';
FX200605='xxxx'; FX200705='xxxx';
FX200606='xxxx'; FX200706='xxxx';
FX200607='xxxx'; FX200707='xxxx';
FX200608='xxxx'; FX200708='xxxx';
FX200609='xxxx'; FX200709='xxxx';
FX200610='xxxx'; FX200710='xxxx';
FX200611='xxxx'; FX200711='xxxx';
FX200612='xxxx'; FX200712='xxxx';
run;

proc contents data=subset out=contents noprint;
run;

proc sort data=contents (keep=NAME where=(name like 'FX200%')); by descending name; run;

proc sql noprint;
select name into :varlist separated by " "
from contents (obs=24);
quit;

data newset;
set subset (keep=&varlist);
run;

deleted_user · Posted 01-21-2008 10:08 AM

just another way ...
since "latest" is defined as the one to add, and the one to remove is two years before, a small amount of work with macro variables should reduce the issue....
(assuming latest is last month)[pre]
%let latest = %sysfunc( intnx( month, "&sysdate9"d, -1), yymmN6 );
%let remove = %sysfunc( intnx( month, "&sysdate9"d, -25), yymmN6 ); [/pre]
Write code so that the new column are added "after" existing columns, then the data steps can use an array statement to refer to the whole picture. [pre]
option symbolgen ;
data new_data_set ;
set old_data_set( drop= FX&remove );
fx&latest = ....some formula... ;
array fxs(24) fx: ; *all columns beginning FX;
*avoid unwanted columns having the FX prefix;
rolling_mean24 = mean( of fxs(*) );
run;
[/pre]

Good Luck

PeterC

deleted_user · Posted 01-22-2008 01:04 PM

Thank you 1162 and Peter - that is very helpful!

I went with 1162's way due to me seeing it first, but appreciate the help!
-=James

Dropping rolling fields off monthly dataset?