DATA Step, Macro, Functions and more

deleting and keeping observation in tables

Reply
Frequent Contributor
Posts: 76

deleting and keeping observation in tables

 

Hi,

Pass and fail information is extracting every day from source system table called X to table A

Now Table A has 10+ years (increasing day by day) history (result variable has pass and fail info) so

 

1) I want to delete all the fail information and keep pass information all the time from Table A and want to keep only 6 months information of fails in table A every time.(pass info all the time and fails only for 6 months in table A)

data a B;
set A;
if Result="F" then output b ;
 else if  Result="F" and dt>=intnx('month',today(),-6,'s')then  output a ;

I think I am missing Pass information here.

2) And save all deleted fail information in table B and I want to keep only 4 years of fail information all the time in table B.

logic is not correct ,I think

 

data B;
set b;
if Result="F" and dt>=intnx('year',today(),-4,'s');
run;

This process should be automated every day.

Thanks,

SS

Super User
Posts: 13,321

Re: deleting and keeping observation in tables

I think I am missing Pass information here.

Yes. And if this code is actually what you ran:

 

data a B;
set A;
if Result="F" then output b ;
 else if  Result="F" and dt>=intnx('month',today(),-6,'s')then  output a ;

You will have to start over, hopefully with a backup of data set A. You possibly just created an empty data set A since there are no "else if Result="F" and " records. All the F went to b. I would check your log or other method to see if A has zero observations.

 

 

Use of the code structure is very risky as it will replace the data set A. If you have an incorrect record selection then you will have records gone and you hopefully can rebuild A from something.

data A;
   set A;

 

I suspect it might help to use some actually descriptive data set names as you are throwing A and B around a lot and the words of your description are a little confusing because of repeated use of the same reference but should in actuality be different data sets.

May be something like:

Data failall fail4year fail6month pass;
   set longtermdata;
   if result='F' then output failall;
   if result='F' and dt ge intnx('month',today(),-6,'S') then output fail6month;
   if result='F' and dt ge intnx('year',today(),-4,'S') then output fail4year;
   if result ne 'F'  then output pass;
run;
Frequent Contributor
Posts: 76

Re: deleting and keeping observation in tables

sorry for the confusion. I want to keep 6 months fail information and all pass information in A table.
and I want to keep 4 years of fail information in Table B.
we can build the logic as said but I need to set this logic in A and B
Respected Advisor
Posts: 4,673

Re: deleting and keeping observation in tables

[ Edited ]

@sathya66

What I would be doing to avoid constant table growth and performance degradation while still having all the data available:

- create monthly slices (separate tables per month)

- create views over the monthly slices to support the most common queries

 

To get there you would now have to split your big table into monthly tables and create the views one off process.

You would have to change your load process to create and load further monthly slices and update the views as an ongoing process.

 

To give you an idea how this could look like, below some code which splits a big table into monthly tables and then creates a view over some of the monthly slices. The code is dynamic and would work for any dates in the source table.


/** create sample HAVE data **/
data have;
  format dt date9.;
  do dt=intnx('year',today(),-10) to today();
    status='fail'; output;
    status='pass'; output;
  end;
run;

/** split sample data into monthly tables **/
proc sql;
  create table _dist_months as
  select distinct(put(dt,yymmn.6 -l)) as yymm length=6
  from have
  ;
quit;

/* generate code to split the data */
filename codegen temp;
data _null_;
/*  file print;*/
  file codegen;

  put @1 'data';
  do i=1 to nobs;
    set _dist_months nobs=nobs;
    put @3 'Fail_' yymm 'Pass_' yymm;
  end;
  put @3  ';';

  put @3 'set have;';

  put @1 'if status="fail" then' /
      @3 'do;' ;
  do i=1 to nobs;
    set _dist_months nobs=nobs;
    if i=1 then
      do;
        put @5 'if put(dt,yymmn6.)="' yymm +(-1) '" then output Fail_' yymm ';';
      end;
    else
      do;
        put @5 'else if put(dt,yymmn6.)="' yymm +(-1) '" then output Fail_' yymm ';';
      end;
  end;
  put @3 'end;';

  put @1 'else if status="pass" then' /
      @3 'do;' ;
  do i=1 to nobs;
    set _dist_months nobs=nobs;
    if i=1 then
      do;
        put @5 'if put(dt,yymmn6.)="' yymm +(-1) '" then output Pass_' yymm ';';
      end;
    else
      do;
        put @5 'else if put(dt,yymmn6.)="' yymm +(-1) '" then output Pass_' yymm ';';
      end;
  end;
  put @3 'end;';

  put @1 'run;';

  stop;

run;

/* execute generated code */
%include codegen /source2;


/** Create views over monthly slices **/
/* as an example: view V_Fail6 for last 6 months of fails */
proc sql;
  create table dslist as
    select libname, memname
    from dictionary.tables
    where libname='WORK' and scan(memname,1,'_') = 'FAIL'
    order by memname DESC;
  ;
quit;

/* generate the code */
filename codegen temp;
data _null_;
  file codegen;
  set dslist end=last;
  if _n_=1 then
    do;
      put
        'proc sql;' /
        '  create view V_Fail6 as' /
        '    select * from ' libname +(-1)'.' memname
        ;
    end;
  else
    put '    union corr all select * from ' libname +(-1)'.' memname;
  if last or _n_=6 then 
    do;
      put 
        '  ;' /
        'quit;' 
        ;
      stop;
    end;
run;

/* execute the code */
%include codegen / source2;

 

Creating monthly slices has also the advantage that you never will have to delete records in these tables. You just insert/append new records.

For archiving purposes: You just zip, move, delete or whatever monthly slices (=files on disk).

 

Ask a Question
Discussion stats
  • 3 replies
  • 102 views
  • 1 like
  • 3 in conversation