Extract Data

Accepted Solution Solved
Reply
Regular Contributor
Posts: 222
Accepted Solution

Extract Data

Hi....I would like to be able to enter a fromdate and todate for a range of date values to extract data from sas datasets with the following names:

Results0405

Results0506

Results0607

Results0708

Final0809

Final0910

Final1011

Current

Each dataset contains data from April 01 thru March 31 for the respective fiscal years and dataset labeled Current contains data for the last 2 years. Any suggestions on how to do this. Thanks in advance.


Accepted Solutions
Solution
‎01-23-2014 09:53 PM
Respected Advisor
Posts: 4,640

Re: Extract Data

I would prefer something lighter such as:

%let fromDate=25AUG2006;

%let toDate=2MAR2010;

data tableRanges;

length tableName $32;

format firstDay lastDay date9.;

input tableName (firstDay lastDay) (:yymmdd.);

datalines;

Project.Results0405  20040401 20050331

Project.Results0506  20050401 20060331

Project.Results0607  20060401 20070331

Project.Results0708  20070401 20080331

Project.Final0809    20080401 20090331

Project.Final0910    20090401 20100331

Project.Final1011    20100401 20110331

Project.Current      20110401 20130331

;

proc sql noprint;

select tableName into :tableList separated by ' '

from tableRanges

where lastDay >= "&fromDate"d and firstDay <= "&toDate"d;

quit;

data want;

set &tableList ;

where claim_date between "&fromDate"d and "&toDate"d;

run;

PG

PG

View solution in original post


All Replies
Respected Advisor
Posts: 4,640

Re: Extract Data

Assuming there is a variable in there called date to select from, here is a suggestion :

%let fromDate=25AUG2006;

%let toDate=2MAR2010;

data want;

set Results0405 Results0506 Results0607 Results0708

    Final0809 Final0910 Final1011 Current ;

where date between "&fromDate"d and "&toDate"d;

run;

PG

PG
Super Contributor
Posts: 644

Re: Extract Data

I think the requirement is to insert dates

%Macro Fromdate (dsname) ;

    %If &dsname =   Current %then

        %Let    FromYear    =   11 ;

    %Else

        %Let    FromYear    =   %Substr (%Sysfunc (Compress (&dsname, ,KD)), 1, 2) ;

    %*  NO semiclolon on next line ;

    "01APR&FromYear"D

%Mend ;

%Macro Todate (dsname) ;

    %If &dsname =   Current %then

        %Let    ToYear    =   13 ;

    %Else

        %Let    ToYear    =   %Substr (%Sysfunc (Compress (&dsname, ,KD)), 3, 2) ;

    %*  NO semiclolon on next line ;

    "31MAR&ToYear"D

%Mend ;

%Macro  Insertdates (dsname) ;

    Data    &dsname ;

        Set     &dsname ;

        Format  Fromdate

                ToDate      Date9.

                ;

        Fromdate    =   %Fromdate (&dsname) ;

        Todate      =   %Todate (&dsname) ;

    Run ;

%Mend ;

%Insertdates (Results0405) ;

%Insertdates (Results0506) ;

%Insertdates (Results0607) ;

%Insertdates (Results0708) ;

%Insertdates (Final0809) ;

%Insertdates (Final0910) ;

%Insertdates (Final1011) ;

%Insertdates (Current) ;

Data    want ;

    Set Results0405

        Results0506

        Results0607

        Results0708

        Final0809

        Final0910

        Final1011

        Current

        ;

Run ;

(Untested code - no data provided!)

Ifyou want the dates on the left of the table move the format statement before the set statement ;

Richard

Super User
Posts: 5,254

Re: Extract Data

I think it's a combination of the two.

Given the dynamic dates, a macro first needs to find out which SAS tables that should be queried, and, potentially, applying a where clase on these table (if the query exceeds a fiscal year boundary).

Question is, where to store the result of the query, perhaps in a common result table, which will be used for further analysis?d

As an architect, i'm not very found of storing data in logically separated tables in the first place, this requirement would be much easier to fulfill if all fiscal years resides in the same table.

Data never sleeps
Regular Contributor
Posts: 222

Re: Extract Data

Hi...I have tried the suggestions and tried to incorporate both suggestions together in the code and too sure whether the macros worked but it seems like the Data Step did work. The problem is that the program ran and cycled through each data set. I would like for the program to run so that it starts to cycle through the dataset that contains the "fromdate" and ends the cycling process at the dataset that contains the "todate" since each dataset is quite large.

I have attached the code as well as the log output which shows the running time. Since the datasets either begin with either Results, Final or Current, I have attempted to identify the beginning dates and ending dates for Results, Final and Current. The range for Results is (begperiod2, enperiod2), Final is (begperiod1, curbendperiod1) and Current is (begperiod, endperiod).

I was hoping that the code would loop through somehow that it would be able to identify  only those datasets that is needed bases on the fromdate and todate.

%include pin.sas
%libname project = 'Y:\PIN Models';

%let fromDate=20060806;
%let toDate=20100302;

DATA _NULL_;
  begperiod = input(strip(put(intnx('Month',intnx('Month',today(),-1,'B'),-23,'B'),yymmddn8.)),8.); /* 20120101 */
  endperiod = input(strip(put(intnx('Month',today(),-1,'E'),yymmddn8.)),8.); /* 20131231 */

  begperiod1 = 20080401;
  endperiod1 = input(strip(put(intnx('Month',intnx('Month',today(),-1,'E'),-24,'E'),yymmddn8.)),8.); /* 20111231 */
  curbbegyear = input(substr(put(intnx('Year.4',today(),0,'B'),yymmddn8.),3,2),2.); /* 13 */
  curendyear = input(substr(put(intnx('Year.4',today(),0,'E'),yymmddn8.),3,2),2.); /* 14 */
  curbbegperiod1 = input(strip(put(intnx('Year.4',today(),-2,'B'),yymmddn8.)),8.); /*20110401 */
  curbendperiod1 = input(strip(put(intnx('Year.4',today(),-2,'E'),yymmddn8.)),8.); /* 20120331 */
  finalbegyear = input(substr(strip(put(intnx('Year.4',today(),-2,'B'),yymmddn8.)),3,2),2.); /* 11 */
  finalendyear = input(substr(strip(put(intnx('Year.4',today(),-2,'E'),yymmddn8.)),3,2),2.); /* 12 *

  begperiod2 = 20040401;
  endperiod2 = 20080331;

CALL Symput ('finalbegyear', finalbegyear);
CALL Symput ('curendyear', curendyear);

run;

%Macro Fromdate (dsname);
    %If &dsname = Current %then
        %Let FromYear = &finalbegyear.;
    %Else
        %Let FromYear = %Substr(%Sysfunc(Compress(&dsname, ,KD)),1,2);
    %*  NO semiclolon on next line ;
"20&FromYear.0401"
%Mend;

%Macro Todate (dsname);
    %If &dsname = Current %then
        %Let ToYear = &curendyear.;
    %Else
        %Let ToYear = %Substr(%Sysfunc(Compress(&dsname, ,KD)),3,2);
    %*  NO semiclolon on next line ;
"20&ToYear.0331"
%Mend;

%Macro Insertdates (dsname);

Data &dsname;
Set &dsname;
  Format Fromdate ToDate YYMMDDN8.;
        Fromdate = %Fromdate (&dsname);
        Todate = %Todate (&dsname);
Run;
%Mend;

%Insertdates (Results0405);
%Insertdates (Results0506);
%Insertdates (Results0607);
%Insertdates (Results0708);
%Insertdates (Final0809);
%Insertdates (Final0910);
%Insertdates (Final1011);
%Insertdates (Final1112);
%Insertdates (Final1213);
%Insertdates (Current);

DATA want;
SET project.Results0405
        project.Results0506
        project.Results0607
        project.Results0708
        project.Final0809
        project.Final0910
        project.Final1011
        project.Final1112
        project.Final1213
        pin.Current;
where (put(claim_date,8.) between "&fromdate" and "&todate");
RUN;

The log output results:

NOTE: There were 0 observations read from the data set PROJECT.Results0405.

      WHERE (PUT(claim_date, 8.)>='20060806' and PUT(claim_date, 8.)<='20100302');

NOTE: There were 0 observations read from the data set PROJECT.Results0506.

      WHERE (PUT(claim_date, 8.)>='20060806' and PUT(claim_date, 8.)<='20100302');

NOTE: There were 7836338 observations read from the data set PROJECT.Results0607.

      WHERE (PUT(claim_date, 8.)>='20060806' and PUT(claim_date, 8.)<='20100302');

NOTE: There were 12340237 observations read from the data set PROJECT.Results0708.

      WHERE (PUT(claim_date, 8.)>='20060806' and PUT(claim_date, 8.)<='20100302');

NOTE: There were 13041240 observations read from the data set PROJECT.Final0809.

      WHERE (PUT(claim_date, 8.)>='20060806' and PUT(claim_date, 8.)<='20100302');

NOTE: There were 12446178 observations read from the data set PROJECT.Final0910.

      WHERE (PUT(claim_date, 8.)>='20060806' and PUT(claim_date, 8.)<='20100302');

NOTE: There were 0 observations read from the data set PROJECT.Final1011.

      WHERE (PUT(claim_date, 8.)>='20060806' and PUT(claim_date, 8.)<='20100302');

NOTE: There were 0 observations read from the data set PROJECT.Final1112.

      WHERE (PUT(claim_date, 8.)>='20060806' and PUT(claim_date, 8.)<='20100302');

NOTE: There were 0 observations read from the data set PROJECT.Final1213.

      WHERE (PUT(claim_date, 8.)>='20060806' and PUT(claim_date, 8.)<='20100302');

NOTE: There were 0 observations read from the data set PIN.Current.

      WHERE (PUT(claim_date, 8.)>='20060806' and PUT(claim_date, 8.)<='20100302');

NOTE: The data set WORK.WANT has 45663993 observations and 61 variables.

NOTE: DATA statement used (Total process time):

      real time           1:47:11.57

      cpu time            8:27.48

Solution
‎01-23-2014 09:53 PM
Respected Advisor
Posts: 4,640

Re: Extract Data

I would prefer something lighter such as:

%let fromDate=25AUG2006;

%let toDate=2MAR2010;

data tableRanges;

length tableName $32;

format firstDay lastDay date9.;

input tableName (firstDay lastDay) (:yymmdd.);

datalines;

Project.Results0405  20040401 20050331

Project.Results0506  20050401 20060331

Project.Results0607  20060401 20070331

Project.Results0708  20070401 20080331

Project.Final0809    20080401 20090331

Project.Final0910    20090401 20100331

Project.Final1011    20100401 20110331

Project.Current      20110401 20130331

;

proc sql noprint;

select tableName into :tableList separated by ' '

from tableRanges

where lastDay >= "&fromDate"d and firstDay <= "&toDate"d;

quit;

data want;

set &tableList ;

where claim_date between "&fromDate"d and "&toDate"d;

run;

PG

PG
Regular Contributor
Posts: 222

Re: Extract Data

Hi PG.....Thanks for your help.....with a slight modification the the date format, it works perfect.

Regular Contributor
Posts: 222

Re: Extract Data

Hi PG,

I have one more question....I would like to modify this part of the code so that it checks to see if the claim_date is between FirstDay and LastDay rather than From Date and ToDate. I tried to and wanted to extract the FirstDay and LastDay from the TableRanges but get the Error Message below.

data want;

set &tableList ;

where claim_date between "&fromDate"d and "&toDate"d;

run;

ERROR: Variable FirstDay is not on file PROJECT.FINAL1011.
ERROR: Variable FirstDay is not on file PROJECT.FINAL1112.
ERROR: Variable FirstDay is not on file PIN.CURRENT.

Respected Advisor
Posts: 4,640

Re: Extract Data

Looks like you want to do:

proc sql noprint;

select tableName into :tableList separated by ' '

from tableRanges

where lastDay >= "&fromDate"d and firstDay <= today();

quit;

data want;

set &tableList ;

where claim_date between "&fromDate"d and today();

run;

 

PG

PG
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 441 views
  • 5 likes
  • 4 in conversation