Identifying each observation with peak and trough period

Accepted Solution Solved
Reply
Contributor
Posts: 51
Accepted Solution

Identifying each observation with peak and trough period

Hi,

I have company Id and event period in file 1 and business cycle- peak and trough in the the file 2. I want to create a column in file 1 named Cycle which will take name peak if the event date in file 1 is within the peak date range in file 2; and trough if the event date in file 1 is within the trough date range in file 2. I know I can use if function to do it. But the original size of business cycle has many dates for peak. So it is difficult to put the date specifically in the if function. Could anyone please tell me how I can do the job easily without using if function? Please see the descriptions of the variables in each file before you run any code.

 

File 2:

Peak variable is the beginning of peak season

Trough variable is the beginning of trough season

So, peak season starts from peak variable and ends at one day before the trough date and trough season starts from trough date and ends at one day before next the peak date.

 

input peak trough;
02/01/1980 10/01/1986
05/01/1995 12/01/2005
01/01/2007 12/01/2015

 

File 1:

id is company id

date is the event date


input id date;
1 10/01/1980
1 12/10/1990
1 09/11/2000
1 06/12/2010
2 08/20/1980
2 12/11/1990
2 08/07/2000
3 05/06/1990
3 02/03/2010
5 05/11/1990
5 11/10/2010

I am sorry I could not put both of the files in sas code format as I do not know how to do it for files with date variables.

 

The output file should look like the following:

Picture1.png

 


Accepted Solutions
Solution
‎05-28-2017 12:00 AM
Grand Advisor
Posts: 17,325

Re: Identifying each observation with peak and trough period

Here's some starter code, I decided that PROC FORMAT was a better way to implement this, especially if you're not familiar with SQL. 

Here's a reference paper on creating formats. 

http://www2.sas.com/proceedings/forum2007/068-2007.pdf

 

I haven't tested the boundary values, I'll leave that to you. However, it works exactly for your sample data.

 

data intervals;
    informat peak trough mmddyy10.;
    format peak trough date9.;
    input peak trough;
    cards;
02/01/1980 10/01/1986
05/01/1995 12/01/2005
01/01/2007 12/01/2015
;
run;



data date_fmt;
    set intervals;
    
    format start end date9.;
    length label $12.;
    
    fmtname = 'date_fmt';
    type = 'N';
    
    lag_trough=lag(trough);

    if _n_=1 then
        do;
            label='peak';
            start=peak;
            end=trough - 1;
            output;
        end;
    else
        do;
            label='trough';
            start=lag_trough;
            end=peak - 1;
            output;
            label='peak';
            start=peak;
            end=trough - 1;
            output;
        end;
        
    keep label start end fmtname type;
run;

proc format cntlin=date_fmt;
run;

data have;
informat id $1. date mmddyy10.;
format date date_fmt.;
input id date;
cards;
1 10/01/1980
1 12/10/1990
1 09/11/2000
1 06/12/2010
2 08/20/1980
2 12/11/1990
2 08/07/2000
3 05/06/1990
3 02/03/2010
5 05/11/1990
5 11/10/2010
;
run;

proc print data=have;
format date date_fmt.;
run;

View solution in original post


All Replies
Grand Advisor
Posts: 17,325

Re: Identifying each observation with peak and trough period

You can create sample data using instructions here:

 

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

This is for a single set of data as illustrated? You don't need to extend it beyond this for multiple groups or anything? 

If so, you can merge results with a SQL JOIN rather than IF condition. 

 

It iwould be easier if you changed the peak/trough data set to be:

 

type date_start date_end

Peak 01jan2017 31jan2017

trough 01Feb2017 28Feb2017

 

 

Contributor
Posts: 51

Re: Identifying each observation with peak and trough period

Thank you for your response. I do not get the code that you wanted to recommend. Could you please give me the code?

Solution
‎05-28-2017 12:00 AM
Grand Advisor
Posts: 17,325

Re: Identifying each observation with peak and trough period

Here's some starter code, I decided that PROC FORMAT was a better way to implement this, especially if you're not familiar with SQL. 

Here's a reference paper on creating formats. 

http://www2.sas.com/proceedings/forum2007/068-2007.pdf

 

I haven't tested the boundary values, I'll leave that to you. However, it works exactly for your sample data.

 

data intervals;
    informat peak trough mmddyy10.;
    format peak trough date9.;
    input peak trough;
    cards;
02/01/1980 10/01/1986
05/01/1995 12/01/2005
01/01/2007 12/01/2015
;
run;



data date_fmt;
    set intervals;
    
    format start end date9.;
    length label $12.;
    
    fmtname = 'date_fmt';
    type = 'N';
    
    lag_trough=lag(trough);

    if _n_=1 then
        do;
            label='peak';
            start=peak;
            end=trough - 1;
            output;
        end;
    else
        do;
            label='trough';
            start=lag_trough;
            end=peak - 1;
            output;
            label='peak';
            start=peak;
            end=trough - 1;
            output;
        end;
        
    keep label start end fmtname type;
run;

proc format cntlin=date_fmt;
run;

data have;
informat id $1. date mmddyy10.;
format date date_fmt.;
input id date;
cards;
1 10/01/1980
1 12/10/1990
1 09/11/2000
1 06/12/2010
2 08/20/1980
2 12/11/1990
2 08/07/2000
3 05/06/1990
3 02/03/2010
5 05/11/1990
5 11/10/2010
;
run;

proc print data=have;
format date date_fmt.;
run;
Contributor
Posts: 51

Re: Identifying each observation with peak and trough period

Thank You very much. Your were very helpful

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 160 views
  • 0 likes
  • 2 in conversation