BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
inid
Obsidian | Level 7

I have a data set structured like below. 

 

IDDateStatus
x1/31/20170
x1/31/20170
x2/28/20171
y1/31/20170
y2/28/20171
z1/31/20170
z1/31/20170

 

I want to get the maximum value of column status 1 year from actual date. to do this i am using proc sql but it isn't functioning as it should (i am quite new to this). 

 

what i want to get is:

IDDateStatusStatus after 1 year
112/31/201600
11/31/201701
11/31/20181 

 

what i get:

IDDateStatusStatus after 1 year
112/31/201601
11/31/201701
11/31/201811

 

The code i  have used:

Proc sql;

Select Customer ID, Date, Status, Max (Status) as Status after 1 year

from db t1

where Date between t1.Date and t1.date + 365

group by Customer ID, Date

 

thank you in advance. is there any way to do this without proc sql? if not could you please provide any hint on how to proceed? I am using sas enterprise guide 7.1

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@inid 

Nothing to be sorry about. Below code should return what you're after.

The yrdif() function as I've used it calculates the AGE between two dates. As I understand you this should be something between 0 and exactly 1 (if the end date is one day later yrdif() would return a value of 1 with some decimals).

data have;
 infile datalines truncover dlm=' ' dsd;
 input ID Observation_date:anydtdte. Status;
 format Observation_date date9.;
 datalines;
1 31/12/2016 0
1 31/01/2017 0
1 31/12/2017 1
1 31/01/2018 1
1 31/03/2018 2
1 31/01/2020 3
;
run;

proc sql;
  select distinct
    l.*, 
    max(r.status) as max_status_within_year
  from 
    have l
      left join
    have r
      on 
        l.id=r.id 
        and yrdif(l.Observation_date,r.Observation_date) between 0 and 1
/*        and yrdif(intnx('month',l.Observation_date,0,'b'),intnx('month',r.Observation_date,0,'b')) between 0 and 1*/
/*        and intck('month',l.Observation_date,r.Observation_date) between 0 and 12*/
  group by l.id,l.Observation_date
  having max(r.status)=r.status
  ;
quit;

The first alternative condition in comments first aligns the two dates to the beginning of the month and only then calculates the AGE. 

The 2nd alternative condition calculates the number of moths boundaries between two dates (31Jan2017, 01Feb2017 would return 1).

View solution in original post

13 REPLIES 13
ChrisNZ
Tourmaline | Level 20

Welcome to the forums!

1. Your input and output dates are different

2. Your input and output IDs are different

3. Post input data as a data step

How to convert datasets to data steps   (You can use this macro)
How to post code

4. What does "1 year from actual date" mean? On the very same date a year later? Is there an interval to consider?

inid
Obsidian | Level 7

Hi. thank you for the comment. 

The initial data look something like below:

 

IDObservation DateStatus
112/31/20160
11/31/20170
112/31/20170
11/31/20181

 

For each ID and for each data date (i have monthly data) i have their status. I want to add a column where it is shown the maximum value of  Status 1 year from the observation date. For example in the first observation I want to know ID=1 status 1 year from observation date which would be 12/31/2017 and would be equal to 0. The same for the second observation i want to know its status 1 year from the 1/31/2017 so until 1/31/2018 and in this case it would be equal to 1. 

1 year from now would mean 370 days added to observation date (considering that the monthly date is always on end of month but may vary with 1-2 days..

 

What I would like to have as a result is:

IDDateStatusStatus after 1 year
112/31/201600
11/31/201701
112/31/201701
11/31/201811

for the last 2 observations even though 1 full year has not i want to have the maximum value of the available dates. 

ScottBass
Rhodochrosite | Level 12

I pasted this into SAS:

 

ID	Observation Date	Status
1	12/31/2016	0
1	1/31/2017	0
1	12/31/2017	0
1	1/31/2018	1

but I get errors. 

 

Edit your post so we can cut-and-paste your code into SAS.  Don't make us do your work converting your post into valid SAS code.

 

Same goes for your desired results.


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
inid
Obsidian | Level 7

ID            Observation_date           Status

1              31/12/2016         0

1              31/01/2017         0

1              31/12/2017         0

1              31/01/2018         1

 

Sorry I did not have access to SAS when I wrote the issue. 

this is the data copy pasted from SAS data. 

 

The code I use is Below:

Proc sql;

   SELECT

          ID,OBSERVATION_DATE, Status, MAX(Status) FORMAT=BEST4. AS Status_after_1_year

      FROM work.datatest t1

       where   Observation_date between t1.Observation_date  and t1.Observation_date+365

       GROUP BY ID

           oRDER BY Observation_date asc;

QUIT;

 

the results i get:

ID

Observation_date

Status

Status_after_1_year

1

31/12/2016

0

1

1

31/01/2017

0

1

1

31/12/2017

0

1

1

31/01/2018

1

1

 In the first  observation i want the  "Status_after_1_year" to be =0 as the maximum value 1 is reached only on 31 January 2018 and therefore the value=1 should start at the second observation. 

 

I look forward to your feedback. 

thank you in advance

PeterClemmensen
Tourmaline | Level 20

And you want to do so without PROC SQL, correct?

inid
Obsidian | Level 7
No with Proc SQL is fine as well.
PeterClemmensen
Tourmaline | Level 20

Ok. Here is a data step approach, that gives you what you want

 

data have;
input ID Observation_date:ddmmyy10. Status;
format Observation_date ddmmyy10.;
datalines;
1 31/12/2016 0
1 31/01/2017 0
1 31/12/2017 0
1 31/01/2018 1
2 31/12/2016 0
2 31/01/2017 0
2 31/12/2017 0
2 31/01/2018 1
;

data want(drop=max_date);
   do until (last.id);
      set have;
      by id;
      if Observation_date ge Max_date then Max_date=Observation_date;
   end;

   do until (last.id);
      set have;
      by id;
      Status_after_1_year=0;
      if intck('day', Observation_date, Max_date) le 365 then
         Status_after_1_year=1;
      output;
   end;
run;

This gives

 

 

Capture.PNG

inid
Obsidian | Level 7

Maybe I was not clear but this does not work from me since i want to know the maximum value column "status" has and not just the difference between the two dates. For example if the data would be as below:

ID            Observation_date           Status

1              31/12/2016         0

1              31/01/2017         0

1              31/12/2017         1

1              31/01/2018         2

then the desired output wold be:

ID

Observation_date

Status

Status_after_1_year

1

31/12/2016

0

1

1

31/01/2017

0

2

1

31/12/2017

0

2

1

31/01/2018

1

2

Patrick
Opal | Level 21

@inid 

To better derive the rules you want to implement can you please show us how the desired result would look like for below sample data.

data have;
 infile datalines truncover dlm=' ' dsd;
 input ID Observation_date:anydtdte. Status;
 format Observation_date date9.;
 datalines;
1 31/12/2016 0
1 31/01/2017 0
1 31/12/2017 1
1 31/01/2018 1
1 31/03/2018 2
1 31/01/2020 3
;
run;
inid
Obsidian | Level 7

Hi. the results woul look like below: (Considering that in my database the end of month date is not always the same i want the difference between the two observation date considered to be <=370. )

IDObservation_dateStatusStatus_after_1_yearLogic
131/12/201601Maximum Value of Status from 31/12/2016 to 31/12/2017
131/01/201701Maximum Value of Status from 31/01/2017 to 31/01/2018
131/12/201712Maximum Value of Status from 31/12/2017 to 31/12/2018
131/01/201812Maximum Value of Status from 31/01/2018 to 31/01/2019
231/03/201822Maximum Value of Status from 31/03/2018 to 31/03/2019
331/01/202033Maximum Value of Status from 31/01/2020 to 31/01/2021 
Patrick
Opal | Level 21

@inid 

Sorry for continuing asking instead of providing a solution. I just try to get full clarity first.

Soo... about the 370 days and "Considering that in my database the end of month date is not always the same":

Are you aware that SAS provides calendar functions which would make it very simple to shift any date to the beginning or end of month before a date comparison? There is also a function which could just return the number of years or months or ... between two dates. 

Could any of this be useful for your task? Would shifting the dates prior to comparing them help?

inid
Obsidian | Level 7
I am sorry I am quite new to sas.
It would definitely help t have the end of month date and i was not aware i could achieve it. Never the less my main concern would be get the maximum value of the given column 1 year from the observation date of each row for each ID.
Thank you for the time spent.
Patrick
Opal | Level 21

@inid 

Nothing to be sorry about. Below code should return what you're after.

The yrdif() function as I've used it calculates the AGE between two dates. As I understand you this should be something between 0 and exactly 1 (if the end date is one day later yrdif() would return a value of 1 with some decimals).

data have;
 infile datalines truncover dlm=' ' dsd;
 input ID Observation_date:anydtdte. Status;
 format Observation_date date9.;
 datalines;
1 31/12/2016 0
1 31/01/2017 0
1 31/12/2017 1
1 31/01/2018 1
1 31/03/2018 2
1 31/01/2020 3
;
run;

proc sql;
  select distinct
    l.*, 
    max(r.status) as max_status_within_year
  from 
    have l
      left join
    have r
      on 
        l.id=r.id 
        and yrdif(l.Observation_date,r.Observation_date) between 0 and 1
/*        and yrdif(intnx('month',l.Observation_date,0,'b'),intnx('month',r.Observation_date,0,'b')) between 0 and 1*/
/*        and intck('month',l.Observation_date,r.Observation_date) between 0 and 12*/
  group by l.id,l.Observation_date
  having max(r.status)=r.status
  ;
quit;

The first alternative condition in comments first aligns the two dates to the beginning of the month and only then calculates the AGE. 

The 2nd alternative condition calculates the number of moths boundaries between two dates (31Jan2017, 01Feb2017 would return 1).

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 4364 views
  • 1 like
  • 5 in conversation