Hi, greetings!
I appreciate your presence, because you are always available whenever I need a help.
Thank you so much in advance, and the problem which is troubling me is the following:
I have a sort of brokerage houses' announcements data on a daily basis.
House_id | Annc_dates | House_close |
1 | 7/1/1996 | 0 |
1 | 8/21/1996 | 0 |
1 | 9/4/1996 | 0 |
1 | 10/1/1996 | 0 |
1 | 11/1/1996 | 0 |
1 | 12/26/1996 | 0 |
1 | 1/28/1997 | 0 |
1 | 2/24/1997 | 0 |
1 | 3/21/1997 | 0 |
1 | 4/29/1997 | 0 |
1 | 5/15/1997 | 0 |
1 | 8/14/1997 | 0 |
1 | 11/3/1997 | 1 |
…......... | ||
5 | 4/20/2000 | 0 |
5 | 5/12/2000 | 0 |
5 | 6/5/2000 | 0 |
5 | 10/12/2000 | 0 |
5 | 11/9/2000 | 0 |
5 | 12/14/2000 | 0 |
5 | 1/26/2001 | 0 |
5 | 2/27/2001 | 0 |
5 | 3/6/2001 | 0 |
5 | 5/23/2001 | 0 |
5 | 7/31/2001 | 0 |
5 | 9/11/2001 | 1 |
6 | 11/11/1987 | 0 |
6 | 12/3/1987 | 1 |
What I want to do, is to highlight one-year time span of observations before a house closes (that is, from the last date of a house in the data). For this, I created a variable "House_close" with 1 for the closing (last) dates, and 0 otherwise.
What I want to obtain is like the below:
House_id | Annc_dates | House_close | Span_closing |
1 | 7/1/1996 | 0 | 0 |
1 | 8/21/1996 | 0 | 0 |
1 | 9/4/1996 | 0 | 0 |
1 | 10/1/1996 | 0 | 0 |
1 | 11/1/1996 | 0 | 1 |
1 | 12/26/1996 | 0 | 1 |
1 | 1/28/1997 | 0 | 1 |
1 | 2/24/1997 | 0 | 1 |
1 | 3/21/1997 | 0 | 1 |
1 | 4/29/1997 | 0 | 1 |
1 | 5/15/1997 | 0 | 1 |
1 | 8/14/1997 | 0 | 1 |
1 | 11/3/1997 | 1 | 1 |
…......... | |||
5 | 4/20/2000 | 0 | 0 |
5 | 5/12/2000 | 0 | 0 |
5 | 6/5/2000 | 0 | 0 |
5 | 10/12/2000 | 0 | 1 |
5 | 11/9/2000 | 0 | 1 |
5 | 12/14/2000 | 0 | 1 |
5 | 1/26/2001 | 0 | 1 |
5 | 2/27/2001 | 0 | 1 |
5 | 3/6/2001 | 0 | 1 |
5 | 5/23/2001 | 0 | 1 |
5 | 7/31/2001 | 0 | 1 |
5 | 9/11/2001 | 1 | 1 |
6 | 11/11/1987 | 0 | 1 |
6 | 12/3/1987 | 1 | 1 |
As you see, I created a new variable "Span_closing." For House_id = 1, the closing date is 11/3/1997. So, tracking all the way back to one year ago, the closest date is 11/1/1996. Over that one-year time span, I entered Span_closing = 1, otherwise = 0.
In case of House_id = 6, although it has only two observations, I entered Span_closing = 1 for all its obs based on the indicated principle.
I ask for your wisdom!
Sincerely,
KS -,
data have;
input House_id Annc_dates :mmddyy10. House_close;
format Annc_dates mmddyy10.;
datalines;
1 7/1/1996 0
1 8/21/1996 0
1 9/4/1996 0
1 10/1/1996 0
1 11/1/1996 0
1 12/26/1996 0
1 1/28/1997 0
1 2/24/1997 0
1 3/21/1997 0
1 4/29/1997 0
1 5/15/1997 0
1 8/14/1997 0
1 11/3/1997 1
5 4/20/2000 0
5 5/12/2000 0
5 6/5/2000 0
5 10/12/2000 0
5 11/9/2000 0
5 12/14/2000 0
5 1/26/2001 0
5 2/27/2001 0
5 3/6/2001 0
5 5/23/2001 0
5 7/31/2001 0
5 9/11/2001 1
6 11/11/1987 0
6 12/3/1987 1
;
data want(drop = a);
do until (last.House_id);
set have;
by House_id;
if House_close = 1 then a = Annc_dates;
end;
do until (last.House_id);
set have;
by House_id;
Span_closing = (Annc_dates >= intnx('year', a, -1, 's'));
output;
end;
run;
Why do you want 11/1/1996 to be included? It is not part of the 1 year span?
It does.
Result
House_id Annc_dates House_close Span_closing 1 07/01/1996 0 0 1 08/21/1996 0 0 1 09/04/1996 0 0 1 10/01/1996 0 0 1 11/01/1996 0 0 1 12/26/1996 0 1 1 01/28/1997 0 1 1 02/24/1997 0 1 1 03/21/1997 0 1 1 04/29/1997 0 1 1 05/15/1997 0 1 1 08/14/1997 0 1 1 11/03/1997 1 1 5 04/20/2000 0 0 5 05/12/2000 0 0 5 06/05/2000 0 0 5 10/12/2000 0 1 5 11/09/2000 0 1 5 12/14/2000 0 1 5 01/26/2001 0 1 5 02/27/2001 0 1 5 03/06/2001 0 1 5 05/23/2001 0 1 5 07/31/2001 0 1 5 09/11/2001 1 1 6 11/11/1987 0 1 6 12/03/1987 1 1
Please, anyone from SAS community provide me a solution, if you are available.
Dear PeterClemmensen,
In relation to my previous task, I again bumped into an obstacle.
Now, I proc sorted my dataset by cusip(=firm id), Analyst_id, Annc_date, and Span_closing.
Span_closing is the final 1-year span before a house closes.
cusip | House_id | Analyst_id | Annc_date | Span_closing |
00030710 | 2143 | 134474 | 2/23/2016 | 0 |
00030710 | 2143 | 134474 | 3/18/2016 | 0 |
00030710 | 2143 | 134474 | 8/5/2016 | 1 |
00030710 | 2143 | 134474 | 11/4/2016 | 1 |
00030710 | 2143 | 134474 | 2/27/2017 | 1 |
00030710 | 2143 | 134474 | 3/1/2017 | 1 |
00030710 | 3945 | 165909 | 10/23/2015 | 1 |
00030710 | 3945 | 165909 | 2/5/2016 | 1 |
00030710 | 3945 | 165909 | 2/24/2016 | 1 |
00032Q10 | 926 | 194378 | 8/30/2021 | 1 |
00032Q10 | 926 | 194378 | 11/11/2021 | 1 |
00036020 | 371 | 1112 | 3/22/1994 | 0 |
00036020 | 371 | 1112 | 6/7/1994 | 0 |
00036020 | 371 | 1112 | 11/21/1994 | 0 |
00036110 | 32 | 136 | 6/29/1989 | 0 |
00036110 | 32 | 136 | 1/4/1990 | 0 |
00036110 | 32 | 136 | 3/29/1990 | 1 |
00036110 | 32 | 136 | 5/10/1990 | 1 |
00036110 | 32 | 136 | 8/2/1990 | 1 |
00036110 | 32 | 136 | 8/15/1990 | 1 |
00036110 | 32 | 136 | 8/30/1990 | 1 |
00036110 | 32 | 136 | 10/24/1990 | 1 |
00036110 | 32 | 136 | 10/31/1990 | 1 |
00036110 | 32 | 136 | 1/10/1991 | 1 |
00036110 | 479 | 136 | 5/2/1991 | 0 |
00036110 | 479 | 136 | 8/27/1991 | 0 |
00036110 | 479 | 136 | 9/12/1991 | 0 |
00036110 | 479 | 136 | 10/1/1991 | 0 |
00036110 | 479 | 136 | 10/31/1991 | 0 |
00036110 | 479 | 136 | 12/30/1991 | 0 |
00036110 | 479 | 136 | 4/2/1992 | 0 |
00036110 | 479 | 136 | 7/1/1992 | 0 |
00036110 | 479 | 136 | 12/21/1992 | 0 |
00036110 | 479 | 136 | 1/26/1993 | 0 |
00036110 | 165 | 278 | 9/8/1988 | 0 |
In this dataset, you can see that Analyst_id= 134474, who was working for the house 2143, stopped working (i.e., disappeared) as soon as the house 2143 closes (Span_closing=1, especially on 3/1/2017).
It is also the case with Analyst_id= 165909 working for the house 3945, and with Analyst_id= 194378 working for the house 926. You can simply leave such cases.
However, Analyst_id= 136, who used to work for the house 32, enters the final year span of the house 32 (Span_closing=1), and after the house closes, he continues to work for another house 479. For such an analyst like him, I'd like to create a new variable "Analyst_persist" that highlights his 1-year span (5/2/1991~4/2/1992) in which he works for a new house.
That is, I'd like to create a dataset like below:
cusip | House_id | Analyst_id | Annc_date | Span_closing | Analyst_persist |
00030710 | 2143 | 134474 | 2/23/2016 | 0 | 0 |
00030710 | 2143 | 134474 | 3/18/2016 | 0 | 0 |
00030710 | 2143 | 134474 | 8/5/2016 | 1 | 0 |
00030710 | 2143 | 134474 | 11/4/2016 | 1 | 0 |
00030710 | 2143 | 134474 | 2/27/2017 | 1 | 0 |
00030710 | 2143 | 134474 | 3/1/2017 | 1 | 0 |
00030710 | 3945 | 165909 | 10/23/2015 | 1 | 0 |
00030710 | 3945 | 165909 | 2/5/2016 | 1 | 0 |
00030710 | 3945 | 165909 | 2/24/2016 | 1 | 0 |
00032Q10 | 926 | 194378 | 8/30/2021 | 1 | 0 |
00032Q10 | 926 | 194378 | 11/11/2021 | 1 | 0 |
00036020 | 371 | 1112 | 3/22/1994 | 0 | 0 |
00036020 | 371 | 1112 | 6/7/1994 | 0 | 0 |
00036020 | 371 | 1112 | 11/21/1994 | 0 | 0 |
00036110 | 32 | 136 | 6/29/1989 | 0 | 0 |
00036110 | 32 | 136 | 1/4/1990 | 0 | 0 |
00036110 | 32 | 136 | 3/29/1990 | 1 | 0 |
00036110 | 32 | 136 | 5/10/1990 | 1 | 0 |
00036110 | 32 | 136 | 8/2/1990 | 1 | 0 |
00036110 | 32 | 136 | 8/15/1990 | 1 | 0 |
00036110 | 32 | 136 | 8/30/1990 | 1 | 0 |
00036110 | 32 | 136 | 10/24/1990 | 1 | 0 |
00036110 | 32 | 136 | 10/31/1990 | 1 | 0 |
00036110 | 32 | 136 | 1/10/1991 | 1 | 0 |
00036110 | 479 | 136 | 5/2/1991 | 0 | 1 |
00036110 | 479 | 136 | 8/27/1991 | 0 | 1 |
00036110 | 479 | 136 | 9/12/1991 | 0 | 1 |
00036110 | 479 | 136 | 10/1/1991 | 0 | 1 |
00036110 | 479 | 136 | 10/31/1991 | 0 | 1 |
00036110 | 479 | 136 | 12/30/1991 | 0 | 1 |
00036110 | 479 | 136 | 4/2/1992 | 0 | 1 |
00036110 | 479 | 136 | 7/1/1992 | 0 | 0 |
00036110 | 479 | 136 | 12/21/1992 | 0 | 0 |
00036110 | 479 | 136 | 1/26/1993 | 0 | 0 |
00036110 | 165 | 278 | 9/8/1988 | 0 | 0 |
I tried to make a little modification of your previous codes and achieve this, but, unfortunately, my understanding of the sas function intnx is very limited and I couldn't do it.
I'd really appreciate it, if you give me a hand one more time!
Sincerely,
KS -,
data have;
input House_id Annc_dates :mmddyy10. House_close;
format Annc_dates mmddyy10.;
datalines;
1 7/1/1996 0
1 8/21/1996 0
1 9/4/1996 0
1 10/1/1996 0
1 11/1/1996 0
1 12/26/1996 0
1 1/28/1997 0
1 2/24/1997 0
1 3/21/1997 0
1 4/29/1997 0
1 5/15/1997 0
1 8/14/1997 0
1 11/3/1997 1
5 4/20/2000 0
5 5/12/2000 0
5 6/5/2000 0
5 10/12/2000 0
5 11/9/2000 0
5 12/14/2000 0
5 1/26/2001 0
5 2/27/2001 0
5 3/6/2001 0
5 5/23/2001 0
5 7/31/2001 0
5 9/11/2001 1
6 11/11/1987 0
6 12/3/1987 1
;
data want(drop = a);
do until (last.House_id);
set have;
by House_id;
if House_close = 1 then a = Annc_dates;
end;
do until (last.House_id);
set have;
by House_id;
Span_closing = (Annc_dates >= intnx('year', a, -1, 's'));
output;
end;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.