Hi, greetings!
I appreciate your presence, because you are always available whenever I need a help.
Thank you so much in advance, and the problem which is troubling me is the following:
I have a sort of brokerage houses' announcements data on a daily basis.
House_id | Annc_dates | House_close |
1 | 7/1/1996 | 0 |
1 | 8/21/1996 | 0 |
1 | 9/4/1996 | 0 |
1 | 10/1/1996 | 0 |
1 | 11/1/1996 | 0 |
1 | 12/26/1996 | 0 |
1 | 1/28/1997 | 0 |
1 | 2/24/1997 | 0 |
1 | 3/21/1997 | 0 |
1 | 4/29/1997 | 0 |
1 | 5/15/1997 | 0 |
1 | 8/14/1997 | 0 |
1 | 11/3/1997 | 1 |
…......... | ||
5 | 4/20/2000 | 0 |
5 | 5/12/2000 | 0 |
5 | 6/5/2000 | 0 |
5 | 10/12/2000 | 0 |
5 | 11/9/2000 | 0 |
5 | 12/14/2000 | 0 |
5 | 1/26/2001 | 0 |
5 | 2/27/2001 | 0 |
5 | 3/6/2001 | 0 |
5 | 5/23/2001 | 0 |
5 | 7/31/2001 | 0 |
5 | 9/11/2001 | 1 |
6 | 11/11/1987 | 0 |
6 | 12/3/1987 | 1 |
What I want to do, is to highlight one-year time span of observations before a house closes (that is, from the last date of a house in the data). For this, I created a variable "House_close" with 1 for the closing (last) dates, and 0 otherwise.
What I want to obtain is like the below:
House_id | Annc_dates | House_close | Span_closing |
1 | 7/1/1996 | 0 | 0 |
1 | 8/21/1996 | 0 | 0 |
1 | 9/4/1996 | 0 | 0 |
1 | 10/1/1996 | 0 | 0 |
1 | 11/1/1996 | 0 | 1 |
1 | 12/26/1996 | 0 | 1 |
1 | 1/28/1997 | 0 | 1 |
1 | 2/24/1997 | 0 | 1 |
1 | 3/21/1997 | 0 | 1 |
1 | 4/29/1997 | 0 | 1 |
1 | 5/15/1997 | 0 | 1 |
1 | 8/14/1997 | 0 | 1 |
1 | 11/3/1997 | 1 | 1 |
…......... | |||
5 | 4/20/2000 | 0 | 0 |
5 | 5/12/2000 | 0 | 0 |
5 | 6/5/2000 | 0 | 0 |
5 | 10/12/2000 | 0 | 1 |
5 | 11/9/2000 | 0 | 1 |
5 | 12/14/2000 | 0 | 1 |
5 | 1/26/2001 | 0 | 1 |
5 | 2/27/2001 | 0 | 1 |
5 | 3/6/2001 | 0 | 1 |
5 | 5/23/2001 | 0 | 1 |
5 | 7/31/2001 | 0 | 1 |
5 | 9/11/2001 | 1 | 1 |
6 | 11/11/1987 | 0 | 1 |
6 | 12/3/1987 | 1 | 1 |
As you see, I created a new variable "Span_closing." For House_id = 1, the closing date is 11/3/1997. So, tracking all the way back to one year ago, the closest date is 11/1/1996. Over that one-year time span, I entered Span_closing = 1, otherwise = 0.
In case of House_id = 6, although it has only two observations, I entered Span_closing = 1 for all its obs based on the indicated principle.
I ask for your wisdom!
Sincerely,
KS -,
data have;
input House_id Annc_dates :mmddyy10. House_close;
format Annc_dates mmddyy10.;
datalines;
1 7/1/1996 0
1 8/21/1996 0
1 9/4/1996 0
1 10/1/1996 0
1 11/1/1996 0
1 12/26/1996 0
1 1/28/1997 0
1 2/24/1997 0
1 3/21/1997 0
1 4/29/1997 0
1 5/15/1997 0
1 8/14/1997 0
1 11/3/1997 1
5 4/20/2000 0
5 5/12/2000 0
5 6/5/2000 0
5 10/12/2000 0
5 11/9/2000 0
5 12/14/2000 0
5 1/26/2001 0
5 2/27/2001 0
5 3/6/2001 0
5 5/23/2001 0
5 7/31/2001 0
5 9/11/2001 1
6 11/11/1987 0
6 12/3/1987 1
;
data want(drop = a);
do until (last.House_id);
set have;
by House_id;
if House_close = 1 then a = Annc_dates;
end;
do until (last.House_id);
set have;
by House_id;
Span_closing = (Annc_dates >= intnx('year', a, -1, 's'));
output;
end;
run;
Why do you want 11/1/1996 to be included? It is not part of the 1 year span?
It does.
Result
House_id Annc_dates House_close Span_closing 1 07/01/1996 0 0 1 08/21/1996 0 0 1 09/04/1996 0 0 1 10/01/1996 0 0 1 11/01/1996 0 0 1 12/26/1996 0 1 1 01/28/1997 0 1 1 02/24/1997 0 1 1 03/21/1997 0 1 1 04/29/1997 0 1 1 05/15/1997 0 1 1 08/14/1997 0 1 1 11/03/1997 1 1 5 04/20/2000 0 0 5 05/12/2000 0 0 5 06/05/2000 0 0 5 10/12/2000 0 1 5 11/09/2000 0 1 5 12/14/2000 0 1 5 01/26/2001 0 1 5 02/27/2001 0 1 5 03/06/2001 0 1 5 05/23/2001 0 1 5 07/31/2001 0 1 5 09/11/2001 1 1 6 11/11/1987 0 1 6 12/03/1987 1 1
Please, anyone from SAS community provide me a solution, if you are available.
Dear PeterClemmensen,
In relation to my previous task, I again bumped into an obstacle.
Now, I proc sorted my dataset by cusip(=firm id), Analyst_id, Annc_date, and Span_closing.
Span_closing is the final 1-year span before a house closes.
cusip | House_id | Analyst_id | Annc_date | Span_closing |
00030710 | 2143 | 134474 | 2/23/2016 | 0 |
00030710 | 2143 | 134474 | 3/18/2016 | 0 |
00030710 | 2143 | 134474 | 8/5/2016 | 1 |
00030710 | 2143 | 134474 | 11/4/2016 | 1 |
00030710 | 2143 | 134474 | 2/27/2017 | 1 |
00030710 | 2143 | 134474 | 3/1/2017 | 1 |
00030710 | 3945 | 165909 | 10/23/2015 | 1 |
00030710 | 3945 | 165909 | 2/5/2016 | 1 |
00030710 | 3945 | 165909 | 2/24/2016 | 1 |
00032Q10 | 926 | 194378 | 8/30/2021 | 1 |
00032Q10 | 926 | 194378 | 11/11/2021 | 1 |
00036020 | 371 | 1112 | 3/22/1994 | 0 |
00036020 | 371 | 1112 | 6/7/1994 | 0 |
00036020 | 371 | 1112 | 11/21/1994 | 0 |
00036110 | 32 | 136 | 6/29/1989 | 0 |
00036110 | 32 | 136 | 1/4/1990 | 0 |
00036110 | 32 | 136 | 3/29/1990 | 1 |
00036110 | 32 | 136 | 5/10/1990 | 1 |
00036110 | 32 | 136 | 8/2/1990 | 1 |
00036110 | 32 | 136 | 8/15/1990 | 1 |
00036110 | 32 | 136 | 8/30/1990 | 1 |
00036110 | 32 | 136 | 10/24/1990 | 1 |
00036110 | 32 | 136 | 10/31/1990 | 1 |
00036110 | 32 | 136 | 1/10/1991 | 1 |
00036110 | 479 | 136 | 5/2/1991 | 0 |
00036110 | 479 | 136 | 8/27/1991 | 0 |
00036110 | 479 | 136 | 9/12/1991 | 0 |
00036110 | 479 | 136 | 10/1/1991 | 0 |
00036110 | 479 | 136 | 10/31/1991 | 0 |
00036110 | 479 | 136 | 12/30/1991 | 0 |
00036110 | 479 | 136 | 4/2/1992 | 0 |
00036110 | 479 | 136 | 7/1/1992 | 0 |
00036110 | 479 | 136 | 12/21/1992 | 0 |
00036110 | 479 | 136 | 1/26/1993 | 0 |
00036110 | 165 | 278 | 9/8/1988 | 0 |
In this dataset, you can see that Analyst_id= 134474, who was working for the house 2143, stopped working (i.e., disappeared) as soon as the house 2143 closes (Span_closing=1, especially on 3/1/2017).
It is also the case with Analyst_id= 165909 working for the house 3945, and with Analyst_id= 194378 working for the house 926. You can simply leave such cases.
However, Analyst_id= 136, who used to work for the house 32, enters the final year span of the house 32 (Span_closing=1), and after the house closes, he continues to work for another house 479. For such an analyst like him, I'd like to create a new variable "Analyst_persist" that highlights his 1-year span (5/2/1991~4/2/1992) in which he works for a new house.
That is, I'd like to create a dataset like below:
cusip | House_id | Analyst_id | Annc_date | Span_closing | Analyst_persist |
00030710 | 2143 | 134474 | 2/23/2016 | 0 | 0 |
00030710 | 2143 | 134474 | 3/18/2016 | 0 | 0 |
00030710 | 2143 | 134474 | 8/5/2016 | 1 | 0 |
00030710 | 2143 | 134474 | 11/4/2016 | 1 | 0 |
00030710 | 2143 | 134474 | 2/27/2017 | 1 | 0 |
00030710 | 2143 | 134474 | 3/1/2017 | 1 | 0 |
00030710 | 3945 | 165909 | 10/23/2015 | 1 | 0 |
00030710 | 3945 | 165909 | 2/5/2016 | 1 | 0 |
00030710 | 3945 | 165909 | 2/24/2016 | 1 | 0 |
00032Q10 | 926 | 194378 | 8/30/2021 | 1 | 0 |
00032Q10 | 926 | 194378 | 11/11/2021 | 1 | 0 |
00036020 | 371 | 1112 | 3/22/1994 | 0 | 0 |
00036020 | 371 | 1112 | 6/7/1994 | 0 | 0 |
00036020 | 371 | 1112 | 11/21/1994 | 0 | 0 |
00036110 | 32 | 136 | 6/29/1989 | 0 | 0 |
00036110 | 32 | 136 | 1/4/1990 | 0 | 0 |
00036110 | 32 | 136 | 3/29/1990 | 1 | 0 |
00036110 | 32 | 136 | 5/10/1990 | 1 | 0 |
00036110 | 32 | 136 | 8/2/1990 | 1 | 0 |
00036110 | 32 | 136 | 8/15/1990 | 1 | 0 |
00036110 | 32 | 136 | 8/30/1990 | 1 | 0 |
00036110 | 32 | 136 | 10/24/1990 | 1 | 0 |
00036110 | 32 | 136 | 10/31/1990 | 1 | 0 |
00036110 | 32 | 136 | 1/10/1991 | 1 | 0 |
00036110 | 479 | 136 | 5/2/1991 | 0 | 1 |
00036110 | 479 | 136 | 8/27/1991 | 0 | 1 |
00036110 | 479 | 136 | 9/12/1991 | 0 | 1 |
00036110 | 479 | 136 | 10/1/1991 | 0 | 1 |
00036110 | 479 | 136 | 10/31/1991 | 0 | 1 |
00036110 | 479 | 136 | 12/30/1991 | 0 | 1 |
00036110 | 479 | 136 | 4/2/1992 | 0 | 1 |
00036110 | 479 | 136 | 7/1/1992 | 0 | 0 |
00036110 | 479 | 136 | 12/21/1992 | 0 | 0 |
00036110 | 479 | 136 | 1/26/1993 | 0 | 0 |
00036110 | 165 | 278 | 9/8/1988 | 0 | 0 |
I tried to make a little modification of your previous codes and achieve this, but, unfortunately, my understanding of the sas function intnx is very limited and I couldn't do it.
I'd really appreciate it, if you give me a hand one more time!
Sincerely,
KS -,
data have;
input House_id Annc_dates :mmddyy10. House_close;
format Annc_dates mmddyy10.;
datalines;
1 7/1/1996 0
1 8/21/1996 0
1 9/4/1996 0
1 10/1/1996 0
1 11/1/1996 0
1 12/26/1996 0
1 1/28/1997 0
1 2/24/1997 0
1 3/21/1997 0
1 4/29/1997 0
1 5/15/1997 0
1 8/14/1997 0
1 11/3/1997 1
5 4/20/2000 0
5 5/12/2000 0
5 6/5/2000 0
5 10/12/2000 0
5 11/9/2000 0
5 12/14/2000 0
5 1/26/2001 0
5 2/27/2001 0
5 3/6/2001 0
5 5/23/2001 0
5 7/31/2001 0
5 9/11/2001 1
6 11/11/1987 0
6 12/3/1987 1
;
data want(drop = a);
do until (last.House_id);
set have;
by House_id;
if House_close = 1 then a = Annc_dates;
end;
do until (last.House_id);
set have;
by House_id;
Span_closing = (Annc_dates >= intnx('year', a, -1, 's'));
output;
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.