Identify New Appearances: If a company's "RIC" is in this quarter but wasn't in the previous quarter, I want to mark it as 1; otherwise, it's 0. This helps me know when a company appears for the first time or reappears after some time.
Spot Disappearances: If a company's "RIC" is here in this quarter but disappears in the next quarter, I also want to mark it as 1; otherwise, it's 0. This helps me know when a company goes away.
Handling Comebacks After a Gap: If a company's "RIC" is there in one quarter, then it disappears for a couple of quarters, and then comes back, I want SAS to treat it as a "new" appearance. So, it should get a 1 to show that it's starting fresh.
I tried following code but its not working. It would be helpful, if you can help.
data data1;
set data1;
informat RIC $6.;
informat QuarterYear $6.;
by RIC QuarterYear;
retain last_quarter last_ric existed_in_current_quarter first_ric;
if last_quarter ne . and (QuarterYear ne intnx('quarter', last_quarter, 1) or missing(QuarterYear)) then do;
first_ric = .;
end;
if last_ric ne . and last_ric = RIC then existed_in_current_quarter = 0;
else existed_in_current_quarter = 1;
if first_ric = . and existed_in_current_quarter = 1 then first_ric = RIC;
output;
last_quarter = QuarterYear;
last_ric = RIC;
run;
@Fatima123456 wrote:
...
I tried following code but its not working. It would be helpful, if you can help.
"Not working" means what? The computer exploded?
And this
data data1;
set data1;
is a bad idea. Always.
Without seeing the log and the data, it is hardly possible to help you.
QuarterYear should be numeric with a date format attached displaying quarter and year only.
Something like this:
data want;
set data1;
by ric quarter;
prev_quarter=lag(quarter);
if first.quarter then first_quarter=1;
else first_quarter=0;
if not first.quarter then do;
if intck('quarter',prev_quarter,quarter) > 1 then gap=1;
else if intck('quarter',prev_quarter,quarter)=1 then gap=0;
end;
drop prev_quarter;
run;
But I am making some assumptions here. It would be much more helpful to provide a portion of the actual data (or even fake data) as working SAS data step code (examples and instructions) so we can know if the above code will indeed work. That way I wouldn't have to assume what types of values are contained in the variable QUARTER. It is always better if you provide a portion of your data (following the instructions) so I don't have to assume or guess, and possibly get it wrong. It would also be helpful if you provide more detail about this: "then it disappears for a couple of quarters" as I'm not really sure what you mean by "a couple of quarters", so I didn't try to program that part. (But whatever you mean, you should be able to modify the above code to achieve it, please give it a try). And agreeing with @andreas_lds , data 1; set data1; is a poor practice that will cause problems at some point in your programming.
Hi Paige, thank you for your prompt response. The issue pertains to the sample data. Specifically, the 'G' data series exhibits a jump from 2002Q4 to 2003Q3, with two missing quarters in between. I am now seeking a code solution capable of detecting and identifying such gaps. Consequently, I aim to generate a new column for a dummy variable that signifies when 'G' or any other ric first appeared. Even after a gap, it should be considered as the first occurrence. It's important to note that 'G' is just one example, and there are numerous similar cases in the dataset. I've spent several hours attempting to address this, including trying the code you provided, but I haven't achieved the desired results. I hope this additional information clarifies the situation, and I would greatly appreciate any further assistance you can offer.
Ric Quarteryear
JNJ 2010Q4
JNJ 2011Q1
IR 2011Q2
IR 2011Q3
IR 2011Q4
IR 2012Q1
IR 2012Q2
G 2002Q3
G 2002Q4
G 2003Q3
G 2003Q4
G 2004Q1
G 2004Q2
G 2004Q3
IRM 2004Q4
IRM 2005Q1
IRM 2005Q2
The reason I provide specific examples and instructions for providing data is so that I can use to the SAS code that you provide to re-create (a portion of) your exact data. But you didn't provide the data that way. Without that, I need to specifically ask: is the variable QUARTER numeric or character (text)? How is it formatted? Please provide both according to PROC CONTENTS.
In addition, in the code I did provide, you get a flag if there is a gap, and a flag for first occurrence, doesn't that meet your needs here?
First things first.
Do not (as in never) record dates as character variables. Even if you only have year and month (or year and quarter in your case), record them as two numeric variables. And if you also have day-of-month, then make a SAS date value (as I have done in creating variable END_OF_QUARTER in data set view NEED below by assuming last-day-of-month for every record):
dm 'clear log';
data have;
input Ric $3. Quarteryear :$6. ;
datalines;
JNJ 2010Q4
JNJ 2011Q1
IR 2011Q2
IR 2011Q3
IR 2011Q4
IR 2012Q1
IR 2012Q2
G 2002Q3
G 2002Q4
G 2003Q3
G 2003Q4
G 2004Q1
G 2004Q2
G 2004Q3
IRM 2004Q4
IRM 2005Q1
IRM 2005Q2
run;
data need (drop=_:)/view=need;
set have;
_year=input(quarteryear,4.);
_month=3*input(char(quarteryear,6),1.);
end_of_quarter=mdy(_month,1,_year);
end_of_quarter=intnx('month',end_of_quarter,0,'end');
format end_of_quarter date9. ;
run;
data want (drop=_:);
set need (keep=ric);
by ric notsorted;
merge need
need (firstobs=2 keep=end_of_quarter rename=(end_of_quarter=nxt_end_of_quarter));
_prev_gap_size=intck('qtr',lag(end_of_quarter),end_of_quarter)-1;
_next_gap_size=intck('qtr',end_of_quarter,nxt_end_of_quarter)-1;
new_appear = (first.ric=1) or (_prev_gap_size>=1);
spot_disappear = (last.ric=1) or (_nxt_gap_size>=1);
comeback = (first.ric=0) and (_prev_gap_size>=2);
run;
The real logic here is to use INTCK function to calculate the number of quarters from the LAGged and current values of END_OF_QUARTER - to record appearances or comebacks. And use the MERGE statement (notice the FIRSTOBS=2 parameter) to make available the value for the upcoming END_OF_QUARTER, to compare to the current value to identify upcoming gaps.
If INTCK generates a value of 1 (for a 1-quarter interval), then the gap is zero quarters.
I don't see any mention of a variable that indicates a specific company. Is this mythical (to me at least) RIC unique to a specific company? Exactly one RIC is associated with a company? If the answer to either of those is "NO" then your data is possibly insufficient to answer this question properly.
I suspect that RIC is an id for a Regulated Investment Company.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.