BookmarkSubscribeRSS Feed
Fatima123456
Calcite | Level 5

Identify New Appearances: If a company's "RIC" is in this quarter but wasn't in the previous quarter, I want to mark it as 1; otherwise, it's 0. This helps me know when a company appears for the first time or reappears after some time.
Spot Disappearances: If a company's "RIC" is here in this quarter but disappears in the next quarter, I also want to mark it as 1; otherwise, it's 0. This helps me know when a company goes away.
Handling Comebacks After a Gap: If a company's "RIC" is there in one quarter, then it disappears for a couple of quarters, and then comes back, I want SAS to treat it as a "new" appearance. So, it should get a 1 to show that it's starting fresh.
I tried following code but its not working. It would be helpful, if you can help.

 
data data1;
set data1;


informat RIC $6.;
informat QuarterYear $6.;


by RIC QuarterYear;

retain last_quarter last_ric existed_in_current_quarter first_ric;


if last_quarter ne . and (QuarterYear ne intnx('quarter', last_quarter, 1) or missing(QuarterYear)) then do;
first_ric = .;
end;


if last_ric ne . and last_ric = RIC then existed_in_current_quarter = 0;
else existed_in_current_quarter = 1;

if first_ric = . and existed_in_current_quarter = 1 then first_ric = RIC;

output;

last_quarter = QuarterYear;
last_ric = RIC;
run; 

 

8 REPLIES 8
andreas_lds
Jade | Level 19

@Fatima123456 wrote:

...
I tried following code but its not working. It would be helpful, if you can help.

"Not working" means what? The computer exploded?

And this

data data1;
set data1;

 is a bad idea. Always.

 

Without seeing the log and the data, it is hardly possible to help you.

QuarterYear should be numeric with a date format attached displaying quarter and year only.

Fatima123456
Calcite | Level 5
data flag;
set F13;
by RIC;
retain prev_quarter;

if first.RIC then do;
prev_quarter = '';
new_appearance = 1;
disappearance = 0;
comeback_flag = 0;
end;

if Quarter ne prev_quarter then do;
new_appearance = 1;
disappearance = 0;
comeback_flag = 0;
end;
else do;
new_appearance = 0;
disappearance = 0;
end;

if last.RIC then do;
new_appearance = 0;
disappearance = 0;
comeback_flag = 0;
end;
else if new_appearance = 0 then do;
disappearance = 1;
comeback_flag = 0;
end;

if new_appearance = 1 then do;
comeback_flag = 1;
end;

output;
prev_quarter = Quarter;
run; Sorry for confusion. This is the code I am trying right now but its not giving right results. this command is not picking gaps in data. Quarteryear 2001Q1 2001Q1 2001 Q1 2001Q2 Ric AXP ASD ZNT G.
PaigeMiller
Diamond | Level 26

Something like this:

 

data want;
    set data1;
    by ric quarter;
    prev_quarter=lag(quarter);
    if first.quarter then first_quarter=1;
    else first_quarter=0;
    if not first.quarter then do;
        if intck('quarter',prev_quarter,quarter) > 1 then gap=1;
        else if intck('quarter',prev_quarter,quarter)=1 then gap=0;
    end;
    drop prev_quarter;
run;

 

But I am making some assumptions here. It would be much more helpful to provide a portion of the actual data (or even fake data) as working SAS data step code (examples and instructions) so we can know if the above code will indeed work. That way I wouldn't have to assume what types of values are contained in the variable QUARTER. It is always better if you provide a portion of your data (following the instructions) so I don't have to assume or guess, and possibly get it wrong. It would also be helpful if you provide more detail about this: "then it disappears for a couple of quarters" as I'm not really sure what you mean by "a couple of quarters", so I didn't try to program that part. (But whatever you mean, you should be able to modify the above code to achieve it, please give it a try). And agreeing with @andreas_lds , data 1; set data1; is a poor practice that will cause problems at some point in your programming.

--
Paige Miller
Fatima123456
Calcite | Level 5

Hi Paige, thank you for your prompt response. The issue pertains to the sample data. Specifically, the 'G' data series exhibits a jump from 2002Q4 to 2003Q3, with two missing quarters in between. I am now seeking a code solution capable of detecting and identifying such gaps. Consequently, I aim to generate a new column for a dummy variable that signifies when 'G' or any other ric first appeared. Even after a gap, it should be considered as the first occurrence. It's important to note that 'G' is just one example, and there are numerous similar cases in the dataset. I've spent several hours attempting to address this, including trying the code you provided, but I haven't achieved the desired results. I hope this additional information clarifies the situation, and I would greatly appreciate any further assistance you can offer.

Ric   Quarteryear
JNJ	2010Q4
JNJ	2011Q1
IR	2011Q2
IR	2011Q3
IR	2011Q4
IR	2012Q1
IR	2012Q2
G	2002Q3
G	2002Q4
G	2003Q3
G	2003Q4
G	2004Q1
G	2004Q2
G	2004Q3
IRM	2004Q4
IRM	2005Q1
IRM	2005Q2
PaigeMiller
Diamond | Level 26

The reason I provide specific examples and instructions for providing data is so that I can use to the SAS code that you provide to re-create (a portion of) your exact data. But you didn't provide the data that way. Without that, I need to specifically ask: is the variable QUARTER numeric or character (text)? How is it formatted? Please provide both according to PROC CONTENTS.

 

In addition, in the code I did provide, you get a flag if there is a gap, and a flag for first occurrence, doesn't that meet your needs here?

--
Paige Miller
mkeintz
PROC Star

First things first.

 

Do not  (as in never) record dates as character variables.  Even if you only have year and month (or year and quarter in your case), record them as two numeric variables.  And if you also have day-of-month, then make a SAS date value (as I have done in creating variable END_OF_QUARTER in data set view NEED below by assuming last-day-of-month for every record):

 

dm 'clear log';
data have;
  input Ric $3.  Quarteryear :$6. ;
datalines;
JNJ	2010Q4
JNJ	2011Q1
IR	2011Q2
IR	2011Q3
IR	2011Q4
IR	2012Q1
IR	2012Q2
G	2002Q3
G	2002Q4
G	2003Q3
G	2003Q4
G	2004Q1
G	2004Q2
G	2004Q3
IRM	2004Q4
IRM	2005Q1
IRM	2005Q2
run;

data need (drop=_:)/view=need;
  set have;
  _year=input(quarteryear,4.);
  _month=3*input(char(quarteryear,6),1.);
  end_of_quarter=mdy(_month,1,_year);
  end_of_quarter=intnx('month',end_of_quarter,0,'end');
  format end_of_quarter date9. ;
run;

data want (drop=_:);
  set need (keep=ric);
  by ric  notsorted;

  merge need
        need (firstobs=2 keep=end_of_quarter rename=(end_of_quarter=nxt_end_of_quarter));

  _prev_gap_size=intck('qtr',lag(end_of_quarter),end_of_quarter)-1;
  _next_gap_size=intck('qtr',end_of_quarter,nxt_end_of_quarter)-1;

  new_appear     = (first.ric=1) or  (_prev_gap_size>=1);
  spot_disappear = (last.ric=1)  or  (_nxt_gap_size>=1);
  comeback       = (first.ric=0) and (_prev_gap_size>=2);
run;

The real logic here is to use INTCK function to calculate the number of quarters from the LAGged and current values of END_OF_QUARTER - to record appearances or comebacks.  And use the MERGE statement (notice the FIRSTOBS=2 parameter) to make available the value for the upcoming END_OF_QUARTER, to compare to the current value to identify upcoming gaps.

 

If INTCK generates a value of 1 (for a 1-quarter interval), then the gap is zero quarters.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ballardw
Super User

I don't see any mention of a variable that indicates a specific company. Is this mythical (to me at least) RIC unique to a specific company? Exactly one RIC is associated with a company? If the answer to either of those is "NO" then your data is possibly insufficient to answer this question properly.

mkeintz
PROC Star

@ballardw 

 

I suspect that RIC is an id for a Regulated Investment Company.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 672 views
  • 0 likes
  • 5 in conversation