BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mmh
Obsidian | Level 7 mmh
Obsidian | Level 7

Hello!
I am a new SAS user.

I have a time-series data which contains the return of stocks from all the companies from 01-Jan-2021 to 31-Dec-2021.
I have put a small sample of my data and the my expected output below for your kind consideration. The dataset is attached at the end.
Here is my sample data:

 

Company_ID Date Return Reporting_Date
1001 05JAN2021 0.03
1001 06JAN2021 0.09
1001 07JAN2021 0.05
1001 08JAN2021 0.06
1001 11JAN2021 0.07
1001 12JAN2021 0.08 12JAN2021
1001 13JAN2021 0.09
1001 14JAN2021 0.1
1001 15JAN2021 0.11
1001 18JAN2021 0.12
1001 19JAN2021 0.13
1001 02FEB2021 0.23
1001 03FEB2021 0.24
1001 04FEB2021 0.25
1001 05FEB2021 0.26
1001 08FEB2021 0.27
1001 09FEB2021 0.26 09FEB2021
1001 10FEB2021 0.29
1001 11FEB2021 0.3
1001 12FEB2021 0.31
1001 15FEB2021 0.32
1002 15FEB2021 0.39
1002 16FEB2021 0.34 19JAN2021
1002 17FEB2021 0.35
1002 18FEB2021 0.36
1002 21FEB2021 0.37
1002 22FEB2021 0.38
1003 23FEB2021 0.44
1003 24FEB2021 0.45
1003 25FEB2021 0.46
1003 26FEB2021 0.47 07JAN2021
1003 27FEB2021 0.48
1004 08JAN2021 0.03
1004 11JAN2021 0.06 04JAN2021
1004 12JAN2021 0.09
1004 13JAN2021 0.12
1004 14JAN2021 0.15

 

This is my expected output:

Company_IDReporting_DateDays_beforeDays_after
10012021-01-12515
10012021-02-09164
10022021-01-1914
10032021-01-0731
10042021-01-0413

 

Please note that the days cannot be simply calculated by considering them as continuous calendar days. As you will notice, these are trading days and the holidays are not included in the dataset.

Thank you in advance for your kind support!

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

If your sample data were in the form of a working data step, I would test this program.  It isn't, so this is an untested response.

 

data want (drop=ndays i);

  /* Count the days for a given ID*/
  do ndays=1 by 1 until (last.company_id);
    set have;
    by company_id;
  end;

  /*Reread the same observations*/
  do i=1 to ndays;
    set have;
    if reporting_date^=. then do;
      days_before=i-1;
      days_after=ndays-i;
      output;
    end;
  end;
run;

Or applying the same logic more compactly:

 

data want (drop=ndays i);
  set have (in=firstpass)  have (in=secondpass);
  by company_id;
  if first.company_id then call missing(ndays,i);
  ndays+firstpass;

  i+secondpass;
  if secondpass and reporting_date^=.;
  days_before=i-1;
  days_after=ndays-i;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

3 REPLIES 3
mkeintz
PROC Star

If your sample data were in the form of a working data step, I would test this program.  It isn't, so this is an untested response.

 

data want (drop=ndays i);

  /* Count the days for a given ID*/
  do ndays=1 by 1 until (last.company_id);
    set have;
    by company_id;
  end;

  /*Reread the same observations*/
  do i=1 to ndays;
    set have;
    if reporting_date^=. then do;
      days_before=i-1;
      days_after=ndays-i;
      output;
    end;
  end;
run;

Or applying the same logic more compactly:

 

data want (drop=ndays i);
  set have (in=firstpass)  have (in=secondpass);
  by company_id;
  if first.company_id then call missing(ndays,i);
  ndays+firstpass;

  i+secondpass;
  if secondpass and reporting_date^=.;
  days_before=i-1;
  days_after=ndays-i;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mmh
Obsidian | Level 7 mmh
Obsidian | Level 7
Thank you so much for your time!
It was really helpful 🙂
Ksharp
Super User
proc sql;
create table want as
select Company_ID,Reporting_Date,
(select count(*) from have where Company_ID=a.Company_ID and Date<a.Date) as Days_before,
(select count(*) from have where Company_ID=a.Company_ID and Date>a.Date) as Days_after
 from have as a
  where Reporting_Date is not missing;
quit;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 489 views
  • 2 likes
  • 3 in conversation