BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Phil_NZ
Barite | Level 11

Hi SAS community,

Today I faced a problem when trying to exclude the non-trading days from my dataset.

My dataset is described as below:

Type date R

131712	01JAN1987	.
131712	02JAN1987	40
131712	05JAN1987	80
131712	06JAN1987	60
131712	07JAN1987	.
.
.
131712  27DEC2019	.
131712  30DEC2019	.
131712  31DEC2019	.
.
.
28829X  01JAN1987   121.32
28829X  02JAN1987   32.37
28829X  05JAN1987   115.19
.
.
28829X  27DEC2019	.
28829X  30DEC2019	.
28829X  31DEC2019	.
.
.

The first column is the label of the companies (character type) (I have around 10,000 stocks sorted ascendingly), the second column is about the date (ddmmyyyy9.) (daily data from 1/1/1987 to 31/12/2019), and the third column is a numeric variable named "return index" (R)accordingly.

 

My concern here is that: a day on which more than 90% of stocks have zero returns is non-trading days, and then I need to exclude these non-trading days.

Many thanks in advance.

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User
data have;
infile cards truncover;
informat ID $8. date date9.;
format date date9.;
input ID  date   value;
cards;
131712 01JAN1987	.
131712 02JAN1987	40
131712 05JAN1987	80
131712 06JAN1987	60
131712 07JAN1987	.
131712 27DEC2019	.
131712 30DEC2019	.
131712 31DEC2019	.
28829X 01JAN1987   121.32
28829X 02JAN1987   32.37
28829X 05JAN1987   115.19
28829X 27DEC2019	.
28829X 30DEC2019	.
28829X 31DEC2019	.
131713 01JAN1987	.
131714 01JAN1987	.
;;;;
run;

*count missing and not missing;
proc means data=have noprint nway;
class date;
var value;
output out=nMissing NMISS=NMISS;
run;

*convert to a percent;
data pctMissing;
set nMissing;
PCT_MISS = NMISS/_freq_;
run;

*get list of dates where percent missing is over 90%;
proc sql;
create table filteredData as
select * 
from have where date not in (select date from pctMissing where PCT_MISS > 0.5);

quit;

/*slightly more efficient way to do the means and pctMissing data set
*creates a view with a flag as 0/1 for missing which can then be used in proc means to calculate the percentage*/
data temp / view=temp;
set have;

if missing(value) then flag = 1; else flag=0;
run;

proc means data=temp noprint nway;
class date;
var flag;
output out=pctMissing mean=PCT_MISS;
run;

*get list of dates where percent missing is over 90%;
proc sql;
create table filteredData as
select * 
from have where date not in (select date from pctMissing where PCT_MISS > 0.5);

quit;

View solution in original post

5 REPLIES 5
Reeza
Super User
Can you show what you've tried so far?
Phil_NZ
Barite | Level 11

Hi @Reeza , thank you for your reply,

I am not sure about how to code for this question, but from the idea, I postulate one possible solution for this concern is to create a new column (obs_per_day_0) and count through the column R (+1 when R =0). Then after that, we compare obs_per_day_0 with the total number of companies in my dataset. If the number in each day of the column obs_per_day_0 is higher than 90% of the total number, I will exclude this day from the sample.

 

Many thanks.

 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
Reeza
Super User
data have;
infile cards truncover;
informat ID $8. date date9.;
format date date9.;
input ID  date   value;
cards;
131712 01JAN1987	.
131712 02JAN1987	40
131712 05JAN1987	80
131712 06JAN1987	60
131712 07JAN1987	.
131712 27DEC2019	.
131712 30DEC2019	.
131712 31DEC2019	.
28829X 01JAN1987   121.32
28829X 02JAN1987   32.37
28829X 05JAN1987   115.19
28829X 27DEC2019	.
28829X 30DEC2019	.
28829X 31DEC2019	.
131713 01JAN1987	.
131714 01JAN1987	.
;;;;
run;

*count missing and not missing;
proc means data=have noprint nway;
class date;
var value;
output out=nMissing NMISS=NMISS;
run;

*convert to a percent;
data pctMissing;
set nMissing;
PCT_MISS = NMISS/_freq_;
run;

*get list of dates where percent missing is over 90%;
proc sql;
create table filteredData as
select * 
from have where date not in (select date from pctMissing where PCT_MISS > 0.5);

quit;

/*slightly more efficient way to do the means and pctMissing data set
*creates a view with a flag as 0/1 for missing which can then be used in proc means to calculate the percentage*/
data temp / view=temp;
set have;

if missing(value) then flag = 1; else flag=0;
run;

proc means data=temp noprint nway;
class date;
var flag;
output out=pctMissing mean=PCT_MISS;
run;

*get list of dates where percent missing is over 90%;
proc sql;
create table filteredData as
select * 
from have where date not in (select date from pctMissing where PCT_MISS > 0.5);

quit;
Phil_NZ
Barite | Level 11

Thank you @Reeza , it works successfully to me.

 

I learn quite a bit from your solution and your "more efficient" way to get there.

 
 
Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
Shmuel
Garnet | Level 18
data want;
 set have;
     if not missing(R);
run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 703 views
  • 1 like
  • 3 in conversation