BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SAM_22
Fluorite | Level 6

Hello

I'm new to SAS and currently working on stock price data. Here is the example of my dataset:

 

Spoiler
DateStock1Stock2Stock3Stock4Stock5Stock6Stock7Stock8
01/06/2000314,11212,4916545098,29 125,51185
02/06/2000314,11212,4914342595,83 115,85185
05/06/2000338,27212,4913645098,29 123,1185
06/06/2000338,27212,49131450100,75 125,51200
07/06/2000314,11212,49133475100,75 135,16200
08/06/2000362,43212,49 525108,12 135,16172,5
09/06/2000338,27212,49 475103,2 139,99200
12/06/2000338,27212,49 500100,75 132,75200
13/06/2000314,11212,49 500100,75 130,34200
14/06/2000314,11212,49 500103,2 135,16200
15/06/2000314,11212,49 500103,2 135,16200
16/06/2000338,27212,49 500108,12 142,4200
19/06/2000362,43212,49 550108,12 147,23217,5
20/06/2000338,27212,49 525120,4 144,82217,5
21/06/2000362,43212,49 550125,32 149,65217,5
22/06/2000362,43212,49 600125,32 164,13200
23/06/2000338,27212,49 600122,86 161,71200
26/06/2000386,6212,49 575120,4 161,71210
27/06/2000338,27212,49 575117,95 161,71190
28/06/2000338,27212,49 575113,03 161,71190
29/06/2000338,27212,49 550113,03 164,13190
30/06/2000338,27212,49 575120,4 168,95190
03/07/2000338,27212,49 550112,05 160,75190
04/07/2000338,27212,49 565117,95 159,3190
05/07/2000347,94212,49 540123,85 159,3190
06/07/2000347,94212,49 555126,3 163,64180,5

What i want to do is to count non-missing values (trading days) every month for every variable (stock 1 to stock 😎 and create new dataset that keep variables with minimum 15 trading days every month, and remove those with less than 15 trading days every month. Based on the dataset above, i will keep variable 1,2,4,5,7, & 8, and drop variable 3 and 6.

 

Or maybe just tell me how to count the non-missing value every month and i will remove the variable manually. I have read some manuals but still couldn't find the solution. Any help/suggestion will be greatly appreciated. Thank you so much

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Well, you didn't tell us but I suspect your data set contains a variable indicating the date.  Let's use DATE for the variable name.

 

proc summary data=have nway;

class date;

format date monyy7.;

var stock1 - stock8;

output out=counts n=;

run;

 

That gets you halfway there.  The new data set COUNTS contains a single observation for each month, and the values for STOCK1 - STOCK8 are the number of nonmissing values found.  

 

Next:  find the minimum counts for any month.

 

proc summary data=counts;

var stock1-stock8;

output out=want min=;

run;

 

If you really have just 8 stocks, the simplest thing to do at this point would be to print the data set WANT and see which variables should be dropped.  By jumping through some hoops, you can automate the process.  Here's one way.

 

Next:  construct a list of values less than 15.

 

data _null_;

set want;

length droplist $ 200;

array stock {8};

do k=1 to 8;

   if stock{k} < 15 then droplist = catx(' ', droplist, vname(stock{k}));

end;

call symputx('droplist', droplist);

run;

 

Then drop the variables:

 

drop final_result;

set have (drop=&droplist);

run;

View solution in original post

5 REPLIES 5
Reeza
Super User

PROC MEANS with NMISS is all you need. You can apply a format to the date to have it handled as monthly data.

https://gist.github.com/statgeek/0cae5568752959b035516d6ac07a20fb

 


@SAM_22 wrote:

Hello

I'm new to SAS and currently working on stock price data. Here is the example of my dataset:

 

Spoiler
Date Stock1 Stock2 Stock3 Stock4 Stock5 Stock6 Stock7 Stock8
01/06/2000 314,11 212,49 165 450 98,29   125,51 185
02/06/2000 314,11 212,49 143 425 95,83   115,85 185
05/06/2000 338,27 212,49 136 450 98,29   123,1 185
06/06/2000 338,27 212,49 131 450 100,75   125,51 200
07/06/2000 314,11 212,49 133 475 100,75   135,16 200
08/06/2000 362,43 212,49   525 108,12   135,16 172,5
09/06/2000 338,27 212,49   475 103,2   139,99 200
12/06/2000 338,27 212,49   500 100,75   132,75 200
13/06/2000 314,11 212,49   500 100,75   130,34 200
14/06/2000 314,11 212,49   500 103,2   135,16 200
15/06/2000 314,11 212,49   500 103,2   135,16 200
16/06/2000 338,27 212,49   500 108,12   142,4 200
19/06/2000 362,43 212,49   550 108,12   147,23 217,5
20/06/2000 338,27 212,49   525 120,4   144,82 217,5
21/06/2000 362,43 212,49   550 125,32   149,65 217,5
22/06/2000 362,43 212,49   600 125,32   164,13 200
23/06/2000 338,27 212,49   600 122,86   161,71 200
26/06/2000 386,6 212,49   575 120,4   161,71 210
27/06/2000 338,27 212,49   575 117,95   161,71 190
28/06/2000 338,27 212,49   575 113,03   161,71 190
29/06/2000 338,27 212,49   550 113,03   164,13 190
30/06/2000 338,27 212,49   575 120,4   168,95 190
03/07/2000 338,27 212,49   550 112,05   160,75 190
04/07/2000 338,27 212,49   565 117,95   159,3 190
05/07/2000 347,94 212,49   540 123,85   159,3 190
06/07/2000 347,94 212,49   555 126,3   163,64 180,5

What i want to do is to count non-missing values (trading days) every month for every variable (stock 1 to stock 😎 and create new dataset that keep variables with minimum 15 trading days every month, and remove those with less than 15 trading days every month. Based on the dataset above, i will keep variable 1,2,4,5,7, & 8, and drop variable 3 and 6.

 

Or maybe just tell me how to count the non-missing value every month and i will remove the variable manually. I have read some manuals but still couldn't find the solution. Any help/suggestion will be greatly appreciated. Thank you so much


 

SAM_22
Fluorite | Level 6
Hi Reeza, I'll look into it, thank you!
novinosrin
Tourmaline | Level 20

Can you post samples of your dataset and the required output so we can play and test

Astounding
PROC Star

Well, you didn't tell us but I suspect your data set contains a variable indicating the date.  Let's use DATE for the variable name.

 

proc summary data=have nway;

class date;

format date monyy7.;

var stock1 - stock8;

output out=counts n=;

run;

 

That gets you halfway there.  The new data set COUNTS contains a single observation for each month, and the values for STOCK1 - STOCK8 are the number of nonmissing values found.  

 

Next:  find the minimum counts for any month.

 

proc summary data=counts;

var stock1-stock8;

output out=want min=;

run;

 

If you really have just 8 stocks, the simplest thing to do at this point would be to print the data set WANT and see which variables should be dropped.  By jumping through some hoops, you can automate the process.  Here's one way.

 

Next:  construct a list of values less than 15.

 

data _null_;

set want;

length droplist $ 200;

array stock {8};

do k=1 to 8;

   if stock{k} < 15 then droplist = catx(' ', droplist, vname(stock{k}));

end;

call symputx('droplist', droplist);

run;

 

Then drop the variables:

 

drop final_result;

set have (drop=&droplist);

run;

SAM_22
Fluorite | Level 6
It's actually hundreds of stocks, i just posted the sample as an illustration. I've followed your code and it does exactly what i need. Thank you so much!

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1241 views
  • 3 likes
  • 4 in conversation