Hello
I'm new to SAS and currently working on stock price data. Here is the example of my dataset:
Date | Stock1 | Stock2 | Stock3 | Stock4 | Stock5 | Stock6 | Stock7 | Stock8 |
01/06/2000 | 314,11 | 212,49 | 165 | 450 | 98,29 | 125,51 | 185 | |
02/06/2000 | 314,11 | 212,49 | 143 | 425 | 95,83 | 115,85 | 185 | |
05/06/2000 | 338,27 | 212,49 | 136 | 450 | 98,29 | 123,1 | 185 | |
06/06/2000 | 338,27 | 212,49 | 131 | 450 | 100,75 | 125,51 | 200 | |
07/06/2000 | 314,11 | 212,49 | 133 | 475 | 100,75 | 135,16 | 200 | |
08/06/2000 | 362,43 | 212,49 | 525 | 108,12 | 135,16 | 172,5 | ||
09/06/2000 | 338,27 | 212,49 | 475 | 103,2 | 139,99 | 200 | ||
12/06/2000 | 338,27 | 212,49 | 500 | 100,75 | 132,75 | 200 | ||
13/06/2000 | 314,11 | 212,49 | 500 | 100,75 | 130,34 | 200 | ||
14/06/2000 | 314,11 | 212,49 | 500 | 103,2 | 135,16 | 200 | ||
15/06/2000 | 314,11 | 212,49 | 500 | 103,2 | 135,16 | 200 | ||
16/06/2000 | 338,27 | 212,49 | 500 | 108,12 | 142,4 | 200 | ||
19/06/2000 | 362,43 | 212,49 | 550 | 108,12 | 147,23 | 217,5 | ||
20/06/2000 | 338,27 | 212,49 | 525 | 120,4 | 144,82 | 217,5 | ||
21/06/2000 | 362,43 | 212,49 | 550 | 125,32 | 149,65 | 217,5 | ||
22/06/2000 | 362,43 | 212,49 | 600 | 125,32 | 164,13 | 200 | ||
23/06/2000 | 338,27 | 212,49 | 600 | 122,86 | 161,71 | 200 | ||
26/06/2000 | 386,6 | 212,49 | 575 | 120,4 | 161,71 | 210 | ||
27/06/2000 | 338,27 | 212,49 | 575 | 117,95 | 161,71 | 190 | ||
28/06/2000 | 338,27 | 212,49 | 575 | 113,03 | 161,71 | 190 | ||
29/06/2000 | 338,27 | 212,49 | 550 | 113,03 | 164,13 | 190 | ||
30/06/2000 | 338,27 | 212,49 | 575 | 120,4 | 168,95 | 190 | ||
03/07/2000 | 338,27 | 212,49 | 550 | 112,05 | 160,75 | 190 | ||
04/07/2000 | 338,27 | 212,49 | 565 | 117,95 | 159,3 | 190 | ||
05/07/2000 | 347,94 | 212,49 | 540 | 123,85 | 159,3 | 190 | ||
06/07/2000 | 347,94 | 212,49 | 555 | 126,3 | 163,64 | 180,5 |
What i want to do is to count non-missing values (trading days) every month for every variable (stock 1 to stock 😎 and create new dataset that keep variables with minimum 15 trading days every month, and remove those with less than 15 trading days every month. Based on the dataset above, i will keep variable 1,2,4,5,7, & 8, and drop variable 3 and 6.
Or maybe just tell me how to count the non-missing value every month and i will remove the variable manually. I have read some manuals but still couldn't find the solution. Any help/suggestion will be greatly appreciated. Thank you so much
Well, you didn't tell us but I suspect your data set contains a variable indicating the date. Let's use DATE for the variable name.
proc summary data=have nway;
class date;
format date monyy7.;
var stock1 - stock8;
output out=counts n=;
run;
That gets you halfway there. The new data set COUNTS contains a single observation for each month, and the values for STOCK1 - STOCK8 are the number of nonmissing values found.
Next: find the minimum counts for any month.
proc summary data=counts;
var stock1-stock8;
output out=want min=;
run;
If you really have just 8 stocks, the simplest thing to do at this point would be to print the data set WANT and see which variables should be dropped. By jumping through some hoops, you can automate the process. Here's one way.
Next: construct a list of values less than 15.
data _null_;
set want;
length droplist $ 200;
array stock {8};
do k=1 to 8;
if stock{k} < 15 then droplist = catx(' ', droplist, vname(stock{k}));
end;
call symputx('droplist', droplist);
run;
Then drop the variables:
drop final_result;
set have (drop=&droplist);
run;
PROC MEANS with NMISS is all you need. You can apply a format to the date to have it handled as monthly data.
https://gist.github.com/statgeek/0cae5568752959b035516d6ac07a20fb
@SAM_22 wrote:
Hello
I'm new to SAS and currently working on stock price data. Here is the example of my dataset:
Spoiler
Date Stock1 Stock2 Stock3 Stock4 Stock5 Stock6 Stock7 Stock8 01/06/2000 314,11 212,49 165 450 98,29 125,51 185 02/06/2000 314,11 212,49 143 425 95,83 115,85 185 05/06/2000 338,27 212,49 136 450 98,29 123,1 185 06/06/2000 338,27 212,49 131 450 100,75 125,51 200 07/06/2000 314,11 212,49 133 475 100,75 135,16 200 08/06/2000 362,43 212,49 525 108,12 135,16 172,5 09/06/2000 338,27 212,49 475 103,2 139,99 200 12/06/2000 338,27 212,49 500 100,75 132,75 200 13/06/2000 314,11 212,49 500 100,75 130,34 200 14/06/2000 314,11 212,49 500 103,2 135,16 200 15/06/2000 314,11 212,49 500 103,2 135,16 200 16/06/2000 338,27 212,49 500 108,12 142,4 200 19/06/2000 362,43 212,49 550 108,12 147,23 217,5 20/06/2000 338,27 212,49 525 120,4 144,82 217,5 21/06/2000 362,43 212,49 550 125,32 149,65 217,5 22/06/2000 362,43 212,49 600 125,32 164,13 200 23/06/2000 338,27 212,49 600 122,86 161,71 200 26/06/2000 386,6 212,49 575 120,4 161,71 210 27/06/2000 338,27 212,49 575 117,95 161,71 190 28/06/2000 338,27 212,49 575 113,03 161,71 190 29/06/2000 338,27 212,49 550 113,03 164,13 190 30/06/2000 338,27 212,49 575 120,4 168,95 190 03/07/2000 338,27 212,49 550 112,05 160,75 190 04/07/2000 338,27 212,49 565 117,95 159,3 190 05/07/2000 347,94 212,49 540 123,85 159,3 190 06/07/2000 347,94 212,49 555 126,3 163,64 180,5 What i want to do is to count non-missing values (trading days) every month for every variable (stock 1 to stock 😎 and create new dataset that keep variables with minimum 15 trading days every month, and remove those with less than 15 trading days every month. Based on the dataset above, i will keep variable 1,2,4,5,7, & 8, and drop variable 3 and 6.
Or maybe just tell me how to count the non-missing value every month and i will remove the variable manually. I have read some manuals but still couldn't find the solution. Any help/suggestion will be greatly appreciated. Thank you so much
Can you post samples of your dataset and the required output so we can play and test
Well, you didn't tell us but I suspect your data set contains a variable indicating the date. Let's use DATE for the variable name.
proc summary data=have nway;
class date;
format date monyy7.;
var stock1 - stock8;
output out=counts n=;
run;
That gets you halfway there. The new data set COUNTS contains a single observation for each month, and the values for STOCK1 - STOCK8 are the number of nonmissing values found.
Next: find the minimum counts for any month.
proc summary data=counts;
var stock1-stock8;
output out=want min=;
run;
If you really have just 8 stocks, the simplest thing to do at this point would be to print the data set WANT and see which variables should be dropped. By jumping through some hoops, you can automate the process. Here's one way.
Next: construct a list of values less than 15.
data _null_;
set want;
length droplist $ 200;
array stock {8};
do k=1 to 8;
if stock{k} < 15 then droplist = catx(' ', droplist, vname(stock{k}));
end;
call symputx('droplist', droplist);
run;
Then drop the variables:
drop final_result;
set have (drop=&droplist);
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.