Hi Experts,
The following is my sample dataset. First column contains the Date and other columns are firms. I have just included only four firms here; I have many more firms in my original dataset. If you see, observations are zeros for almost all firms in specific dates (for example, 01/01/2004, 19/01/2004 etc.). I need to remove those rows from my dataset. How can I do that?
By the by, why can't I upload the SAS datafile here?
Abu
Date CH00186667 US00208J10 US01167P10 BMG89998210
01/01/2004 | 0 | 0 | 0 | 0 |
02/01/2004 | -0.014333754 | 0.020852691 | 0.005923507 | -0.026891625 |
05/01/2004 | -0.06504876 | -0.027125565 | -0.063793667 | -0.046109507 |
06/01/2004 | 0.134499902 | 0.012617197 | 0.009120535 | -0.048214012 |
07/01/2004 | 0.041180928 | -0.008999022 | -0.02407944 | -0.02589243 |
08/01/2004 | 0.083399573 | -0.032067022 | -0.025253911 | -0.094661522 |
09/01/2004 | 0.023332755 | 0.01515597 | 0.002682908 | 0.090551284 |
12/01/2004 | -0.010596278 | -0.004479334 | -0.046272924 | -0.016743585 |
13/01/2004 | 0.081842205 | -0.047210248 | -0.023050375 | -0.032730347 |
14/01/2004 | 0.240379787 | -0.028102609 | -0.019063044 | 0.028377054 |
15/01/2004 | 0.055129156 | -0.072989994 | -0.037508157 | -0.113781919 |
16/01/2004 | -0.107218991 | 0.010076065 | 0.017395467 | 0.013286573 |
19/01/2004 | 0 | 0 | 0 | 0 |
20/01/2004 | -0.102122217 | -0.033188884 | -0.024621075 | -0.064291383 |
21/01/2004 | 0.006716838 | -0.0031347 | 0.011238732 | -0.106147952 |
22/01/2004 | 0.022895808 | -0.036231344 | -0.001034744 | 0.15486874 |
23/01/2004 | 0.001121589 | 0.104480825 | 0.023919155 | 0.002775815 |
26/01/2004 | -0.035955148 | -0.003662896 | -0.064707752 | 0.008888869 |
27/01/2004 | 0.047943602 | 0.029589669 | 0.040730237 | 0.047567453 |
28/01/2004 | 0.024591751 | -0.032891828 | 0.015808341 | -0.036130603 |
29/01/2004 | -0.020291339 | -0.08556701 | -0.007250813 | 0.006551547 |
30/01/2004 | -0.021568842 | 0.013072352 | -0.014142346 | 0.042799423 |
02/02/2004 | -0.093161236 | -0.03704503 | 0.004789916 | -0.000658193 |
03/02/2004 | -0.036070915 | -0.010547938 | 0.017937674 | 0.040776061 |
04/02/2004 | -0.002778098 | -0.019917182 | 0.024302875 | 0.001851073 |
05/02/2004 | 0.269316448 | -0.057050738 | -0.00705107 | -0.077396759 |
06/02/2004 | -0.153726751 | 0.035610327 | -0.035162606 | 0.002378226 |
09/02/2004 | 0.144070704 | 0.027980127 | -0.009166081 | 0.034578842 |
10/02/2004 | -0.03454791 | 0.110229835 | -0.00011394 | -0.084497257 |
11/02/2004 | -0.031655654 | -0.024520682 | -0.032001739 | 0.046010137 |
12/02/2004 | -0.035674229 | -0.041773807 | -0.017239947 | 0.048189697 |
13/02/2004 | -0.035026192 | 0.018332305 | 0.001640581 | -0.008376772 |
16/02/2004 | 0 | 0 | 0 | -0.019097575 |
17/02/2004 | 0.001044875 | -0.039156187 | -0.022763521 | -0.046791297 |
18/02/2004 | 0.044619367 | 0.020950882 | -0.034597169 | 0.034137212 |
19/02/2004 | 0.004683994 | 0.00961849 | 0.034129835 | 0.081376146 |
20/02/2004 | 0.012887259 | 0.008957072 | 0.014299227 | -0.0762263 |
So why don't you just use something like:
data want;
set have;
if (min(of CH00186667--BMG89998210) = max(of CH00186667--BMG89998210)=0)
or date eq '16FEB2004'd
then delete;
run;
Do you want to delete the observations where all firms have a value of 0?
if min(of CH00186667 US00208J10 US01167P10 BMG89998210) = max(of CH00186667 US00208J10 US01167P10 BMG89998210)=0 then delete;
Same suggestion as Tom, but you can use variable lists. e.g.:
set have;
if min(of CH00186667--BMG89998210) = max(of CH00186667--BMG89998210)=0
then delete;
run;
The code works but two concerns again. First, in some Dates I see that the observations are zeros not for all firms; some firms have other values. I also need to remove those. Second, if I want to remove just one or two Dates separately, what can I do?
You may be better or re-shaping your data then. E.g.:
DATE COMPANY RESULT
01/01/04 CH00186667 0
01/01/04 US00208J10 0
...
You can then runs sums and such like to evaluate if all in date are 0, or specific checks across companies etc.
If you look at the 16/02/2004, the values are zeros for three firms. Your codes do not delete this row; I also need to remove this row. How can I do that?
Again, say I also want to delete 11/02/2004 although in this Date all firms have values? How can I do that?
Date CH00186667 US00208J10 US01167P10 BMG89998210
11/02/2004 | -0.031655654 | -0.024520682 | -0.032001739 | 0.046010137 |
12/02/2004 | -0.035674229 | -0.041773807 | -0.017239947 | 0.048189697 |
13/02/2004 | -0.035026192 | 0.018332305 | 0.001640581 | -0.008376772 |
16/02/2004 | 0 | 0 | 0 | -0.019097575 |
17/02/2004 | 0.001044875 | -0.039156187 | -0.022763521 | -0.046791297 |
What code did you try? What happened?
I have tried the code made by Arthur Tabachneck. He modified your code. If I use that code, then those dates are deleted where all firms have 0 values. But for example, all firms do not have 0 values in 16/02/2004. By using Arthur's code, this 16/02/2004 is not deleting; but I want to delete this date as well. Again, although no firm has 0 values in 11/02/2004, I would like to delete this date as well.
Date CH00186667 US00208J10 US01167P10 BMG89998210
11/02/2004 | -0.031655654 | -0.024520682 | -0.032001739 | 0.046010137 |
12/02/2004 | -0.035674229 | -0.041773807 | -0.017239947 | 0.048189697 |
13/02/2004 | -0.035026192 | 0.018332305 | 0.001640581 | -0.008376772 |
16/02/2004 | 0 | 0 | 0 | -0.019097575 |
17/02/2004 | 0.001044875 | -0.039156187 | -0.022763521 | -0.046791297 |
So how did you try to modify it to meet your new conditions?
Try re-stating the problem and see what SAS syntax for such a statement would be. Then run it and check your results.
Let me get you started.
English: Delete rows where DATE is in my list of specific dates.
How would you translate that into a SAS statement?
English: Delete rows where at least one of the company variables has a value of zero.
How would you translate that into a SAS statement?
The sample code works as planned, but the plan is to remove rows with all zeros. If you want to delete rows with "almost all" zeros, you will have to tell us what that means.
That means, I would like to delete those dates when the values of all firms are 0 and when values of some firms are 0. I would also like to delete any date (by my choice) as I explained in my reply to Tom.
So why don't you just use something like:
data want;
set have;
if (min(of CH00186667--BMG89998210) = max(of CH00186667--BMG89998210)=0)
or date eq '16FEB2004'd
then delete;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.