Help using Base SAS procedures

How do I selectively suppress data?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 7
Accepted Solution

How do I selectively suppress data?

I need to suppress data according to the following rules:

1. If any cell is less than five (includes 0), the cell is suppressed.

2. When there is suppression of a single cell, the next smallest cell(s) is suppressed until the total of the data in the suppressed cells adds up to at least five.

3. If any single category contains all of the data, the entire group of data is suppressed.

Here is sample data:

School                          Proficiency level            2011     2012     2013     2014

Washington HS                         1                      1               0          1               2

Washington HS                         2                      2               3          2               2

Washington HS                         3                      6               3          2               2

Washington HS                         4                      18              12         9               4

Monroe HS                               1                      10              12         4               6

Monroe HS                               2                      4               5          4               0

Monroe HS                               3                      8               7          9               5

Monroe HS                               4                      20              15         18         10

Truman HS                                1                      0               20         15         14

Truman HS                                2                      0               18         16         19

Truman HS                                3                      0               34         20         28

Truman HS                                4                      60              28         25         24

In this example, the data of concern are the 4 numbers within each year for each school.

So, for example, for Washington HS for 2011, there are 4 numbers: 1, 2, 6, 18.  As 1 and 2 are both less than 5, they will need to be suppressed.   Since
the total of 3 ( 1 + 2) is less than 5, the 6 will also need to be suppressed.

Another example: Washington HS for 2012, there are 4 numbers: 0, 3, 3, 12.  The first 3 will need to be suppressed as they are all less than 5.  Since the total of these 3 numbers is 6, we keep the 12.

Another example: Washington HS for 2014, there are 4 numbers: 2, 2, 2, 4.  Since all are under 5, all will need to be suppressed.

   

Another example: Truman HS for 2011, there are 4 numbers: 0, 0, 0, 60.  Since all of the data are in one cell, all cells must be suppressed.

   

Can this be achieved in SAS?
Thank you.


Accepted Solutions
Solution
‎10-22-2014 03:47 PM
Respected Advisor
Posts: 3,124

Re: How do I selectively suppress data?

Not sure if this is what you need, but take a look,

Data Have;

     Input School:$&20.         Proficiency_level:$1.            (y2011-y2014) (:8.);

     cards;

Washington HS                         1                      1               0          1               2

Washington HS                         2                      2               3          2               2

Washington HS                         3                      6               3          2               2

Washington HS                         4                      18              12         9               4

Monroe HS                               1                      10              12         4               6

Monroe HS                               2                      4               5          4               0

Monroe HS                               3                      8               7          9               5

Monroe HS                               4                      20              15         18         10

Truman HS                                1                      0               20         15         14

Truman HS                                2                      0               18         16         19

Truman HS                                3                      0               34         20         28

Truman HS                                4                      60              28         25         24

;

Run;

data want;

     do until (last.school);

           set have;

           by school notsorted;

           array y(4) y2011-y2014;

           array t(4) _temporary_;

           array m(4) _temporary_;

           do i=1 to 4;

                if y(i) <5 then

                     t(i)+y(i);

                else m(i)=min(m(i),y(i));

           end;

     end;

     do until (last.school);

           set have;

           by school notsorted;

           do i=1 to 4;

                if y(i) <5 then

                     y(i)=-1*y(i);

                else if y(i)=m(i) and .< t(i) < 5 then

                     y(i)=-1*y(i);

           end;

           output;

     end;

     call missing (of m(*), of t(*));

     drop i;

run;

Haikuo

View solution in original post


All Replies
Super User
Posts: 10,483

Re: How do I selectively suppress data?

What is the output supposed to look like? Are the values within a year set to missing?

I think your Rule 3 should read:

3. If any single category contains all of the data greater than 0, the entire group of data is suppressed.

Since SAS uses a special value to indicate missing 0 by default is data...

Occasional Contributor
Posts: 7

Re: How do I selectively suppress data?

A dash is used to indicate suppressed data.

That's a good point about rule 3.  If all data are zero, there is no need to suppress.  The dots to indicate missing are changed to zeroes.

Valued Guide
Posts: 3,208

Re: How do I selectively suppress data?

Is it just for presenting as a kind of alternate printing (format fcmp) or do you want to do calculations on the vaules. The answer to your question is "yes" it can be done with sas.

The next thing is what you want exactly in detail.

---->-- ja karman --<-----
Occasional Contributor
Posts: 7

Re: How do I selectively suppress data?

The data set with the suppressed values will be exported either to Excel or a text file.  The values need to be suppressed for the purpose of confidentiality of the students.  The finished product would look just like the sample table but with a dash replacing the suppressed values.

Solution
‎10-22-2014 03:47 PM
Respected Advisor
Posts: 3,124

Re: How do I selectively suppress data?

Not sure if this is what you need, but take a look,

Data Have;

     Input School:$&20.         Proficiency_level:$1.            (y2011-y2014) (:8.);

     cards;

Washington HS                         1                      1               0          1               2

Washington HS                         2                      2               3          2               2

Washington HS                         3                      6               3          2               2

Washington HS                         4                      18              12         9               4

Monroe HS                               1                      10              12         4               6

Monroe HS                               2                      4               5          4               0

Monroe HS                               3                      8               7          9               5

Monroe HS                               4                      20              15         18         10

Truman HS                                1                      0               20         15         14

Truman HS                                2                      0               18         16         19

Truman HS                                3                      0               34         20         28

Truman HS                                4                      60              28         25         24

;

Run;

data want;

     do until (last.school);

           set have;

           by school notsorted;

           array y(4) y2011-y2014;

           array t(4) _temporary_;

           array m(4) _temporary_;

           do i=1 to 4;

                if y(i) <5 then

                     t(i)+y(i);

                else m(i)=min(m(i),y(i));

           end;

     end;

     do until (last.school);

           set have;

           by school notsorted;

           do i=1 to 4;

                if y(i) <5 then

                     y(i)=-1*y(i);

                else if y(i)=m(i) and .< t(i) < 5 then

                     y(i)=-1*y(i);

           end;

           output;

     end;

     call missing (of m(*), of t(*));

     drop i;

run;

Haikuo

Occasional Contributor
Posts: 7

Re: How do I selectively suppress data?

This works.  Thank you so much.  The only problem is that I need the suppressed value to be represented by a dash, not a negative number.  Since a dash is a character value, I added the following code:

data want1;

set want;

sy2011 = put(y2011,$4.0);

sy2012 = put(y2012,$4.0);

sy2013 = put(y2013,$4.0);

sy2014 = put(y2014,$4.0);

if y2011 < 1 then sy2011 = '-';

if y2012 < 1 then sy2012 = '-';

if y2013 < 1 then sy2013 = '-';

if y2014 < 1 then sy2014 = '-';

drop y2011-y2014;

run;

(By the way, how do I paste into this box without having to reformat?  Part of my response was cut off and extra lines were added between the code.)

Super User
Posts: 10,483

Re: How do I selectively suppress data?

Appearance of values is what custom formats are for. I would actually have been tempted to assign a special missing value (.A to .Z) and then have a format that displays that value as a dash. Then you don't need to proliferate variables and can still do arithmetic or summaries on the recoded variable.

And this editor REALLY sucks, apparently worse when using Internet Explorer, but paste in to Notepad or similar text editor (not Wordpad as it attempts to use HTML when pasted from the enhanced editor) and the copy/paste into this forum.

Super User
Posts: 9,671

Re: How do I selectively suppress data?

Very interesting question . But you didn't post the output yet ?

Data Have;
     Input School:$&20.         Proficiency_level:$1.            (y2011-y2014) (:8.);
     cards;
Washington HS                         1                      1               0          1               2
Washington HS                         2                      2               3          2               2
Washington HS                         3                      6               3          2               2
Washington HS                         4                      18              12         9               4
Monroe HS                               1                      10              12         4               6
Monroe HS                               2                      4               5          4               0
Monroe HS                               3                      8               7          9               5
Monroe HS                               4                      20              15         18         10
Truman HS                                1                      0               20         15         14
Truman HS                                2                      0               18         16         19
Truman HS                                3                      0               34         20         28
Truman HS                                4                      60              28         25         24
;
Run;
options missing='-';
data want;
 set have;
 by School notsorted;
 retain found_2011      found_2012 found_2013 found_2014 sum_2011 sum_2012 sum_2013 sum_2014 ;
 if first.school then do;
   call missing(of found_:); 
   sum_2011=5; sum_2012=5;      sum_2013=5;       sum_2014=5;
 end;

 sum_2011+(-1*y2011);
 sum_2012+(-1*y2012);
 sum_2013+(-1*y2013);
 sum_2014+(-1*y2014);

 if not found_2011 then call missing(y2011);
 if not found_2012 then call missing(y2012);
 if not found_2013 then call missing(y2013);
 if not found_2014 then call missing(y2014);

 if sum_2011 le 0 then found_2011=1;
 if sum_2012 le 0 then found_2012=1;
 if sum_2013 le 0 then found_2013=1;
 if sum_2014 le 0 then found_2014=1;
drop sum_: found_: ;
run;

Xia Keshan

Occasional Contributor
Posts: 7

Re: How do I selectively suppress data?

Thanks for trying, Ksharp, but the output using your code is not entirely correct.

This is what the output is supposed to look like:

School  Proficiency_level y2011 y2012 y2013 y2014

Washington HS  1  - - - -

Washington HS  2  - - - -

Washington HS  3  - - - -

Washington HS  4  18 12 9 -

Monroe HS       1  10 12 - 6

Monroe HS       2  - 5 - -

Monroe HS       3  - 7 9 -

Monroe HS       4  20 15 18 10

Truman HS       1  - 20 15 14

Truman HS       1  - 18 16 19

Truman HS       1  - 34 20 28

Truman HS       1  - 28 25 24

Super User
Posts: 9,671

Re: How do I selectively suppress data?

Ha. It is clarified now .

Data Have;
     Input School:$&20.         Proficiency_level:$1.            (y2011-y2014) (:8.);
     cards;
Washington HS                         1                      1               0          1               2
Washington HS                         2                      2               3          2               2
Washington HS                         3                      6               3          2               2
Washington HS                         4                      18              12         9               4
Monroe HS                               1                      10              12         4               6
Monroe HS                               2                      4               5          4               0
Monroe HS                               3                      8               7          9               5
Monroe HS                               4                      20              15         18         10
Truman HS                                1                      0               20         15         14
Truman HS                                2                      0               18         16         19
Truman HS                                3                      0               34         20         28
Truman HS                                4                      60              28         25         24
;
Run;
options missing='-';
data want;
 set have;
 by School notsorted;
 retain found_2011      found_2012 found_2013 found_2014 sum_2011 sum_2012 sum_2013 sum_2014
        lt_five_2011 lt_five_2012 lt_five_2013  lt_five_2014 ;
 if first.school then do;
   call missing(of found_: lt_five_:); 
   sum_2011=5; sum_2012=5;      sum_2013=5;       sum_2014=5;
 end;

if y2011 lt 5 then      lt_five_2011=1;
if y2012 lt 5 then      lt_five_2012=1;
if y2013 lt 5 then      lt_five_2013=1;
if y2014 lt 5 then      lt_five_2014=1;

if lt_five_2011 then sum_2011+(-1*y2011);
if lt_five_2012 then sum_2012+(-1*y2012);
if lt_five_2013 then sum_2013+(-1*y2013);
if lt_five_2014 then sum_2014+(-1*y2014);

 if not found_2011 and lt_five_2011 then call missing(y2011);
 if not found_2012 and lt_five_2012 then call missing(y2012);
 if not found_2013 and lt_five_2013 then call missing(y2013);
 if not found_2014 and lt_five_2014 then call missing(y2014);

 if sum_2011 le 0 and lt_five_2011 then found_2011=1;
 if sum_2012 le 0 and lt_five_2012 then found_2012=1;
 if sum_2013 le 0 and lt_five_2013 then found_2013=1;
 if sum_2014 le 0 and lt_five_2014 then found_2014=1;
drop sum_: found_: lt_five:;
run;

I don't understand

Washington HS  1  - - - -

Washington HS  2  - - - -

Washington HS  3  - - - -

Washington HS  4  18 12 9 4  <--why be missing

Xia Keshan

Occasional Contributor
Posts: 7

Re: How do I selectively suppress data?

Washington HS - 2014, level 4 - The 4 is suppressed because it is less than 5 - see rule # 1.

Super User
Posts: 9,671

Re: How do I selectively suppress data?

Ou, That is more complicated than I image. Try this one, and let me if it is worked.

Data Have;
     Input School:$&20.         Proficiency_level:$1.            (y2011-y2014) (:8.);
     cards;
Washington HS                         1                      1               0          1               2
Washington HS                         2                      2               3          2               2
Washington HS                         3                      6               3          2               2
Washington HS                         4                      18              12         9               4
Monroe HS                               1                      10              12         4               6
Monroe HS                               2                      4               5          4               0
Monroe HS                               3                      8               7          9               5
Monroe HS                               4                      20              15         18         10
Truman HS                                1                      0               20         15         14
Truman HS                                2                      0               18         16         19
Truman HS                                3                      0               34         20         28
Truman HS                                4                      60              28         25         24
;
Run;
options missing='-';
data want;
 set have;
 by School notsorted;
 retain sum_2011 sum_2012 sum_2013 sum_2014
        lt_five_2011 lt_five_2012 lt_five_2013  lt_five_2014 ;
 if first.school then do;
   call missing(of lt_five_:); 
   sum_2011=5; sum_2012=5;      sum_2013=5;       sum_2014=5;
 end;

if y2011 lt 5 then      lt_five_2011=1;
if y2012 lt 5 then      lt_five_2012=1;
if y2013 lt 5 then      lt_five_2013=1;
if y2014 lt 5 then      lt_five_2014=1;

if lt_five_2011 then do;sum_2011+(-1*y2011); call missing(y2011); end;
if lt_five_2012 then do;sum_2012+(-1*y2012); call missing(y2012); end;
if lt_five_2013 then do;sum_2013+(-1*y2013); call missing(y2013); end;
if lt_five_2014 then do;sum_2014+(-1*y2014); call missing(y2014); end;

 if sum_2011 le 0 and lt_five_2011 then do;lt_five_2011=.;     sum_2011=5;end;
 if sum_2012 le 0 and lt_five_2012 then do;lt_five_2012=.;     sum_2012=5;end;
 if sum_2013 le 0 and lt_five_2013 then do;lt_five_2013=.;     sum_2013=5;end;
 if sum_2014 le 0 and lt_five_2014 then do;lt_five_2014=.;     sum_2014=5;end;
drop sum_: lt_five:;
run;

Xia Keshan

Super User
Posts: 9,671

Re: How do I selectively suppress data?

Or a simpler one :

Data Have;

     Input School:$&20.         Proficiency_level:$1.            (y2011-y2014) (:8.);

     cards;

Washington HS                         1                      1               0          1               2

Washington HS                         2                      2               3          2               2

Washington HS                         3                      6               3          2               2

Washington HS                         4                      18              12         9               4

Monroe HS                               1                      10              12         4               6

Monroe HS                               2                      4               5          4               0

Monroe HS                               3                      8               7          9               5

Monroe HS                               4                      20              15         18         10

Truman HS                                1                      0               20         15         14

Truman HS                                2                      0               18         16         19

Truman HS                                3                      0               34         20         28

Truman HS                                4                      60              28         25         24

;

Run;

options missing='-';

data want;

set have;

by School notsorted;

retain sum_2011 sum_2012 sum_2013 sum_2014

        lt_five_2011 lt_five_2012 lt_five_2013  lt_five_2014 ;

if first.school then do;

   call missing(of lt_five_Smiley Happy;

   sum_2011=5; sum_2012=5;      sum_2013=5;       sum_2014=5;

end;

if y2011 lt 5 then      lt_five_2011=1;

if y2012 lt 5 then      lt_five_2012=1;

if y2013 lt 5 then      lt_five_2013=1;

if y2014 lt 5 then      lt_five_2014=1;

if lt_five_2011 then do;sum_2011+(-1*y2011); call missing(y2011); end;

if lt_five_2012 then do;sum_2012+(-1*y2012); call missing(y2012); end;

if lt_five_2013 then do;sum_2013+(-1*y2013); call missing(y2013); end;

if lt_five_2014 then do;sum_2014+(-1*y2014); call missing(y2014); end;

if sum_2011 le 0  then do;lt_five_2011=.; sum_2011=5;end;

if sum_2012 le 0  then do;lt_five_2012=.; sum_2012=5;end;

if sum_2013 le 0  then do;lt_five_2013=.; sum_2013=5;end;

if sum_2014 le 0  then do;lt_five_2014=.; sum_2014=5;end;

drop sum_: lt_five:;

run;

Xia Keshan

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 13 replies
  • 536 views
  • 0 likes
  • 5 in conversation