Help using Base SAS procedures

enumeration groups and subgroups of a dataset

Accepted Solution Solved
Reply
Super Contributor
Posts: 301
Accepted Solution

enumeration groups and subgroups of a dataset

Hi everyone, I've got the next datset( have) with 4 variables :

                    aesoc                                           pref_term             aeterm                            count

       Blood and lymphatic system disorders    Anaemia                ANEMIA                                 3

       Blood and lymphatic system disorders    Anaemia                ANEMY                                  1

       Blood and lymphatic system                  Febrile neutropenia    FEBRILE NEUTOPENIA        1

       Blood and lymphatic system                  Febrile neutropenia    FEBRILE NEUTROPENIA       1

       Blood and lymphatic                              neutropenia            NEUTROPENIA FEBRILE       1

       Blood and lymphatic                              neutropenia            NEUTROPENIA                        6

       Blood                                                   Thrombocythaemia       PIASTRINOSIS                   1

       Blood                                                    Thrombocythaenia       THROMBOCYTOPENIA        1

I would like to add thes 3 new variables grp, sgrp and ssgrp and obtain this new dataset (final) with grp sgrp ssgrp col0 and col1, something like:

GRP SGRP SSGRP                     col 0    col1

     1       0      0                 Blood and lymphatic system disorders

      1       1     0                      anemia

       1       1     1                           ANEMIA      3                                       

       1       1      2                               ANEMY       1

        2      0       0                              Blood and lymphatic system

        2      1       0                          Febrile neutropenia

        2      1       1                         FEBRILE NEUTOPENIA 1

        2       1      2                              FEBRILE NEUTRPENIA 1

        3       0      0                                    Blood and lymphatic

        3       1      0                                   Neutropenia

        3       1      1                                        NEUTROPENIA FEBRILE 1

        3       1      2                                            NEUTROPENIA 6

        4       0      0                                                  blood

        4      1       0                                                  Thombocythaemia

        4      1       1                                                        PIASTRINOSIS 1

        4      1       2                                                          THROMBOCUTOPENIA 1

The dataset (want) with col0 and col1 has been generated (Astounding's code) by:

data want (keep=col0 col1);

   set have;

   by assoc pref_term aeterm count;

   if first.assoc then do;

      col0=assoc;

      output;

   end;

   if first.pref_term then do;

      col0 = pref_term;

      output;

   end;

   col0 = aeterm;

  col1 = count;

   output;

run;

Could you please add the three variables grp, sgrp and ssgrp to the above code to obtain teh right enumeration (grp sgrp and ssgrp) for the dataset (final) with these 5 variables: grp sgrp ssgrp col0 col1?

Thanks in advance.

J V.


Accepted Solutions
Solution
‎03-26-2012 10:22 PM
Respected Advisor
Posts: 3,124

enumeration groups and subgroups of a dataset

Minor tweak on Astounding's code will do:

data have;

input (aesoc pref_term aeterm) (:&$ 40.) count;

cards;

       Blood and lymphatic system disorders    Anaemia                ANEMIA                                 3

       Blood and lymphatic system disorders    Anaemia                ANEMY                                  1

       Blood and lymphatic system                  Febrile neutropenia    FEBRILE NEUTOPENIA        1

       Blood and lymphatic system                  Febrile neutropenia    FEBRILE NEUTROPENIA       1

       Blood and lymphatic                              neutropenia            NEUTROPENIA FEBRILE       1

       Blood and lymphatic                              neutropenia            NEUTROPENIA                        6

       Blood                                                   Thrombocythaemia       PIASTRINOSIS                   1

       Blood                                                    Thrombocythaenia       THROMBOCYTOPENIA        1

       ;

proc sort data=have;

by  desending aesoc pref_term aeterm count;;

run;

data want (keep=col1 col0 grp sgrp ssgrp) ;

   retain  grp sgrp ssgrp 0;

   set have;

   length col0 col1 $40;

   by  descending aesoc pref_term aeterm count;

   if first.aesoc then do;

      sgrp=0;

      ssgrp=0;

      col0=aesoc;

      grp+1;

      output;

   end;

   if first.pref_term then do;

     ssgrp=0;

      col0 = pref_term;

      sgrp+1;

      output;

   end;

   if first.aeterm then do;

   ssgrp+1;

   col0 = aeterm;

  col1 = count;

   output;

   end;

run;

proc print;run;

Regards,

Haikuo

View solution in original post


All Replies
Solution
‎03-26-2012 10:22 PM
Respected Advisor
Posts: 3,124

enumeration groups and subgroups of a dataset

Minor tweak on Astounding's code will do:

data have;

input (aesoc pref_term aeterm) (:&$ 40.) count;

cards;

       Blood and lymphatic system disorders    Anaemia                ANEMIA                                 3

       Blood and lymphatic system disorders    Anaemia                ANEMY                                  1

       Blood and lymphatic system                  Febrile neutropenia    FEBRILE NEUTOPENIA        1

       Blood and lymphatic system                  Febrile neutropenia    FEBRILE NEUTROPENIA       1

       Blood and lymphatic                              neutropenia            NEUTROPENIA FEBRILE       1

       Blood and lymphatic                              neutropenia            NEUTROPENIA                        6

       Blood                                                   Thrombocythaemia       PIASTRINOSIS                   1

       Blood                                                    Thrombocythaenia       THROMBOCYTOPENIA        1

       ;

proc sort data=have;

by  desending aesoc pref_term aeterm count;;

run;

data want (keep=col1 col0 grp sgrp ssgrp) ;

   retain  grp sgrp ssgrp 0;

   set have;

   length col0 col1 $40;

   by  descending aesoc pref_term aeterm count;

   if first.aesoc then do;

      sgrp=0;

      ssgrp=0;

      col0=aesoc;

      grp+1;

      output;

   end;

   if first.pref_term then do;

     ssgrp=0;

      col0 = pref_term;

      sgrp+1;

      output;

   end;

   if first.aeterm then do;

   ssgrp+1;

   col0 = aeterm;

  col1 = count;

   output;

   end;

run;

proc print;run;

Regards,

Haikuo

Super Contributor
Posts: 301

enumeration groups and subgroups of a dataset

Hi Haikuo, as you can see is not right, Febrile neutropenia (record 5) has to be 2 1 0 and so on, still of 1 2 0...could you ammend it , please?

Thnaks.

             1     1       0       0      Blood and lymphatic system disorders

              2     1       1       0      Anaemia

              3     1       1       1      ANEMIA                                   3

              4     1       1       2      ANEMY                                    1

              5     1       2       0      Febrile neutropenia

              6     1       2       1      FEBRILE NEUTOPENIA                       1

              7     1       2       2      FEBRILE NEUTROPENIA                      1

              8     1       2       3      NEUTROPENIA FEBRILE                      1

              9     1       3       0      Neutropenia

             10     1       3       1      NEUTROPENIA                              6

             11     1       4       0      Thrombocythaemia

             12     1       4       1      PIASTRINOSIS                             1

             13     1       5       0      Thrombocytopenia

             14     1       5       1      THROMBOCYTOPENIA                         1

Respected Advisor
Posts: 3,124

enumeration groups and subgroups of a dataset

OK, first let 's sort this out: what version of data you are using? Apparantly the data you are showing right now can not come from the raw data of your first post.

Second, you need to let me know what is the underline meaning of these 3 new variables. I thought that they are corresponding to the group level.

Super Contributor
Posts: 301

enumeration groups and subgroups of a dataset

data have(rename=(f1=aesoc f2=pref_term f3=aeterm f4=count)keep=f1-f4);

length f1-f4 $40;

infile datalines truncover;

input (f1-f4) (&);

datalines;

Blood and lymphatic system disorders    Anaemia                ANEMIA                                 3

       Blood and lymphatic system disorders    Anaemia                ANEMY                                  1

       Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTOPENIA        1

       Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTROPENIA       1

       Blood and lymphatic system disorders    Febrile neutropenia    NEUTROPENIA FEBRILE       1

       Blood and lymphatic system disorders    Neutropenia            NEUTROPENIA                        6

       Blood and lymphatic system disorders    Thrombocythaemia       PIASTRINOSIS                   1

       Blood and lymphatic system disorders    Thrombocytopenia       THROMBOCYTOPENIA        1

;

run;

Please, se above...this is the raw of data, and after aplying your code:

I obtained the above sequence.

Could you amend this?

Super Contributor
Posts: 301

enumeration groups and subgroups of a dataset

YES! YOU RIGHT.

Sorry, I used a different raw dataset.

My apologyze.

Thnaks very much for your help.

J V

Super Contributor
Posts: 301

enumeration groups and subgroups of a dataset

Thnak you again Haikuo, It works perfectly, thanks.

             1     1       0       0      Blood and lymphatic system disorders

              2     1       1       0      Anaemia

              3     1       1       1      ANEMIA                                   3

              4     1       1       2      ANEMY                                    1

              5     2       0       0      Blood and lymphatic system

              6     2       1       0      Febrile neutropenia

              7     2       1       1      FEBRILE NEUTOPENIA                       1

              8     2       1       2      FEBRILE NEUTROPENIA                      1

              9     3       0       0      Blood and lymphatic

             10     3       1       0      Febrile neutropenia

             11     3       1       1      NEUTROPENIA FEBRILE                      1

             12     3       2       0      Neutropenia

             13     3       2       1      NEUTROPENIA                              6

             14     4       0       0      Blood

             15     4       1       0      Thrombocythaemia

             16     4       1       1      PIASTRINOSIS                             1

             17     4       2       0      Thrombocytopenia

             18     4       2       1      THROMBOCYTOPENIA                         1

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 900 views
  • 0 likes
  • 2 in conversation