BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
michtka
Fluorite | Level 6

Hi everyone, I've got the next datset with 4 variables:

                    aesoc                                           pref_term             aeterm                            count

       Blood and lymphatic system disorders    Anaemia                ANEMIA                                 3

       Blood and lymphatic system disorders    Anaemia                ANEMY                                  1

       Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTOPENIA        1

       Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTROPENIA       1

       Blood and lymphatic system disorders    Febrile neutropenia    NEUTROPENIA FEBRILE       1

       Blood and lymphatic system disorders    Neutropenia            NEUTROPENIA                        6

       Blood and lymphatic system disorders    Thrombocythaemia       PIASTRINOSIS                   1

       Blood and lymphatic system disorders    Thrombocytopenia       THROMBOCYTOPENIA        1

         I would like to transform this dataset to another dataset with two columns :

                                           col0                                     col1     

          BLOOD AND LYMPHATIC SYSTEM DISORDERS

          Anaemia

          ANEMIA                                                               3                     

          ANEMY                                                                1

          Febrile neutropenia

          FEBRILE NEUTOPENIA                                         1

          FEBRILE NEUTROPENIA                                       1

          NEUTROPENIA FEBRILE                                       1

          Neutropenia

          NEUTROPENIA                                                     6   

          Thrombocythaemia

          PIASTRINOSIS                                                     1 

          THROMBOCYTOPENIA                                         1

I assume that I wil ned to create two new variables: group for aesoc, and subgroup for aept, to make this clasiffication in col0, but I would like you to help me

with this.

Thanks in advance,

J V.

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Here's a simple version to try.

data want (keep=col0 col1);

   set have;

   by assoc pref_term aeterm count;

   if first.assoc then do;

      col0=assoc;

      output;

   end;

   if first.pref_term then do;

      col0 = pref_term;

      output;

   end;

   col0 = aeterm;

  col1 = count;

   output;

run;

I assume that's a typo in the last line, for pref_term.  Good luck.

View solution in original post

4 REPLIES 4
PGStats
Opal | Level 21

try this:

%let nPos = 4; /* order of last column */

data have(keep=f:);
array f{&nPos.} $40;
infile datalines truncover;
input (f(*)) (&);
datalines;
Blood and lymphatic system disorders    Anaemia                ANEMIA                                 3
       Blood and lymphatic system disorders    Anaemia                ANEMY                                  1
       Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTOPENIA        1
       Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTROPENIA       1
       Blood and lymphatic system disorders    Febrile neutropenia    NEUTROPENIA FEBRILE       1
       Blood and lymphatic system disorders    Neutropenia            NEUTROPENIA                        6
       Blood and lymphatic system disorders    Thrombocythaemia       PIASTRINOSIS                   1
       Blood and lymphatic system disorders    Thrombocytopenia       THROMBOCYTOPENIA        1
;
run;

data want(keep=text number);
set have;
array f{*} f:;
length text number $ 40;
number = ""; text = "";
do place = 1 to &nPos.-2;
if lag(f(place)) ne f(place) or not missing(text) then do;
  text = f(place);
  output;
end;
end;
text = f(&nPos.-1);
number = f(&nPos);
output;
run;

proc sql;
select text , number from want;
quit;

PG

PG
michtka
Fluorite | Level 6

I would like to use a code withouth macrovariables and arrrays Smiley Sad....still looking for help,

but anyway...you did your best..thanks.

PGStats
Opal | Level 21

Here is my second best... (inferior solution, without arrays or macro variables)

data have(keep=f1-f4);
length f1-f4 $40;
infile datalines truncover;
input (f1-f4) (&);
datalines;
Blood and lymphatic system disorders    Anaemia                ANEMIA                                 3
       Blood and lymphatic system disorders    Anaemia                ANEMY                                  1
       Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTOPENIA        1
       Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTROPENIA       1
       Blood and lymphatic system disorders    Febrile neutropenia    NEUTROPENIA FEBRILE       1
       Blood and lymphatic system disorders    Neutropenia            NEUTROPENIA                        6
       Blood and lymphatic system disorders    Thrombocythaemia       PIASTRINOSIS                   1
       Blood and lymphatic system disorders    Thrombocytopenia       THROMBOCYTOPENIA        1
;
run;

data want(keep=text number);
set have;
length text number $ 40;
number = ""; text = "";
if lag(f1) ne f1 or not missing(text) then do;
text = f1;
output;
end;
if lag(f2) ne f2 or not missing(text) then do;
text = f2;
output;
end;
text = f3;
number = f4;
output;
run;

proc sql;
select text , number from want;
quit;

PG

PG
Astounding
PROC Star

Here's a simple version to try.

data want (keep=col0 col1);

   set have;

   by assoc pref_term aeterm count;

   if first.assoc then do;

      col0=assoc;

      output;

   end;

   if first.pref_term then do;

      col0 = pref_term;

      output;

   end;

   col0 = aeterm;

  col1 = count;

   output;

run;

I assume that's a typo in the last line, for pref_term.  Good luck.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1919 views
  • 3 likes
  • 3 in conversation