BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
michtka
Fluorite | Level 6

Hi everyone, I've got the next datset with 4 variables:

                    aesoc                                           pref_term             aeterm                            count

       Blood and lymphatic system disorders    Anaemia                ANEMIA                                 3

       Blood and lymphatic system disorders    Anaemia                ANEMY                                  1

       Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTOPENIA        1

       Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTROPENIA       1

       Blood and lymphatic system disorders    Febrile neutropenia    NEUTROPENIA FEBRILE       1

       Blood and lymphatic system disorders    Neutropenia            NEUTROPENIA                        6

       Blood and lymphatic system disorders    Thrombocythaemia       PIASTRINOSIS                   1

       Blood and lymphatic system disorders    Thrombocytopenia       THROMBOCYTOPENIA        1

         I would like to transform this dataset to another dataset with two columns :

                                           col0                                     col1     

          BLOOD AND LYMPHATIC SYSTEM DISORDERS

          Anaemia

          ANEMIA                                                               3                     

          ANEMY                                                                1

          Febrile neutropenia

          FEBRILE NEUTOPENIA                                         1

          FEBRILE NEUTROPENIA                                       1

          NEUTROPENIA FEBRILE                                       1

          Neutropenia

          NEUTROPENIA                                                     6   

          Thrombocythaemia

          PIASTRINOSIS                                                     1 

          THROMBOCYTOPENIA                                         1

I assume that I wil ned to create two new variables: group for aesoc, and subgroup for aept, to make this clasiffication in col0, but I would like you to help me

with this.

Thanks in advance,

J V.

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Here's a simple version to try.

data want (keep=col0 col1);

   set have;

   by assoc pref_term aeterm count;

   if first.assoc then do;

      col0=assoc;

      output;

   end;

   if first.pref_term then do;

      col0 = pref_term;

      output;

   end;

   col0 = aeterm;

  col1 = count;

   output;

run;

I assume that's a typo in the last line, for pref_term.  Good luck.

View solution in original post

4 REPLIES 4
PGStats
Opal | Level 21

try this:

%let nPos = 4; /* order of last column */

data have(keep=f:);
array f{&nPos.} $40;
infile datalines truncover;
input (f(*)) (&);
datalines;
Blood and lymphatic system disorders    Anaemia                ANEMIA                                 3
       Blood and lymphatic system disorders    Anaemia                ANEMY                                  1
       Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTOPENIA        1
       Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTROPENIA       1
       Blood and lymphatic system disorders    Febrile neutropenia    NEUTROPENIA FEBRILE       1
       Blood and lymphatic system disorders    Neutropenia            NEUTROPENIA                        6
       Blood and lymphatic system disorders    Thrombocythaemia       PIASTRINOSIS                   1
       Blood and lymphatic system disorders    Thrombocytopenia       THROMBOCYTOPENIA        1
;
run;

data want(keep=text number);
set have;
array f{*} f:;
length text number $ 40;
number = ""; text = "";
do place = 1 to &nPos.-2;
if lag(f(place)) ne f(place) or not missing(text) then do;
  text = f(place);
  output;
end;
end;
text = f(&nPos.-1);
number = f(&nPos);
output;
run;

proc sql;
select text , number from want;
quit;

PG

PG
michtka
Fluorite | Level 6

I would like to use a code withouth macrovariables and arrrays Smiley Sad....still looking for help,

but anyway...you did your best..thanks.

PGStats
Opal | Level 21

Here is my second best... (inferior solution, without arrays or macro variables)

data have(keep=f1-f4);
length f1-f4 $40;
infile datalines truncover;
input (f1-f4) (&);
datalines;
Blood and lymphatic system disorders    Anaemia                ANEMIA                                 3
       Blood and lymphatic system disorders    Anaemia                ANEMY                                  1
       Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTOPENIA        1
       Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTROPENIA       1
       Blood and lymphatic system disorders    Febrile neutropenia    NEUTROPENIA FEBRILE       1
       Blood and lymphatic system disorders    Neutropenia            NEUTROPENIA                        6
       Blood and lymphatic system disorders    Thrombocythaemia       PIASTRINOSIS                   1
       Blood and lymphatic system disorders    Thrombocytopenia       THROMBOCYTOPENIA        1
;
run;

data want(keep=text number);
set have;
length text number $ 40;
number = ""; text = "";
if lag(f1) ne f1 or not missing(text) then do;
text = f1;
output;
end;
if lag(f2) ne f2 or not missing(text) then do;
text = f2;
output;
end;
text = f3;
number = f4;
output;
run;

proc sql;
select text , number from want;
quit;

PG

PG
Astounding
PROC Star

Here's a simple version to try.

data want (keep=col0 col1);

   set have;

   by assoc pref_term aeterm count;

   if first.assoc then do;

      col0=assoc;

      output;

   end;

   if first.pref_term then do;

      col0 = pref_term;

      output;

   end;

   col0 = aeterm;

  col1 = count;

   output;

run;

I assume that's a typo in the last line, for pref_term.  Good luck.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1657 views
  • 3 likes
  • 3 in conversation