## group subgroup variables

Solved
Super Contributor
Posts: 301

# group subgroup variables

Hi everyone, I've got the next datset with 4 variables:

aesoc                                           pref_term             aeterm                            count

Blood and lymphatic system disorders    Anaemia                ANEMIA                                 3

Blood and lymphatic system disorders    Anaemia                ANEMY                                  1

Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTOPENIA        1

Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTROPENIA       1

Blood and lymphatic system disorders    Febrile neutropenia    NEUTROPENIA FEBRILE       1

Blood and lymphatic system disorders    Neutropenia            NEUTROPENIA                        6

Blood and lymphatic system disorders    Thrombocythaemia       PIASTRINOSIS                   1

Blood and lymphatic system disorders    Thrombocytopenia       THROMBOCYTOPENIA        1

I would like to transform this dataset to another dataset with two columns :

col0                                     col1

BLOOD AND LYMPHATIC SYSTEM DISORDERS

Anaemia

ANEMIA                                                               3

ANEMY                                                                1

Febrile neutropenia

FEBRILE NEUTOPENIA                                         1

FEBRILE NEUTROPENIA                                       1

NEUTROPENIA FEBRILE                                       1

Neutropenia

NEUTROPENIA                                                     6

Thrombocythaemia

PIASTRINOSIS                                                     1

THROMBOCYTOPENIA                                         1

I assume that I wil ned to create two new variables: group for aesoc, and subgroup for aept, to make this clasiffication in col0, but I would like you to help me

with this.

J V.

Accepted Solutions
Solution
‎03-18-2012 05:36 PM
Super User
Posts: 6,778

## group subgroup variables

Here's a simple version to try.

data want (keep=col0 col1);

set have;

by assoc pref_term aeterm count;

if first.assoc then do;

col0=assoc;

output;

end;

if first.pref_term then do;

col0 = pref_term;

output;

end;

col0 = aeterm;

col1 = count;

output;

run;

I assume that's a typo in the last line, for pref_term.  Good luck.

All Replies
Posts: 5,531

## group subgroup variables

try this:

%let nPos = 4; /* order of last column */

data have(keep=f;
array f{&nPos.} \$40;
infile datalines truncover;
input (f(*)) (&);
datalines;
Blood and lymphatic system disorders    Anaemia                ANEMIA                                 3
Blood and lymphatic system disorders    Anaemia                ANEMY                                  1
Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTOPENIA        1
Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTROPENIA       1
Blood and lymphatic system disorders    Febrile neutropenia    NEUTROPENIA FEBRILE       1
Blood and lymphatic system disorders    Neutropenia            NEUTROPENIA                        6
Blood and lymphatic system disorders    Thrombocythaemia       PIASTRINOSIS                   1
Blood and lymphatic system disorders    Thrombocytopenia       THROMBOCYTOPENIA        1
;
run;

data want(keep=text number);
set have;
array f{*} f:;
length text number \$ 40;
number = ""; text = "";
do place = 1 to &nPos.-2;
if lag(f(place)) ne f(place) or not missing(text) then do;
text = f(place);
output;
end;
end;
text = f(&nPos.-1);
number = f(&nPos);
output;
run;

proc sql;
select text , number from want;
quit;

PG

PG
Super Contributor
Posts: 301

## group subgroup variables

I would like to use a code withouth macrovariables and arrrays ....still looking for help,

Posts: 5,531

## group subgroup variables

Here is my second best... (inferior solution, without arrays or macro variables)

data have(keep=f1-f4);
length f1-f4 \$40;
infile datalines truncover;
input (f1-f4) (&);
datalines;
Blood and lymphatic system disorders    Anaemia                ANEMIA                                 3
Blood and lymphatic system disorders    Anaemia                ANEMY                                  1
Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTOPENIA        1
Blood and lymphatic system disorders    Febrile neutropenia    FEBRILE NEUTROPENIA       1
Blood and lymphatic system disorders    Febrile neutropenia    NEUTROPENIA FEBRILE       1
Blood and lymphatic system disorders    Neutropenia            NEUTROPENIA                        6
Blood and lymphatic system disorders    Thrombocythaemia       PIASTRINOSIS                   1
Blood and lymphatic system disorders    Thrombocytopenia       THROMBOCYTOPENIA        1
;
run;

data want(keep=text number);
set have;
length text number \$ 40;
number = ""; text = "";
if lag(f1) ne f1 or not missing(text) then do;
text = f1;
output;
end;
if lag(f2) ne f2 or not missing(text) then do;
text = f2;
output;
end;
text = f3;
number = f4;
output;
run;

proc sql;
select text , number from want;
quit;

PG

PG
Solution
‎03-18-2012 05:36 PM
Super User
Posts: 6,778

## group subgroup variables

Here's a simple version to try.

data want (keep=col0 col1);

set have;

by assoc pref_term aeterm count;

if first.assoc then do;

col0=assoc;

output;

end;

if first.pref_term then do;

col0 = pref_term;

output;

end;

col0 = aeterm;

col1 = count;

output;

run;

I assume that's a typo in the last line, for pref_term.  Good luck.

🔒 This topic is solved and locked.