Solved: Applying group to variables

grande44 · Posted 06-10-2015 10:40 AM

Hi everyone! I've been given a dataset where the first 3 columns are: Group, Subgroup, Code. As you can see, the way this person formatted it is that the first observation of a Code within a Group and Subgroup gives values for the Group and Subgroup. Then the following observations in the Group and Subgroup have these fields left blank. I have put an example of what the dataset looks like below.

How can I tell SAS to apply the correct names of the Group and Subgroup to each observation? So that, for example, if I make a new dataset and include a statement that says IF GROUP="AI"; statement, will the Codes 2262, 2271, and 2272 be included?

The first column is the name of the Group, the second column is the name of the Subgroup

1	AI	AZ	2201
2			2262
3			2271
4			2272
5	AO	AO	341
6			414
7		BB	731
8			2262

Thank you for your help!

data_null__ · Posted 06-10-2015 12:15 PM

Often when data is imported from excel with merged-cells this type of missing data pattern will created. You can use the features of the UPDATE statement to fill of LOCF the missing values. You just need to identify the variables to be operated on and create dummy BY variable a "necessary evil". I added that variable with a data step view.

data mergedcell;
   infile cards dsd; 
   input row group $ subgroup $ value;
   cards; 
1,AI,AZ,2201
2,,,2262
3,,,2271
4,,,2272
5,AO,AO,341
6,,,414
7,,BB,731
8,,,2262
9,,,
;;;;
   run; 
proc print; 
   run; 
data v / view=v;
   set mergedcell;
   retain dummy 1; 
   run; 
%let by = dummy;
%let fill = group subgroup;
data filled;
   if 0 then set mergedcell;
   update v(obs=0 keep=&by) v(keep=&by &fill);
   by &by;
   set mergedcell(drop=&fill);
   output; 
   drop dummy;
   run; 
proc print; 
   run; 
    

View solution in original post

RW9 · Posted 06-10-2015 11:19 AM

Well, as long as they are in the order required, then retain should work. (note, for future if you could put data in a datastep to reduce the amount of typing).

data have;

group="AI";

subgroup="AZ";

val=2201; output;

group=""; subgroup="";

val=2262; output;

val=2271; output;

val=2272; output;

group="AO";

subgroup="AO";

val=341; output;

group=""; subgroup="";

val=414; output;

subgroup="BB";

val=731; output;

subgroup="";

val=2262; output;

run;

data want (drop=lstgroup lstsubgroup);

set have;

retain lstgroup lstsubgroup;

if group ne "" then lstgroup=group;

else group=lstgroup;

if subgroup ne "" then lstsubgroup=subgroup;

else subgroup=lstsubgroup;

run;

PGStats · Posted 06-10-2015 11:52 AM

There is always the auto-update trick (learned from Tom) :

data have;

infile datalines truncover;

length group subgroup $4;

input obs 1-4 group 5-8 subgroup 9-12 x 13-16;

dum = 1;

datalines;

1 AI AZ 2201

2 2262

3 2271

4 2272

5 AO AO 341

6 414

7 BB 731

8 2262

;

data want;

update have(obs=0) have;

by dum;

output;

drop dum;

run;

PG

data_null__ · Posted 06-10-2015 12:15 PM

Often when data is imported from excel with merged-cells this type of missing data pattern will created. You can use the features of the UPDATE statement to fill of LOCF the missing values. You just need to identify the variables to be operated on and create dummy BY variable a "necessary evil". I added that variable with a data step view.

data mergedcell;
   infile cards dsd; 
   input row group $ subgroup $ value;
   cards; 
1,AI,AZ,2201
2,,,2262
3,,,2271
4,,,2272
5,AO,AO,341
6,,,414
7,,BB,731
8,,,2262
9,,,
;;;;
   run; 
proc print; 
   run; 
data v / view=v;
   set mergedcell;
   retain dummy 1; 
   run; 
%let by = dummy;
%let fill = group subgroup;
data filled;
   if 0 then set mergedcell;
   update v(obs=0 keep=&by) v(keep=&by &fill);
   by &by;
   set mergedcell(drop=&fill);
   output; 
   drop dummy;
   run; 
proc print; 
   run; 
    

Applying group to variables

Re: Applying group to variables

Re: Applying group to variables

Re: Applying group to variables

Re: Applying group to variables

Applying group to variables

Re: Applying group to variables

Re: Applying group to variables

Re: Applying group to variables

Re: Applying group to variables

Registration is open

SAS Training: Just a Click Away