BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
grande44
Calcite | Level 5

Hi everyone! I've been given a dataset where the first 3 columns are: Group, Subgroup, Code. As you can see, the way this person formatted it is that the first observation of a Code within a Group and Subgroup gives values for the Group and Subgroup. Then the following observations in the Group and Subgroup have these fields left blank. I have put an example of what the dataset looks like below.

How can I tell SAS to apply the correct names of the Group and Subgroup to each observation? So that, for example, if I make a new dataset and include a statement that says IF GROUP="AI"; statement, will the Codes 2262, 2271, and 2272 be included?

The first column is the name of the Group, the second column is the name of the Subgroup

1AIAZ2201
22262
32271
42272
5AOAO341
6414
7BB731
82262

Thank you for your help!

1 ACCEPTED SOLUTION

Accepted Solutions
data_null__
Jade | Level 19

Often when data is imported from excel with merged-cells this type of missing data pattern will created.  You can use the features of the UPDATE statement to fill of LOCF the missing values. You just need to identify the variables to be operated on and create dummy BY variable a "necessary evil".  I added that variable with a data step view.

data mergedcell;
   infile cards dsd;
  
input row group $ subgroup $ value;
   cards;
1,AI,AZ,2201
2,,,2262
3,,,2271
4,,,2272
5,AO,AO,341
6,,,414
7,,BB,731
8,,,2262
9,,,
;;;;
   run;
proc print;
  
run;
data v / view=v;
   set mergedcell;
   retain dummy 1;
  
run;
%let by = dummy;
%let fill = group subgroup;
data filled;
   if 0 then set mergedcell;
   update v(obs=0 keep=&by) v(keep=&by &fill);
   by &by;
   set mergedcell(drop=&fill);
   output;
  
drop dummy;
   run;
proc print;
  
run;
   
6-10-2015 11-13-39 AM.png

View solution in original post

3 REPLIES 3
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, as long as they are in the order required, then retain should work.  (note, for future if you could put data in a datastep to reduce the amount of typing).

data have;

group="AI";

subgroup="AZ";

val=2201; output;

group=""; subgroup="";

val=2262; output;

val=2271; output;

val=2272; output;

group="AO";

subgroup="AO";

val=341; output;

group=""; subgroup="";

val=414; output;

subgroup="BB";

val=731; output;

subgroup="";

val=2262; output;

run;

data want (drop=lstgroup lstsubgroup);

  set have;

  retain lstgroup lstsubgroup;

  if group ne "" then lstgroup=group;

  else group=lstgroup;

  if subgroup ne "" then lstsubgroup=subgroup;

  else subgroup=lstsubgroup;

run;

PGStats
Opal | Level 21

There is always the auto-update trick (learned from Tom) :

data have;

infile datalines truncover;

length group subgroup $4;

input obs 1-4 group 5-8 subgroup 9-12 x 13-16;

dum = 1;

datalines;

1   AI  AZ  2201

2           2262

3           2271

4           2272

5   AO  AO  341

6           414

7       BB  731

8           2262

;

data want;

update have(obs=0) have;

by dum;

output;

drop dum;

run;

PG

PG
data_null__
Jade | Level 19

Often when data is imported from excel with merged-cells this type of missing data pattern will created.  You can use the features of the UPDATE statement to fill of LOCF the missing values. You just need to identify the variables to be operated on and create dummy BY variable a "necessary evil".  I added that variable with a data step view.

data mergedcell;
   infile cards dsd;
  
input row group $ subgroup $ value;
   cards;
1,AI,AZ,2201
2,,,2262
3,,,2271
4,,,2272
5,AO,AO,341
6,,,414
7,,BB,731
8,,,2262
9,,,
;;;;
   run;
proc print;
  
run;
data v / view=v;
   set mergedcell;
   retain dummy 1;
  
run;
%let by = dummy;
%let fill = group subgroup;
data filled;
   if 0 then set mergedcell;
   update v(obs=0 keep=&by) v(keep=&by &fill);
   by &by;
   set mergedcell(drop=&fill);
   output;
  
drop dummy;
   run;
proc print;
  
run;
   
6-10-2015 11-13-39 AM.png

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 961 views
  • 8 likes
  • 4 in conversation