Solved: What is wrong with my retain code?

Datino · Posted 02-09-2022 06:23 PM

My data looks like this:

I'm trying to make one observation for each by group with the values from the variables that are not missing. For instance the observation for the first group should be:

col0001 col0002 col0003 col0004 col0005 col0006 col0007 col0008 col0009 col0010 col0011 col0012
ID001002 Event1 03MAY2021 (54) 1 No No Yes No test 14 13

This is my code:

data support;
	set data;
	by col0001 col0002 col0003 col0004;

	retain col_0006 col_0007 col_0008 col_0009 col_0010 col_0011 col_0012 col_0013 col_0014;

	if first.col0004 then
	do;
		col_0006="";
		col_0007="";
		col_0008="";
		col_0009="";
		col_0010="";
		col_0011="";
		col_0012="";
		col_0013="";
		col_0014="";
	end;

	if not missing (col0006) then col_0006= col0006;
	if not missing (col0007) then col_0007= col0007;
	if not missing (col0008) then col_0008= col0008;
	if not missing (col0009) then col_0009= col0009;
	if not missing (col0010) then col_0010= col0010;
	if not missing (col0011) then col_0011= col0011;
	if not missing (col0012) then col_0012= col0012;
	if not missing (col0013) then col_0013= col0013;
	if not missing (col0014) then col_0014= col0014;	

	if last.col0004 then
	output;
run;

And this is the output:

What am I doing wrong?

ballardw · Posted 02-09-2022 08:12 PM

Your source data set, at least in the first pictures has variables that are on your RETAIN statement. Example Col_0006.

As such every time the data vector is read the value read from the existing data replaces the "retained" value.

So you don't get to change any of the retained variables.

Maybe this small example will demonstrate;

data have;
  input x;
datalines;
1
2
3
;

data want;
   set have;
   retain x newx;
   newx= x;
   x= x+6;
run;

You can see that Newx has the value from the Have dataset, indicating the value of X is reset when the next record is read.

View solution in original post

Reeza · Posted 02-09-2022 06:46 PM

I can't comment on the code, besides requesting the log as the code looks correct. My next guess would be that those cells that look missing may not be - issue with character fields sometimes. Look at the COMPRESS() function with the s modifier to remove those 'invisible blanks'.

However, if this is your goal:

I'm trying to make one observation for each by group with the values from the variables that are not missing.

Does something like this work though, SQL functions work on character columns so it's a nice lazy way to avoid this.

proc sql;
create table want as
select col0001, col0002, col0003, col0004, max(col0005) as col0005, max(col0006) as col0006
from have
group by 1, 2, 3, 4;
quit;

Or this:

data want2;
update have(obs=0) have;
by col001 col002 col003 col004;
run;

@Datino wrote:

My data looks like this:

I'm trying to make one observation for each by group with the values from the variables that are not missing. For instance the observation for the first group should be:

col0001 col0002 col0003 col0004 col0005 col0006 col0007 col0008 col0009 col0010 col0011 col0012
ID001002 Event1 03MAY2021 (54) 1 No No Yes No test 14 13

This is my code:
data support;
	set data;
	by col0001 col0002 col0003 col0004;

	retain col_0006 col_0007 col_0008 col_0009 col_0010 col_0011 col_0012 col_0013 col_0014;

	if first.col0004 then
	do;
		col_0006="";
		col_0007="";
		col_0008="";
		col_0009="";
		col_0010="";
		col_0011="";
		col_0012="";
		col_0013="";
		col_0014="";
	end;

	if not missing (col0006) then col_0006= col0006;
	if not missing (col0007) then col_0007= col0007;
	if not missing (col0008) then col_0008= col0008;
	if not missing (col0009) then col_0009= col0009;
	if not missing (col0010) then col_0010= col0010;
	if not missing (col0011) then col_0011= col0011;
	if not missing (col0012) then col_0012= col0012;
	if not missing (col0013) then col_0013= col0013;
	if not missing (col0014) then col_0014= col0014;	

	if last.col0004 then
	output;
run;
And this is the output:

What am I doing wrong?

Datino · Posted 02-10-2022 09:46 PM

Thanks. I'm trying to avoid using sql as a learning exercise.

ballardw · Posted 02-09-2022 08:12 PM

Your source data set, at least in the first pictures has variables that are on your RETAIN statement. Example Col_0006.

As such every time the data vector is read the value read from the existing data replaces the "retained" value.

So you don't get to change any of the retained variables.

Maybe this small example will demonstrate;

data have;
  input x;
datalines;
1
2
3
;

data want;
   set have;
   retain x newx;
   newx= x;
   x= x+6;
run;

You can see that Newx has the value from the Have dataset, indicating the value of X is reset when the next record is read.

Datino · Posted 02-10-2022 09:43 PM

So, the solution is to create new variables that are not in the source data set?

s_lassen · Posted 02-10-2022 09:39 AM

As @ballardw remarked, the variables that you retain are also on the input data. To make your code work, you can do something like this:

data support;
  set data(drop=col_:);
  by col0001 col0002 col0003 col0004;
  if 0 then set data(keep=col_:);

And then the rest of your code unchanged. This way, the lengths of the COL_ variables are set by the second SET statement (which is never executed), just a lazy way of defining them without actually reading them.

Datino · Posted 02-10-2022 09:45 PM

Thanks, your code worked very well, although I'm not completely sure I understand what the if statement does.

 if 0 then set data(keep=col_:);

s_lassen · Posted 02-18-2022 10:54 AM

The IF 0 THEN statement is, as I said, a lazy way of defining the variables mentioned in the KEEP= option. The data are never read, but the variables still get defined.

What is wrong with my retain code?

Re: What is wrong with my retain code?

Re: What is wrong with my retain code?

Re: What is wrong with my retain code?

Re: What is wrong with my retain code?

Re: What is wrong with my retain code?

Re: What is wrong with my retain code?

Re: What is wrong with my retain code?

Re: What is wrong with my retain code?

Register Today!

SAS Training: Just a Click Away