Solved: Assign Existing Variable Values to All Other Observations Within Group...

gentd · Posted 05-26-2021 09:07 PM

This may be a basic question but it has been difficult to find a solution on my own.

I have data that is organized into multiple groups, including trial, treatment, and replication. I wish to create a new variable for the entire data set using values from an observation presently assigned to a given treatment within a trial and replication (for later use as a covariate). This new variable should be assigned to all other observations with a given trial and replication.

Here is what I currently have:

Data Have;
Input Trial Treatment Replication VariableX $;
datalines;
1 1 1 a
1 1 2 b
1 2 1 c
1 2 2 d
1 3 1 e
1 3 2 f
2 1 1 g
2 1 2 h
2 2 1 i
2 2 2 j
2 3 1 k
2 3 2 l
;

This is what I want:

Data Want;
Input Trial Treatment Replication VariableX $ VariableY $;
datalines;
1 1 1 a a
1 1 2 b b
1 2 1 c a
1 2 2 d b
1 3 1 e a
1 3 2 f b
2 1 1 g g
2 1 2 h h
2 2 1 i g
2 2 2 j h
2 3 1 k g
2 3 2 l h
;

Any help would be greatly appreciated.

Angel_Larrion · Posted 05-26-2021 09:54 PM

The following code "saves" the observed value of VariableX from the first (lowest-numbered) treatment for each combination of Trial and Replication, and then joins it back to the original table as VariableY:

proc sql;
create table want as
select a.Trial, a.Treatment,a.Replication,a.VariableX, b.VariableX as VariableY
from (select Trial, Treatment,Replication,VariableX 
	  from have) as a                                   

		inner join 

	(select Trial, Replication, VariableX      /*Select the value of VariableX of the first observed(lowest value) treatment, for each combination of Trial and Replication*/
	 from have
	 group by Trial, Replication
	 having Treatment=min(Treatment)) as b on  (a.Trial=b.Trial and a.Replication=b.Replication)    
;
quit;

gentd · Posted 05-27-2021 02:25 PM

Thank you Angel_Larrion! My actual data set has numeric values for 'Treatment' but I was able to modify your code to select the values for the 'Non-Treated' treatment of interest.
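
A minimal sketch of that kind of modification, assuming the untreated control is identified by a known numeric Treatment value. The macro variable name ref_trt and its value 1 are placeholders for illustration, not taken from the actual data set:

```sas
/* Subset of the sample data from the question; VariableX is read as character */
data have;
  input Trial Treatment Replication VariableX $;
  datalines;
1 1 1 a
1 1 2 b
1 2 1 c
1 2 2 d
2 1 1 g
2 1 2 h
2 2 1 i
2 2 2 j
;

/* Hypothetical: the Treatment value that marks the control */
%let ref_trt = 1;

proc sql;
  create table want as
  select a.*, b.VariableX as VariableY
  from have as a
       left join
       /* One control row per Trial/Replication supplies VariableY */
       (select Trial, Replication, VariableX
        from have
        where Treatment = &ref_trt) as b
    on a.Trial = b.Trial and a.Replication = b.Replication
  order by a.Trial, a.Treatment, a.Replication;
quit;
```

The left join keeps rows for any Trial/Replication combination that has no control observation and leaves VariableY missing there; an inner join would drop those rows instead.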

mkeintz · Posted 05-26-2021 11:40 PM

The code below was submitted from my smartphone, which is not offered a code submission option on this website.

data Have;
input Trial Treatment Replication VariableX $;
datalines;
1 1 1 a
1 1 2 b
1 2 1 c
1 2 2 d
1 3 1 e
1 3 2 f
2 1 1 g
2 1 2 h
2 2 1 i
2 2 2 j
2 3 1 k
2 3 2 l
;

data want (drop=_:);
set have;
by trial treatment;
retain _get . ;
/* Character array: one slot per replication, filled from the first treatment */
array tmp{10} $ _temporary_;
if first.trial then call missing (_get, of tmp{*});
else if first.treatment then _get=1;
/* First treatment of a trial stores its values; later treatments only read them */
if _get ne 1 then tmp[replication]=variablex;
VariableY=tmp[replication];
run;

The above accommodates replications 1 through 10.
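
If the number of replications is not known in advance, a hash lookup keyed on Trial and Replication avoids having to size an array. The following is a sketch rather than a tested drop-in. It assumes the reference treatment is Treatment=1 and that VariableX is read as character (with `$` on the INPUT statement); the hash name ref is arbitrary:

```sas
/* Subset of the sample data from the question; VariableX is read as character */
data have;
  input Trial Treatment Replication VariableX $;
  datalines;
1 1 1 a
1 1 2 b
1 2 1 c
1 2 2 d
2 1 1 g
2 1 2 h
2 2 1 i
2 2 2 j
;

data want;
  set have;
  if _n_ = 1 then do;
    /* Never executes; only adds VariableY to the PDV with VariableX's attributes */
    if 0 then set have(keep=VariableX rename=(VariableX=VariableY));
    /* Load the reference-treatment rows (assumed Treatment=1) into the hash */
    declare hash ref(dataset: "have(where=(Treatment=1) rename=(VariableX=VariableY))");
    ref.defineKey("Trial", "Replication");
    ref.defineData("VariableY");
    ref.defineDone();
  end;
  /* find() returns 0 on a match and fills VariableY; otherwise clear the retained value */
  if ref.find() ne 0 then call missing(VariableY);
run;
```

Unlike the array version, this does not depend on the sort order of have, at the cost of holding the reference rows in memory.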

s_lassen · Posted 05-27-2021 02:24 AM

I would do it like this:

Data Have;
Input Trial Treatment Replication VariableX $;
datalines;
1 1  1 a
1 1  2 b
1 2  1 c
1 2  2 d
1 3  1 e
1 3  2 f
2 1  1 g
2 1  2 h
2 2  1 i
2 2  2 j
2 3  1 k
2 3  2 l
;run;

proc sort data=have;
  by trial replication treatment;
run;

data want;
  set have;
  by trial replication;
  if first.replication then VariableY=VariableX;
  retain VariableY;
run;

proc sort data=want;
  by trial treatment replication;
run;

Ksharp · Posted 05-27-2021 08:33 AM

Data Have;
Input Trial Treatment Replication VariableX $;
datalines;
1 1  1 a
1 1  2 b
1 2  1 c
1 2  2 d
1 3  1 e
1 3  2 f
2 1  1 g
2 1  2 h
2 2  1 i
2 2  2 j
2 3  1 k
2 3  2 l
;

proc sql;
create table want as
select a.*,b.VariableX as VariableY
 from have as a left join (select * from have where Treatment=1) as b 
  on a.Trial=b.Trial and a.Replication=b.Replication
   order by 1,2,3
  ;
quit;
