This may be a basic question but it has been difficult to find a solution on my own.
I have data that is organized into multiple groups, including trial, treatment, and replication. I wish to create a new variable for the entire data set using values from an observation presently assigned to a given treatment within a trial and replication (for later use as a covariate). This new variable should be assigned to all other observations with a given trial and replication.
Here is what I current have:
Data Have;
Input Trial Treatment Replication VariableX;
datalines;
1 1 1 a
1 1 2 b
1 2 1 c
1 2 2 d
1 3 1 e
1 3 2 f
2 1 1 g
2 1 2 h
2 2 1 i
2 2 2 j
2 3 1 k
2 3 2 l
;
This is what I want:
Data Want;
Input Trial Treatment Replication VariableX VariableY;
datalines;
1 1 1 a a
1 1 2 b b
1 2 1 c a
1 2 2 d b
1 3 1 e a
1 3 2 f b
2 1 1 g g
2 1 2 h h
2 2 1 i g
2 2 2 j h
2 3 1 k g
2 3 2 l h
;
Any help would be greatly appreciated.
The following code "saves" the observed value of VariableX from the first treatment for each combination of Test and Replication, and then pastes it to the original table as VariableY:
proc sql;
create table want as
select a.Trial, a.Treatment,a.Replication,a.VariableX, b.VariableX as VariableY
from (select Trial, Treatment,Replication,VariableX
from have) as a
inner join
(select Trial, Replication, VariableX /*Select the value of VariableX of the first observed(lowest value) treatment, for each combination of Trial and Replication*/
from have
group by Trial, Replication
having Treatment=min(Treatment)) as b on (a.Trial=b.Trial and a.Replication=b.Replication)
;
quit;
The following code "saves" the observed value of VariableX from the first treatment for each combination of Test and Replication, and then pastes it to the original table as VariableY:
proc sql;
create table want as
select a.Trial, a.Treatment,a.Replication,a.VariableX, b.VariableX as VariableY
from (select Trial, Treatment,Replication,VariableX
from have) as a
inner join
(select Trial, Replication, VariableX /*Select the value of VariableX of the first observed(lowest value) treatment, for each combination of Trial and Replication*/
from have
group by Trial, Replication
having Treatment=min(Treatment)) as b on (a.Trial=b.Trial and a.Replication=b.Replication)
;
quit;
I would do it like this:
Data Have;
Input Trial Treatment Replication VariableX $;
datalines;
1 1 1 a
1 1 2 b
1 2 1 c
1 2 2 d
1 3 1 e
1 3 2 f
2 1 1 g
2 1 2 h
2 2 1 i
2 2 2 j
2 3 1 k
2 3 2 l
;run;
proc sort data=have;
by trial replication treatment;
run;
data want;
set have;
by trial replication;
if first.replication then VariableY=VariableX;
retain VariableY;
run;
proc sort data=want;
by trial treatment replication;
run;
Data Have;
Input Trial Treatment Replication VariableX $;
datalines;
1 1 1 a
1 1 2 b
1 2 1 c
1 2 2 d
1 3 1 e
1 3 2 f
2 1 1 g
2 1 2 h
2 2 1 i
2 2 2 j
2 3 1 k
2 3 2 l
;
proc sql;
create table want as
select a.*,b.VariableX as VariableY
from have as a left join (select * from have where Treatment=1) as b
on a.Trial=b.Trial and a.Replication=b.Replication
order by 1,2,3
;
quit;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.