I am working with a transaction type data set where there are multiple observations for individual children. Each child has a unique "child_id" and there are multiple observations, for which the child has an associated value for the variable TREATMENT. There may or may not be repeats for the values of TREATMENT. I would like to create a sequence of variables whereby I can see all of the values of the TREATMENTS that a child has received. I have generated dummy variables based on all of the possible TREATMENT values with the following:
PROC SQL NOPRINT;
select distinct TREATMENT
into Treatment_List SEPARATED BY ' '
FROM treatments;
quit;
%PUT &treatment_list;
DATA LIST (DROP=I);
SET treatments;
ARRAY TEMP{*} &treatment_list;
DO I=1 TO DIM(TEMP);
TEMP{I}=0;
END;
All of the dummy variables, which correspond to the TREATMENT levels, currently have a zero. However, I am unable to figure out how to code these dummy variables so that the treatments the children have received can be identified.
Thank you for your help.
data treatments;
input Child Treatment $;
datalines;
1 A
1 B
1 C
2 B
2 C
3 A
3 C
3 D
;
PROC SQL NOPRINT;
select distinct TREATMENT
into :Treatment_List SEPARATED BY ' '
FROM treatments;
quit;
%PUT &treatment_list;
DATA LIST (DROP=I);
SET treatments;
ARRAY TEMP{*} &treatment_list;
DO I=1 TO DIM(TEMP);
if Treatment=vname(temp{i}) then TEMP{I}=1;
else TEMP{I}=0;
END;
run;
proc sql;
create table want as
select child,sum(A) as A,sum(B) as B,sum(c) as C,sum(D) as D
from list
group by child;
quit;
Sample input of what your data looks like and what you want your output to look like would be helpful.
My current data:
Child | Treatment |
1 | A |
1 | B |
1 | C |
2 | B |
2 | C |
3 | A |
3 | C |
3 | D |
What I want my data to look like:
Child | A | B | C | D |
1 | 1 | 1 | 1 | 0 |
2 | 0 | 1 | 1 | 0 |
3 | 1 | 0 | 1 | 1 |
data treatments;
input Child Treatment $;
datalines;
1 A
1 B
1 C
2 B
2 C
3 A
3 C
3 D
;
PROC SQL NOPRINT;
select distinct TREATMENT
into :Treatment_List SEPARATED BY ' '
FROM treatments;
quit;
%PUT &treatment_list;
DATA LIST (DROP=I);
SET treatments;
ARRAY TEMP{*} &treatment_list;
DO I=1 TO DIM(TEMP);
if Treatment=vname(temp{i}) then TEMP{I}=1;
else TEMP{I}=0;
END;
run;
proc sql;
create table want as
select child,sum(A) as A,sum(B) as B,sum(c) as C,sum(D) as D
from list
group by child;
quit;
Hello,
If you don't mind missing instead of 0 you can use transpose:
data have;
input Child Treatment $;
datalines;
1 A
1 B
1 C
2 B
2 C
3 A
3 C
3 D
;
data int;
retain x 1;
set have;
run;
proc transpose data=int out=want (drop=_NAME_);
by child;
id Treatment;
var x;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.