I am working with a transaction type data set where there are multiple observations for individual children. Each child has a unique "child_id" and there are multiple observations, for which the child has an associated value for the variable TREATMENT. There may or may not be repeats for the values of TREATMENT. I would like to create a sequence of variables whereby I can see all of the values of the TREATMENTS that a child has received. I have generated dummy variables based on all of the possible TREATMENT values with the following:
PROC SQL NOPRINT;
select distinct TREATMENT
into Treatment_List SEPARATED BY ' '
FROM treatments;
quit;
%PUT &treatment_list;
DATA LIST (DROP=I);
SET treatments;
ARRAY TEMP{*} &treatment_list;
DO I=1 TO DIM(TEMP);
TEMP{I}=0;
END;
All of the dummy variables, which correspond to the TREATMENT levels, currently have a zero. However, I am unable to figure out how to code these dummy variables so that the treatments the children have received can be identified.
Thank you for your help.
data treatments;
input Child Treatment $;
datalines;
1 A
1 B
1 C
2 B
2 C
3 A
3 C
3 D
;
PROC SQL NOPRINT;
select distinct TREATMENT
into :Treatment_List SEPARATED BY ' '
FROM treatments;
quit;
%PUT &treatment_list;
DATA LIST (DROP=I);
SET treatments;
ARRAY TEMP{*} &treatment_list;
DO I=1 TO DIM(TEMP);
if Treatment=vname(temp{i}) then TEMP{I}=1;
else TEMP{I}=0;
END;
run;
proc sql;
create table want as
select child,sum(A) as A,sum(B) as B,sum(c) as C,sum(D) as D
from list
group by child;
quit;
Sample input of what your data looks like and what you want your output to look like would be helpful.
My current data:
Child | Treatment |
1 | A |
1 | B |
1 | C |
2 | B |
2 | C |
3 | A |
3 | C |
3 | D |
What I want my data to look like:
Child | A | B | C | D |
1 | 1 | 1 | 1 | 0 |
2 | 0 | 1 | 1 | 0 |
3 | 1 | 0 | 1 | 1 |
data treatments;
input Child Treatment $;
datalines;
1 A
1 B
1 C
2 B
2 C
3 A
3 C
3 D
;
PROC SQL NOPRINT;
select distinct TREATMENT
into :Treatment_List SEPARATED BY ' '
FROM treatments;
quit;
%PUT &treatment_list;
DATA LIST (DROP=I);
SET treatments;
ARRAY TEMP{*} &treatment_list;
DO I=1 TO DIM(TEMP);
if Treatment=vname(temp{i}) then TEMP{I}=1;
else TEMP{I}=0;
END;
run;
proc sql;
create table want as
select child,sum(A) as A,sum(B) as B,sum(c) as C,sum(D) as D
from list
group by child;
quit;
Hello,
If you don't mind missing instead of 0 you can use transpose:
data have;
input Child Treatment $;
datalines;
1 A
1 B
1 C
2 B
2 C
3 A
3 C
3 D
;
data int;
retain x 1;
set have;
run;
proc transpose data=int out=want (drop=_NAME_);
by child;
id Treatment;
var x;
run;
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.