Hello,
I am working on a data set in which a person has multiple observations (rows). A person can retrieve balls that vary in color. A person can also transfer and re-store balls in the same bin. I need to define a variable that indicates the color group, defined as the color of the first retrieval. If there is another retrieval at a later date, the color group will change to the color of the second retrieval, for all remaining iterations, and so on for more retrievals.
The data looks as follows:
ID | visit | num_retrieved | num_transferred | num_restored | retrieval_interation | color | group (need) |
1 | 1 | 3 | 2 | 1 | 1 | blue | blue |
1 | 2 | 0 | 1 | 0 | blue | ||
1 | 3 | 2 | 0 | 2 | 2 | red | red |
1 | 4 | 0 | 0 | 0 | red | ||
2 | 1 | 0 | 0 | 0 | |||
2 | 2 | 1 | 0 | 1 | 1 | green | green |
2 | 3 | 0 | 1 | 0 | green | ||
3 | 1 | 7 | 2 | 5 | 1 | red | red |
3 | 2 | 2 | 4 | 3 | 2 | green | green |
if anyone can help me get the group column above, that would be greatly appreciated! Thanks in advance!
data have;
input id visit num_retrieved num_transferred num_restored retrieval_iteration color $;
datalines;
1 1 3 2 1 1 blue
1 2 0 1 0 . ""
1 3 2 0 2 2 red
1 4 0 0 0 . ""
2 1 0 0 0 . ""
2 2 1 0 1 1 green
2 3 0 1 0 . ""
3 1 7 2 5 1 red
3 2 2 4 3 2 green
;
run;
data have;
set have;
if color = '""' then color = "";
run;
Try this
data have;
input id visit num_retrieved num_transferred num_restored retrieval_iteration color $;
infile datalines missover;
datalines;
1 1 3 2 1 1 blue
1 2 0 1 0 .
1 3 2 0 2 2 red
1 4 0 0 0 .
2 1 0 0 0 .
2 2 1 0 1 1 green
2 3 0 1 0 .
3 1 7 2 5 1 red
3 2 2 4 3 2 green
;
data want;
set have;
by id;
length group $ 8;
if first.id then group = '';
if color ne '' then group = color;
retain group;
run;
Why does retrieval_interation contain the color values suddenly? Is that a typo?
Try this
data have;
input id visit num_retrieved num_transferred num_restored retrieval_iteration color $;
infile datalines missover;
datalines;
1 1 3 2 1 1 blue
1 2 0 1 0 .
1 3 2 0 2 2 red
1 4 0 0 0 .
2 1 0 0 0 .
2 2 1 0 1 1 green
2 3 0 1 0 .
3 1 7 2 5 1 red
3 2 2 4 3 2 green
;
data want;
set have;
by id;
length group $ 8;
if first.id then group = '';
if color ne '' then group = color;
retain group;
run;
Note you can use period to mark missing character values also. The normal character informat will convert a single period to blank(s).
datalines;
1 1 3 2 1 1 blue
1 2 0 1 0 . .
1 3 2 0 2 2 red
1 4 0 0 0 . .
2 1 0 0 0 . .
2 2 1 0 1 1 green
2 3 0 1 0 . .
3 1 7 2 5 1 red
3 2 2 4 3 2 green
;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.