I've always gotten the help I needed here, so I'm asking for expertise once again.
I have data like this (sorry it's not that creative):
ID | Service | Date | Location |
A | cleaning | 5/22/2018 | 1 |
A | washing | 5/22/2018 | 1 |
A | cooking | 5/23/2018 | 2 |
B | organizing | 5/23/2018 | 3 |
B | washing | 5/23/2018 | 3 |
B | cooking | 5/23/2018 | 3 |
C | cooking | 5/21/2018 | 4 |
C | organizing | 5/22/2018 | 1 |
D | cleaning | 5/19/2018 | 1 |
D | washing | 5/20/2018 | 1 |
D | cooking | 5/21/2018 | 4 |
D | organizing | 5/21/2018 | 5 |
It includes other variables too, but ultimately, I want to be able to produce at least something like this:
ID | Service | Date | Location |
A | cleaning | 5/22/2018 | 1 |
A | cooking | 5/23/2018 | 2 |
B | organizing | 5/23/2018 | 3 |
C | cooking | 5/21/2018 | 4 |
C | organizing | 5/22/2018 | 1 |
D | cleaning | 5/19/2018 | 1 |
D | washing | 5/20/2018 | 1 |
D | cooking | 5/21/2018 | 4 |
D | organizing | 5/21/2018 | 5 |
It kept only one "service" (doesn't matter which one) per that day (Date), for that person (ID). I would normally use "if first.id;" in a data step, but I would'nt know how to do that involving per date and only for same locations. Notice ID "D" kept both services ("cooking" and "organizing") for the same date ("5/21/2018") because it happened at two different locations ("4" and "5").
If would be best though to have something like this, tabulated, if possible:
ID | Service | Date | Location | Sevice_2 | Location_2 | Service_3 | Location_3 |
A | cleaning | 5/22/2018 | 1 | washing | 1 | ||
A | cooking | 5/23/2018 | 2 | ||||
B | organizing | 5/23/2018 | 3 | washing | 3 | cooking | 3 |
C | cooking | 5/21/2018 | 4 | ||||
C | organizing | 5/22/2018 | 1 | ||||
D | cleaning | 5/19/2018 | 1 | ||||
D | washing | 5/20/2018 | 1 | ||||
D | cooking | 5/21/2018 | 4 | ||||
D | organizing | 5/21/2018 | 5 |
Thank you in advance for your input.
You were likely close but you didn't show your code so hard to tell:
Try something like this:
proc sort data=have; by id location date; run; data want; set have; by id location date; if first.location; run;
For almost anything except a proc print report you do not want to use that wide format. You have to keep track of a bunch of variables that are not populated for each resulting record, if you do this process with a new data set you may have a differing number of "wide" columns that means your code developed from the point of making the "wide" data set may not run because of referencing variables no longer there OR failing because it does not address added columns.
What will the next steps of your process be?
You were likely close but you didn't show your code so hard to tell:
Try something like this:
proc sort data=have; by id location date; run; data want; set have; by id location date; if first.location; run;
For almost anything except a proc print report you do not want to use that wide format. You have to keep track of a bunch of variables that are not populated for each resulting record, if you do this process with a new data set you may have a differing number of "wide" columns that means your code developed from the point of making the "wide" data set may not run because of referencing variables no longer there OR failing because it does not address added columns.
What will the next steps of your process be?
Notice that your preferred result only requires one Location field. Here's a way to get that:
proc transpose data=have prefix=Service_ out=want (drop=_name_);
by id date location;
var service;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.