- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I've always gotten the help I needed here, so I'm asking for expertise once again.
I have data like this (sorry it's not that creative):
ID | Service | Date | Location |
A | cleaning | 5/22/2018 | 1 |
A | washing | 5/22/2018 | 1 |
A | cooking | 5/23/2018 | 2 |
B | organizing | 5/23/2018 | 3 |
B | washing | 5/23/2018 | 3 |
B | cooking | 5/23/2018 | 3 |
C | cooking | 5/21/2018 | 4 |
C | organizing | 5/22/2018 | 1 |
D | cleaning | 5/19/2018 | 1 |
D | washing | 5/20/2018 | 1 |
D | cooking | 5/21/2018 | 4 |
D | organizing | 5/21/2018 | 5 |
It includes other variables too, but ultimately, I want to be able to produce at least something like this:
ID | Service | Date | Location |
A | cleaning | 5/22/2018 | 1 |
A | cooking | 5/23/2018 | 2 |
B | organizing | 5/23/2018 | 3 |
C | cooking | 5/21/2018 | 4 |
C | organizing | 5/22/2018 | 1 |
D | cleaning | 5/19/2018 | 1 |
D | washing | 5/20/2018 | 1 |
D | cooking | 5/21/2018 | 4 |
D | organizing | 5/21/2018 | 5 |
It kept only one "service" (doesn't matter which one) per that day (Date), for that person (ID). I would normally use "if first.id;" in a data step, but I would'nt know how to do that involving per date and only for same locations. Notice ID "D" kept both services ("cooking" and "organizing") for the same date ("5/21/2018") because it happened at two different locations ("4" and "5").
If would be best though to have something like this, tabulated, if possible:
ID | Service | Date | Location | Sevice_2 | Location_2 | Service_3 | Location_3 |
A | cleaning | 5/22/2018 | 1 | washing | 1 | ||
A | cooking | 5/23/2018 | 2 | ||||
B | organizing | 5/23/2018 | 3 | washing | 3 | cooking | 3 |
C | cooking | 5/21/2018 | 4 | ||||
C | organizing | 5/22/2018 | 1 | ||||
D | cleaning | 5/19/2018 | 1 | ||||
D | washing | 5/20/2018 | 1 | ||||
D | cooking | 5/21/2018 | 4 | ||||
D | organizing | 5/21/2018 | 5 |
Thank you in advance for your input.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You were likely close but you didn't show your code so hard to tell:
Try something like this:
proc sort data=have; by id location date; run; data want; set have; by id location date; if first.location; run;
For almost anything except a proc print report you do not want to use that wide format. You have to keep track of a bunch of variables that are not populated for each resulting record, if you do this process with a new data set you may have a differing number of "wide" columns that means your code developed from the point of making the "wide" data set may not run because of referencing variables no longer there OR failing because it does not address added columns.
What will the next steps of your process be?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You were likely close but you didn't show your code so hard to tell:
Try something like this:
proc sort data=have; by id location date; run; data want; set have; by id location date; if first.location; run;
For almost anything except a proc print report you do not want to use that wide format. You have to keep track of a bunch of variables that are not populated for each resulting record, if you do this process with a new data set you may have a differing number of "wide" columns that means your code developed from the point of making the "wide" data set may not run because of referencing variables no longer there OR failing because it does not address added columns.
What will the next steps of your process be?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Notice that your preferred result only requires one Location field. Here's a way to get that:
proc transpose data=have prefix=Service_ out=want (drop=_name_);
by id date location;
var service;
run;