Solved: Re: Keeping the first record for each date, per ID. But keeping them ...

ppphcyv · Posted 05-23-2018 01:19 PM

I've always gotten the help I needed here, so I'm asking for expertise once again.

I have data like this (sorry it's not that creative):

ID	Service	Date	Location
A	cleaning	5/22/2018	1
A	washing	5/22/2018	1
A	cooking	5/23/2018	2
B	organizing	5/23/2018	3
B	washing	5/23/2018	3
B	cooking	5/23/2018	3
C	cooking	5/21/2018	4
C	organizing	5/22/2018	1
D	cleaning	5/19/2018	1
D	washing	5/20/2018	1
D	cooking	5/21/2018	4
D	organizing	5/21/2018	5

It includes other variables too, but ultimately, I want to be able to produce at least something like this:

ID	Service	Date	Location
A	cleaning	5/22/2018	1
A	cooking	5/23/2018	2
B	organizing	5/23/2018	3
C	cooking	5/21/2018	4
C	organizing	5/22/2018	1
D	cleaning	5/19/2018	1
D	washing	5/20/2018	1
D	cooking	5/21/2018	4
D	organizing	5/21/2018	5

It kept only one "service" (doesn't matter which one) per that day (Date), for that person (ID). I would normally use "if first.id;" in a data step, but I would'nt know how to do that involving per date and only for same locations. Notice ID "D" kept both services ("cooking" and "organizing") for the same date ("5/21/2018") because it happened at two different locations ("4" and "5").

If would be best though to have something like this, tabulated, if possible:

ID	Service	Date	Location	Sevice_2	Location_2	Service_3	Location_3
A	cleaning	5/22/2018	1	washing	1
A	cooking	5/23/2018	2
B	organizing	5/23/2018	3	washing	3	cooking	3
C	cooking	5/21/2018	4
C	organizing	5/22/2018	1
D	cleaning	5/19/2018	1
D	washing	5/20/2018	1
D	cooking	5/21/2018	4
D	organizing	5/21/2018	5

Thank you in advance for your input.

ballardw · Posted 05-23-2018 01:33 PM

You were likely close but you didn't show your code so hard to tell:

Try something like this:

proc sort data=have;
   by id location date;
run;

data want;
  set have;
  by id location date;
  if first.location;
run;

For almost anything except a proc print report you do not want to use that wide format. You have to keep track of a bunch of variables that are not populated for each resulting record, if you do this process with a new data set you may have a differing number of "wide" columns that means your code developed from the point of making the "wide" data set may not run because of referencing variables no longer there OR failing because it does not address added columns.

What will the next steps of your process be?

View solution in original post

ballardw · Posted 05-23-2018 01:33 PM

You were likely close but you didn't show your code so hard to tell:

Try something like this:

proc sort data=have;
   by id location date;
run;

data want;
  set have;
  by id location date;
  if first.location;
run;

For almost anything except a proc print report you do not want to use that wide format. You have to keep track of a bunch of variables that are not populated for each resulting record, if you do this process with a new data set you may have a differing number of "wide" columns that means your code developed from the point of making the "wide" data set may not run because of referencing variables no longer there OR failing because it does not address added columns.

What will the next steps of your process be?

Astounding · Posted 05-23-2018 02:41 PM

Notice that your preferred result only requires one Location field. Here's a way to get that:

proc transpose data=have prefix=Service_ out=want (drop=_name_);

by id date location;

var service;

run;

Keeping the first record for each date, per ID. But keeping them for separate locations.

Re: Keeping the first record for each date, per ID. But keeping them for separate locations.

Re: Keeping the first record for each date, per ID. But keeping them for separate locations.

Re: Keeping the first record for each date, per ID. But keeping them for separate locations.

Keeping the first record for each date, per ID. But keeping them for separate locations.

Re: Keeping the first record for each date, per ID. But keeping them for separate locations.

Re: Keeping the first record for each date, per ID. But keeping them for separate locations.

Re: Keeping the first record for each date, per ID. But keeping them for separate locations.

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away