- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello all,
I've been racking my brain on this data manipulation and it is intuitively easy, but I'm having trouble implementing it. I have something like the following table:
ID | Entry | Exit |
1 | 0 | 1 |
1 | 1 | 2 |
1 | 3 | 2 |
2 | 0 | 5 |
2 | 5 | 6 |
2 | 6 | 7 |
2 | 7 | 8 |
3 | 0 | 3 |
3 | 3 | 4 |
3 | 5 | 4 |
This is incorrect as Entry should always be LESS THAN Exit and the Exit value for the previous row should be the Entry value for the subsequent row. The following table is correct:
ID | Entry | Exit |
1 | 0 | 1 |
1 | 1 | 2 |
1 | 2 | 3 |
2 | 0 | 5 |
2 | 5 | 6 |
2 | 6 | 7 |
2 | 7 | 8 |
3 | 0 | 3 |
3 | 3 | 4 |
3 | 4 | 5 |
I've tried sorting the data by ascending ID and descending entry and exit. Afterwards, I used the lag function. However, I was not getting the bottom table. Any thoughts?
Thank you!
John
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Very interesting question .
data have; input ID Entry Exit ; cards; 1 0 1 1 1 3 1 3 2 2 0 5 2 5 6 2 6 7 2 7 8 3 0 3 3 3 5 3 5 4 ; run; data temp(drop=entry exit); set have; value=entry;output; value=exit;output; run; proc sort data=temp nodupkey;by id value;run; data x; set temp; by id; if not first.id and not last.id then output; output; run; data x; set x; if mod(_n_,2)=1 then do;name='entry'; n+1;end; else name='exit'; run; proc transpose data=x out=want(drop=_: n); by id n; id name; var value; run;
Xia Keshan
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
input ID Entry Exit;
datalines;
1 0 1
1 1 2
1 3 2
2 0 5
2 5 6
2 6 7
2 7 8
3 0 3
3 3 4
3 5 4
;
data want (drop=entry exit rename=(new_entry=entry new_exit=exit));
set have;
if entry>exit then new_entry=coalesce(exit,entry);
else new_entry=entry;
if entry>exit then new_exit=coalesce(entry,exit);
else new_exit=exit;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for the quick reply. I made a mistake in my "incorrect" table. It should be:
ID | Entry | Exit |
1 | 0 | 1 |
1 | 1 | 3 |
1 | 3 | 2 |
2 | 0 | 5 |
2 | 5 | 6 |
2 | 6 | 7 |
2 | 7 | 8 |
3 | 0 | 3 |
3 | 3 | 5 |
3 | 5 | 4 |
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data want(drop=entry exit rename=(new_entry=entry cnt=exit));
set have;
by id;
if entry>exit then new_entry=coalesce(exit,entry);
else new_entry=entry;
if first.id then cnt=exit;
else cnt+1;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for your help thus far. However, my data is a little more general than what I presented. In other words, the values are not all integers that increment by 1. They could have fractions after them. Is there a way to swap two values within a column, in the same way coalesce "swaps" values?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
What about something like:
proc sql;
create table WANT as
select ID,
MIN(ENTRY,EXIT) as ENTRY,
MAX(ENTRY,EXIT) as EXIT
from HAVE;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
For ID 1 what is "correct" .
If we simply sort ENTRY,EXIT then lag(exit) does not equal entry for obs 2 and 3.
How do you get from here?
ID | Entry | Exit |
1 | 0 | 1 |
1 | 1 | 3 |
1 | 3 | 2 |
to what I guess it still "correct" from your first post?
ID | Entry | Exit |
1 | 0 | 1 |
1 | 1 | 2 |
1 | 2 | 3 |
Do you sort columns first or check lag(exit) eq entry?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yes for ID = 1 the bottom table is correct. To be more specific, here is an actual output of my table:
ID | Entry | Exit |
498 | 0.00000 | 0.12320 |
498 | 0.12320 | 0.25462 |
498 | 0.25462 | 1.78234 |
498 | 1.78234 | 3.20602 |
498 | 3.20602 | 3.15127 |
500 | 0.00000 | 2.00684 |
500 | 2.00684 | 2.16838 |
500 | 2.16838 | 2.19849 |
500 | 2.19849 | 3.04175 |
500 | 3.04175 | 2.54346 |
You'll notice that for both, ID 498 and 500, the second to last exit values should be flipped with the last exit value. Also for the last row of each ID, the entry and exit should be flipped.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Very interesting question .
data have; input ID Entry Exit ; cards; 1 0 1 1 1 3 1 3 2 2 0 5 2 5 6 2 6 7 2 7 8 3 0 3 3 3 5 3 5 4 ; run; data temp(drop=entry exit); set have; value=entry;output; value=exit;output; run; proc sort data=temp nodupkey;by id value;run; data x; set temp; by id; if not first.id and not last.id then output; output; run; data x; set x; if mod(_n_,2)=1 then do;name='entry'; n+1;end; else name='exit'; run; proc transpose data=x out=want(drop=_: n); by id n; id name; var value; run;
Xia Keshan
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you very much!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Do you want to keep the original Entry value, because it looks like you dont need it, entry starts at zero and is then the previous exit value.
So you could just ignore it:
data have;;
input id entry exit;
cards;
498 0.00000 0.12320
498 0.12320 0.25462
498 0.25462 1.78234
498 1.78234 3.20602
498 3.20602 3.15127
500 0.00000 2.00684
500 2.00684 2.16838
500 2.16838 2.19849
500 2.19849 3.04175
500 3.04175 2.54346
;
run;
proc sort data=have;
by id exit;
run;
data want;
set have(drop=entry);
by id;
retain entry;
if first.id then
entry = 0;
output;
entry = exit;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I would second @MichaelPearce, and using lag() to make it even less verbose:
data want;
set have (drop=entry);
by id;
Entry=ifn(first.id,0,lag(exit));
run;
Haikuo