data have;
infile cards expandtabs;
input group id$ date$ time$ class;
cards;
0 A 20/7/2014 10:00:00 1
0 L 20/7/2014 10:30:12 2
1 R 20/7/2014 11:12:05 3
1 A 20/7/2014 11:15:07 1
1 L 20/7/2014 11:50:30 2
2 F 20/7/2014 12:02:06 1
2 L 20/7/2014 12:04:50 1
3 A 20/7/2014 13:41:19 1
3 B 20/7/2014 13:41:19 3
3 C 20/7/2014 13:41:19 2
4 B 20/7/2014 14:02:05 3
;
run;
Hi, i would like to select the ID that appears in two groups consequently with respect to the class as well.
look at ID if groups>= 2 and the class = class then
calculate the time duration.
The expected output.
id | date | duration | class |
A | 20/7/2014 | 1:15:07 | 1 |
L | 20/7/2014 | 1:20:18 | 2 |
B | 20/7/2014 | 0:20:46 | 3 |
Thank you.
Not sure I really follow you, just sort the data:
proc sort data=have out=want; by id group; run; data want; set want; if group > 0 and id=lag(id) then do; duration=time-lag(time); output; end; run;
Something like that, can't test as you have not provided test data in the form of a datastep!
@ihlayyel wrote:
Iam so sorry for that. I have modified the data
Do so again. You should know by now what a data step is and how data step code looks. The intention is that with a simple copy/paste and submit of your code the dataset is created. No ERRORs, no WARNINGs.
PS in response to this:
group id date time class
0 A 20/7/2014 10:00:00 1
0 L 20/7/2014 10:30:12 2
1 R 20/7/2014 11:12:05 3
1 A 20/7/2014 11:15:07 1
1 L 20/7/2014 11:50:30 2
2 F 20/7/2014 12:02:06 1
2 L 20/7/2014 12:04:50 1
3 A 20/7/2014 13:41:19 1
3 B 20/7/2014 13:41:19 3
3 C 20/7/2014 13:41:19 2
4 B 20/7/2014 14:02:05 3
which is not SAS code at all.
When ID='A' has more than 2 observations, do you want duration to be the time difference between the current A and the most recent prior A? If so, then you want what I would term a "dynamic lag" - ("dynamic" because the two A records are separated by a non-constant number of intermediate records).
Assuming your dataset is sorted by TIME (it doesn't matter whether it is sorted by ID):
data want (drop=_:);
set have;
format _last_time duration time8.0;
if _n_=1 then do;
declare hash h();
h.definekey('id');
h.definedata('_last_time');
h.definedone();
end;
_rc=h.find();
if _rc=0 then duration= time-_last_time ;
_rc=h.replace(key:id,data:time);
run;
Now, if DATE matters also (i.e. DURATION can cross date boundaries), then you'll have to add a variable called _LAST_DATE to the format statement, the DEFINEDATA method, and to the REPLACE method. You'll also have to calculate
DURATION=86400*(date-_last_date) + time - _last_time;
And you'll also probably want to change the format of DURATION to, say TIME9.0 (accommodates up to 999 hours, 59 mins, 59 secs) or HHMM6.0 (up to 999 hours, 59 mins).
This program depends on creating a hash object H, which allows values to be stored keyed on ID, retrieved, and persists across observations.
thank you so much. That is awesome. but there is one thing.
the code keeps considering the last time value. if the ID appears in a different group it will measure the time of new ID with the last time.
the output of your code.
group id date time duraion
1 A 20/07/2014 15:07.0 1 1:15:07
1 L 20/07/2014 50:30.0 2 1:20:18
2 L 20/07/2014 04:50.0 1 0:14:20
3 A 20/07/2014 41:19.0 1 2:26:12
4 B 20/07/2014 02:05.0 3 0:20:46
How to say that if another ID is found start match again. and the number of the count of groups is >0
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.