BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
umeshgiri48
Obsidian | Level 7

I have a dataset in which I am trying to bring rows into columns from multiple rows eg:-

dateTimeAVD_IDdevice_NOdevice__No_idIDCheck_InCheck_outRouteStop_nameAmount
01/03/20195:06:00 PMB1910_EVD_0416645023442602803314916 :1022Canada Water bus station7.5
01/03/20195:38:00 PMB1910_EVD_0516645123443602803314916 :0122Tottenham Court Road station-4.5

 

now I want to create In_time, out_time on the basis of time and bring Amount, check_in and out in one row for eg:-

dateIn timeOut timeIDCheck_InCheck_outRouteIn Stop_nameOut Stop_nameIn AmountOut Amount
01/03/20195:06:00 PM5:38:00 PM1130602803314816 :1122Canada Water bus stationTottenham Court Road station7.5-4.5

 

I have tried using partition by, lag, row_no function but it not giving me the result.

Please help me on this.

1 ACCEPTED SOLUTION

Accepted Solutions
sustagens
Pyrite | Level 9

Try

data want (drop=AVD_ID device_NO device__No_id Route);
merge 
have (where=(Check_In=1) drop=Check_Out rename=(Time=In_Time Stop_Name=In_Stop_Name Amount=In_Amount)) 
have (where=(Check_Out=1) drop=Check_In rename=(Time=Out_Time Stop_Name=Out_Stop_Name Amount=Out_Amount))
; by Date ID; run;

View solution in original post

8 REPLIES 8
umeshgiri48
Obsidian | Level 7
A person travel card
Kurt_Bremser
Super User

I will make some assumptions:

  • a "trip" can be identified by id and route
  • it begins with a check_in and ends with a check_out
  • there are always two observations (1 check_in and 1 check_out) for a trip
  • check_in happens before check_out
  • there cannot be overlapping time spans for two trips for a single id on the same route

are these assumptions correct?

 

For working with sequences of data, the data step is better suited than SQL.

umeshgiri48
Obsidian | Level 7
yes, all assumptions are correct
Kurt_Bremser
Super User

Then I suggest this data step solution:

data have;
infile datalines dlm='09'x;
input
  date :mmddyy10.
  Time :time11.
  AVD_ID :$12.
  device_NO
  device__No_id
  ID :$12.
  Check_In
  Check_out
  Route
  Stop_name :$30.
  Amount
;
dt = dhms(date,0,0,time);
format dt e8601dt19.;
datalines;
01/03/2019	5:06:00 PM	B1910_EVD_04	166450	23442	602803314916:	1	0	22	Canada Water bus station	7.5
01/03/2019	5:38:00 PM	B1910_EVD_05	166451	23443	602803314916:	0	1	22	Tottenham Court Road station	-4.5
;

proc sort data=have;
by id dt;
run;

data trips;
retain /* for column order also */
  id
  in_time
  out_time
  check_in
  check_out
  in_stop_name
  out_stop_name
  in_amount
  out_amount
;
format
  in_time
  out_time e8601dt19.
;
set have (rename=(
  check_in=_check_in
  amount=out_amount
  dt=out_time
  stop_name=out_stop_name
));
by id route notsorted;
if first.route
then do;
  in_time = out_time;
  check_in = _check_in;
  in_stop_name = out_stop_name;
  in_amount = out_amount;
end;
if last.route;
keep
  id
  in_time
  out_time
  check_in
  check_out
  in_stop_name
  out_stop_name
  in_amount
  out_amount
;
run;

By using datetimes instead of separate dates and times, this code will also deal correctly with trips crossing the day boundary.

sustagens
Pyrite | Level 9

Try

data want (drop=AVD_ID device_NO device__No_id Route);
merge 
have (where=(Check_In=1) drop=Check_Out rename=(Time=In_Time Stop_Name=In_Stop_Name Amount=In_Amount)) 
have (where=(Check_Out=1) drop=Check_In rename=(Time=Out_Time Stop_Name=Out_Stop_Name Amount=Out_Amount))
; by Date ID; run;
umeshgiri48
Obsidian | Level 7
can we also do it by using proc sql
mkeintz
PROC Star

Yes, but if your data are already sorted by DATE/ID, the DATA step solution provided by @Kurt_Bremser will be MUCH faster, since unlike SQL, it won't be obligated to do a Cartesian comparison of all checkin ID's against all checkout ID's while attempting to find all matches.  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 969 views
  • 2 likes
  • 4 in conversation