BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
umeshgiri48
Obsidian | Level 7

I have a dataset in which I am trying to bring rows into columns from multiple rows eg:-

dateTimeAVD_IDdevice_NOdevice__No_idIDCheck_InCheck_outRouteStop_nameAmount
01/03/20195:06:00 PMB1910_EVD_0416645023442602803314916 :1022Canada Water bus station7.5
01/03/20195:38:00 PMB1910_EVD_0516645123443602803314916 :0122Tottenham Court Road station-4.5

 

now I want to create In_time, out_time on the basis of time and bring Amount, check_in and out in one row for eg:-

dateIn timeOut timeIDCheck_InCheck_outRouteIn Stop_nameOut Stop_nameIn AmountOut Amount
01/03/20195:06:00 PM5:38:00 PM1130602803314816 :1122Canada Water bus stationTottenham Court Road station7.5-4.5

 

I have tried using partition by, lag, row_no function but it not giving me the result.

Please help me on this.

1 ACCEPTED SOLUTION

Accepted Solutions
sustagens
Pyrite | Level 9

Try

data want (drop=AVD_ID device_NO device__No_id Route);
merge 
have (where=(Check_In=1) drop=Check_Out rename=(Time=In_Time Stop_Name=In_Stop_Name Amount=In_Amount)) 
have (where=(Check_Out=1) drop=Check_In rename=(Time=Out_Time Stop_Name=Out_Stop_Name Amount=Out_Amount))
; by Date ID; run;

View solution in original post

8 REPLIES 8
umeshgiri48
Obsidian | Level 7
A person travel card
Kurt_Bremser
Super User

I will make some assumptions:

  • a "trip" can be identified by id and route
  • it begins with a check_in and ends with a check_out
  • there are always two observations (1 check_in and 1 check_out) for a trip
  • check_in happens before check_out
  • there cannot be overlapping time spans for two trips for a single id on the same route

are these assumptions correct?

 

For working with sequences of data, the data step is better suited than SQL.

umeshgiri48
Obsidian | Level 7
yes, all assumptions are correct
Kurt_Bremser
Super User

Then I suggest this data step solution:

data have;
infile datalines dlm='09'x;
input
  date :mmddyy10.
  Time :time11.
  AVD_ID :$12.
  device_NO
  device__No_id
  ID :$12.
  Check_In
  Check_out
  Route
  Stop_name :$30.
  Amount
;
dt = dhms(date,0,0,time);
format dt e8601dt19.;
datalines;
01/03/2019	5:06:00 PM	B1910_EVD_04	166450	23442	602803314916:	1	0	22	Canada Water bus station	7.5
01/03/2019	5:38:00 PM	B1910_EVD_05	166451	23443	602803314916:	0	1	22	Tottenham Court Road station	-4.5
;

proc sort data=have;
by id dt;
run;

data trips;
retain /* for column order also */
  id
  in_time
  out_time
  check_in
  check_out
  in_stop_name
  out_stop_name
  in_amount
  out_amount
;
format
  in_time
  out_time e8601dt19.
;
set have (rename=(
  check_in=_check_in
  amount=out_amount
  dt=out_time
  stop_name=out_stop_name
));
by id route notsorted;
if first.route
then do;
  in_time = out_time;
  check_in = _check_in;
  in_stop_name = out_stop_name;
  in_amount = out_amount;
end;
if last.route;
keep
  id
  in_time
  out_time
  check_in
  check_out
  in_stop_name
  out_stop_name
  in_amount
  out_amount
;
run;

By using datetimes instead of separate dates and times, this code will also deal correctly with trips crossing the day boundary.

sustagens
Pyrite | Level 9

Try

data want (drop=AVD_ID device_NO device__No_id Route);
merge 
have (where=(Check_In=1) drop=Check_Out rename=(Time=In_Time Stop_Name=In_Stop_Name Amount=In_Amount)) 
have (where=(Check_Out=1) drop=Check_In rename=(Time=Out_Time Stop_Name=Out_Stop_Name Amount=Out_Amount))
; by Date ID; run;
umeshgiri48
Obsidian | Level 7
can we also do it by using proc sql
mkeintz
PROC Star

Yes, but if your data are already sorted by DATE/ID, the DATA step solution provided by @Kurt_Bremser will be MUCH faster, since unlike SQL, it won't be obligated to do a Cartesian comparison of all checkin ID's against all checkout ID's while attempting to find all matches.  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1384 views
  • 2 likes
  • 4 in conversation