I have a dataset in which I am trying to bring rows into columns from multiple rows eg:-
date | Time | AVD_ID | device_NO | device__No_id | ID | Check_In | Check_out | Route | Stop_name | Amount |
01/03/2019 | 5:06:00 PM | B1910_EVD_04 | 166450 | 23442 | 602803314916 : | 1 | 0 | 22 | Canada Water bus station | 7.5 |
01/03/2019 | 5:38:00 PM | B1910_EVD_05 | 166451 | 23443 | 602803314916 : | 0 | 1 | 22 | Tottenham Court Road station | -4.5 |
now I want to create In_time, out_time on the basis of time and bring Amount, check_in and out in one row for eg:-
date | In time | Out time | ID | Check_In | Check_out | Route | In Stop_name | Out Stop_name | In Amount | Out Amount |
01/03/2019 | 5:06:00 PM | 5:38:00 PM | 1130602803314816 : | 1 | 1 | 22 | Canada Water bus station | Tottenham Court Road station | 7.5 | -4.5 |
I have tried using partition by, lag, row_no function but it not giving me the result.
Please help me on this.
Try
data want (drop=AVD_ID device_NO device__No_id Route);
merge
have (where=(Check_In=1) drop=Check_Out rename=(Time=In_Time Stop_Name=In_Stop_Name Amount=In_Amount))
have (where=(Check_Out=1) drop=Check_In rename=(Time=Out_Time Stop_Name=Out_Stop_Name Amount=Out_Amount))
;
by Date ID;
run;
What is identified by ID? A single trip, or something like a person or travel card?
I will make some assumptions:
are these assumptions correct?
For working with sequences of data, the data step is better suited than SQL.
Then I suggest this data step solution:
data have;
infile datalines dlm='09'x;
input
date :mmddyy10.
Time :time11.
AVD_ID :$12.
device_NO
device__No_id
ID :$12.
Check_In
Check_out
Route
Stop_name :$30.
Amount
;
dt = dhms(date,0,0,time);
format dt e8601dt19.;
datalines;
01/03/2019 5:06:00 PM B1910_EVD_04 166450 23442 602803314916: 1 0 22 Canada Water bus station 7.5
01/03/2019 5:38:00 PM B1910_EVD_05 166451 23443 602803314916: 0 1 22 Tottenham Court Road station -4.5
;
proc sort data=have;
by id dt;
run;
data trips;
retain /* for column order also */
id
in_time
out_time
check_in
check_out
in_stop_name
out_stop_name
in_amount
out_amount
;
format
in_time
out_time e8601dt19.
;
set have (rename=(
check_in=_check_in
amount=out_amount
dt=out_time
stop_name=out_stop_name
));
by id route notsorted;
if first.route
then do;
in_time = out_time;
check_in = _check_in;
in_stop_name = out_stop_name;
in_amount = out_amount;
end;
if last.route;
keep
id
in_time
out_time
check_in
check_out
in_stop_name
out_stop_name
in_amount
out_amount
;
run;
By using datetimes instead of separate dates and times, this code will also deal correctly with trips crossing the day boundary.
Try
data want (drop=AVD_ID device_NO device__No_id Route);
merge
have (where=(Check_In=1) drop=Check_Out rename=(Time=In_Time Stop_Name=In_Stop_Name Amount=In_Amount))
have (where=(Check_Out=1) drop=Check_In rename=(Time=Out_Time Stop_Name=Out_Stop_Name Amount=Out_Amount))
;
by Date ID;
run;
Yes, but if your data are already sorted by DATE/ID, the DATA step solution provided by @Kurt_Bremser will be MUCH faster, since unlike SQL, it won't be obligated to do a Cartesian comparison of all checkin ID's against all checkout ID's while attempting to find all matches.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.