Hello, I have a dataset that has 4 variables:
OBS NAME BOARDING DESTINATION DATE
1 CHITRA DEHRADUN MUMBAI 04OCT2018
2 CHITRA MUMBAI HYDERABAD 01NOV2018
3 CHITRA HYDERABAD DELHI 30NOV2018
4 NANCY DELHI HYDERABAD 01OCT2018
5 NANCY HYDERABAD BANGALORE 02NOV2018
6 NANCY BANGALORE MUMBAI 30NOV2018
7 NANCY MUMBAI DELHI 31DEC2018
I want output to get split in two datasets(RETURN and NORETURN) .
RETURN dataset should have those observations whose Boarding and Destination stations are same like from above dataset Nancy started her journey from Delhi and returned to Delhi.
NAME BOARDING DESTINATION
NANCY DELHI DELHI
NORETURN dataset should have those observations whose Boarding and Destination stations are not same like from above dataset Chitra started her Journey from Dehradun but she ended up in Delhi.
NAME BOARDING DESTINATION
CHITRA DEHRADUN DELHI
I tried below code but not getting the output as I want:
DATA TEST;
INPUT NAME $ BOARDING $ DESTINATION $ DATE$;
CARDS;
NANCY DELHI HYDERABAD 01OCT2018
NANCY HYDERABAD BANGALORE 2NOV2018
NANCY BANGALORE MUMBAI 30NOV2018
NANCY MUMBAI DELHI 31DEC2018
CHITRA DEHRADUN MUMBAI 04OCT2018
CHITRA MUMBAI HYDERABAD 01NOV2018
CHITRA HYDERABAD DELHI 30NOV2018
RUN;
PROC SORT DATA = TEST; BY NAME;
DATA RETURN NORETURN;
SET TEST;
BY NAME;
IF FIRST.NAME THEN DO;
FIRST_BOARDING = BOARDING;
END;
IF LAST.NAME THEN DO;
LAST_DESTINATION = DESTINATION;
END;
IF FIRST_BOARDING = LAST_DESTINATION THEN OUTPUT RETURN;
ELSE OUTPUT NORETURN;
RUN;
Can anyone please suggest how can I get correct output?
Thank you.
Please avoid coding in uppercase, its like being shouted at. You could probably use hash table, but I would just go with simple two step approach, output last.record, merge that back on:
data last; set test; by name; if last.name then output; run; data return noreturn; merge test last (keep=name destination rename=(destination=dest)); by name; retain flag; flag=ifn(first.name and boarding=dest,1,0); if flag=1 then output; run;
Something like that (not tested).
You can do it with a single data step like this:
data return noreturn;
do until (last.name);
set test;
by name notsorted;
if first.name then
Start=Boarding;
end;
end=destination;
drop boarding destination;
rename start=boarding end=destination;
if start=end then
output return;
else
output noreturn;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.