I have a data like this:
ID | Application_No | Version | Action | Date |
---|---|---|---|---|
1 | ABCDEFX2 | 1 | Submit For Approval | 01 July 2014 |
2 | ABCDEFX2 | 1 | Return | 02 July 2014 |
3 | ABCDEFX2 | 2 | Submit For Approval | 01 July 2014 |
4 | ABCDEFX2 | 2 | Return | 02 July 2014 |
5 | ABCDEFX2 | 2 | Submit For Approval | 05 July 2014 |
6 | ABCDEFX2 | 2 | Approve | 06 July 2014 |
This table is a transaction trail. The system, on new version, will copy over the previous trail.
For example, on row no. 3 a new version of the same application is created. And Row1 & 2 is copied as Row3 & 4 and trail started over.
How do I delete row no 3 and 4?
Row 3 and 4 is copy of Row 1 and 2 respectively.
A friend suggested using SELECT DISTINCT but have to remove column VERSION.
Yes, it may then be a bit more tricky than a sort/nodupkey as you need to check previous versions. The following should do that:
data have;
length application_no action date $50;
infile datalines dlm=",";
input ID Application_No $ Version Action $ Date $;
datalines;
1,ABCDEFX2,1,Submit For Approval,01July2014
2,ABCDEFX2,1,Return,02July2014
3,ABCDEFX2,2,Submit For Approval,01July2014
4,ABCDEFX2,2,Return,02July2014
5,ABCDEFX2,2,Submit For Approval,05July2014
6,ABCDEFX2,2,Approve,06July2014
run;
proc sql;
create table WANT as
select BASE.*
from WORK.HAVE BASE
where not exists(select distinct THIS.APPLICATION_NO
from WORK.HAVE THIS
where THIS.APPLICATION_NO=BASE.APPLICATION_NO
and THIS.ACTION=BASE.ACTION
and THIS.DATE=BASE.DATE
and THIS.VERSION < BASE.VERSION);
quit;
The key I see is application_no, Action and Date.
So PROC SORT NODUPKEY would probable do the work.
The key is Application_No and Version.
Therefore I need to retain row 1 and 2 for version 1, and row 5 and 6 for version 2.
Yes, it may then be a bit more tricky than a sort/nodupkey as you need to check previous versions. The following should do that:
data have;
length application_no action date $50;
infile datalines dlm=",";
input ID Application_No $ Version Action $ Date $;
datalines;
1,ABCDEFX2,1,Submit For Approval,01July2014
2,ABCDEFX2,1,Return,02July2014
3,ABCDEFX2,2,Submit For Approval,01July2014
4,ABCDEFX2,2,Return,02July2014
5,ABCDEFX2,2,Submit For Approval,05July2014
6,ABCDEFX2,2,Approve,06July2014
run;
proc sql;
create table WANT as
select BASE.*
from WORK.HAVE BASE
where not exists(select distinct THIS.APPLICATION_NO
from WORK.HAVE THIS
where THIS.APPLICATION_NO=BASE.APPLICATION_NO
and THIS.ACTION=BASE.ACTION
and THIS.DATE=BASE.DATE
and THIS.VERSION < BASE.VERSION);
quit;
Thanks, this solution worked for me.
Hello,
Maybe this:
data have;
infile datalines truncover;
input ID Application_No $ Version Action & :$22. Date $12.;
datalines;
1 ABCDEFX2 1 Submit For Approval 01 July 2014
2 ABCDEFX2 1 Return 02 July 2014
3 ABCDEFX2 2 Submit For Approval 01 July 2014
4 ABCDEFX2 2 Return 02 July 2014
5 ABCDEFX2 2 Submit For Approval 05 July 2014
6 ABCDEFX2 2 Approve 06 July 2014
7 ABCDEFX2 3 Submit For Approval 05 July 2014
8 ABCDEFX2 3 Approve 06 July 2014
9 ABCDEFX2 3 Approve 05 July 2014
10 ABCDEFX2 3 Financed 06 July 2014
;
proc sort data=have;
by Application_No date Action version;
run;
data want;
set have;
by Application_No date Action ;
if first.action;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.