BookmarkSubscribeRSS Feed
Divya8
Calcite | Level 5
I need to consider first occurrence of 7400 in a group and the last occurrence of 7470/4320. I have altered the code as below

Data want;
set have;
by Group transaction notsorted;
where transaction in (7400,7470,4320);

if (last.transaction and transaction=7400 and not last.group) or (first.transaction and transaction=7470 or transaction = 4320 and not first.group);

The code is working in for case A,B,C, E but not D

Input
Group sequence transaction
A 100 1200 jan10
A 101 7400 jan10
A. 102 1000 jan10
A 103 7470 jan10
B 201 7400 jan10
B 202 4044 jan10
B. 203. 4600 jan10
B. 204. 7470 jan10
B. 205. 7400 jan10
B. 206. 7470 jan10
C. 301. 7400 jan10
C. 302. 1000 jan10
C. 303. 7400 jan10
C. 304. 7470 jan10
D 500 4060 jan10
D 401 7400 jan10
D 403 4320 jan10
D 404 7400 jan10
D 405 7415 jan10
D 406 7420 jan10
D 407 7400 jan10
D 408 7470 jan10
E 500 7400 jan10
E 501 1000 jan10
E 502 4320 jan10
E 503 7400 jan11
E 504 7470 jan11


Output should look like
Group sequence transaction
A. 101. 7400 jan10
A. 103. 7470 jan10
B. 201. 7400 jan10
B. 204. 7470 jan10
B. 205. 7400 jan10
B. 206. 7470 jan10
C. 303. 7400 jan10
C. 304. 7470 jan10
D 401 7400 jan10
D 408 7470 jan10
E 500 7400 jan10
E 502 4320 jan10
E 503 7400 jan11
E 504 7470 jan11
8 REPLIES 8
ChrisNZ
Tourmaline | Level 20

1. From now on, please post you data as a data step (and run the pasted code to ensure it is valid) like this:

data HAVE;
 input GROUP $ SEQUENCE $ TRANSACTION DT $ ;
cards;
A 100 1200 jan10
A 101 7400 jan10
A 102 1000 jan10
A 103 7470 jan10
B 201 7400 jan10
B 202 4044 jan10
B 203 4600 jan10
B 204 7470 jan10
B 205 7400 jan10
B 206 7470 jan10
C 301 7400 jan10
C 302 1000 jan10
C 303 7400 jan10
C 304 7470 jan10
D 500 4060 jan10
D 401 7400 jan10
D 403 4320 jan10
D 404 7400 jan10
D 405 7415 jan10
D 406 7420 jan10
D 407 7400 jan10
D 408 7470 jan10
E 500 7400 jan10
E 501 1000 jan10
E 502 4320 jan10
E 503 7400 jan11
E 504 7470 jan11
run;

2. Your Boolean clause in the IF test is unclear because of the OR operator. Do you mean this?

data WANT;
  set HAVE;
  by GROUP TRANSACTION notsorted;
  where TRANSACTION in (7400,7470,4320);
  if (last.TRANSACTION and TRANSACTION=7400 and not last.GROUP) 
  or (first.TRANSACTION and TRANSACTION in(7470,4320) and not first.GROUP);
run;

3. The test validates the values

D 407 7400 jan10

which you state you do not want.

Why should these values be rejected? In other words, what is the condition to reject these values?

 

 

 

 

Divya8
Calcite | Level 5
2. The requirement is to get the first occurrence of 7400 transaction in a group and the last occurrence of the 7479/4320 on the same date.

3. As per the above requirement I need to consider only the below values for D group.

D 401 7400 jan10
D 408 7470 jan10

D 500 4060 jan10
D 401 7400 jan10
D 403 4320 jan10
D 404 7400 jan10
D 405 7415 jan10
D 406 7420 jan10
D 407 7400 jan10
D 408 7470 jan10
Divya8
Calcite | Level 5
I have restrictions to copy paste the code. It’s was a typo error. Yes I mean the same logic that you have mentioned in the point 2
Kurt_Bremser
Super User

Your expected output does not match your logic. Per your logic, you can only have a maximum of two observations per group.

Code according to your logic:

data have;
input group :$1. sequence transaction;
datalines;
A 100 1200
A 101 7400
A 102 1000
A 103 7470
B 201 7400
B 202 4044
B 203 4600
B 204 7470
B 205 7400
B 206 7470
C 301 7400
C 302 1000
C 303 7400
C 304 7470
D 500 4060
D 401 7400
D 403 4320
D 404 7400
D 405 7415
D 406 7420
D 407 7400
D 408 7470
E 500 7400
E 501 1000
E 502 4320
E 503 7400
E 504 7470
;

data want;
firstflag = 0;
lastcount = 0;
counter = 0;
do until (last.group);
  set have;
  by group;
  counter + 1;
  if not firstflag and transaction = '7400'
  then do;
    output;
    firstflag = 1;
  end;
  if transaction in (7470,4320) then lastcount = counter;
end;
counter = 0;
do until (last.group);
  set have;
  by group;
  counter + 1;
  if counter = lastcount then output;
end;
drop firstflag lastcount counter;
run;

proc print data=want noobs;
run;

Result:

group    sequence    transaction

  A         101          7400   
  A         103          7470   
  B         201          7400   
  B         206          7470   
  C         301          7400   
  C         304          7470   
  D         401          7400   
  D         408          7470   
  E         500          7400   
  E         504          7470   
Divya8
Calcite | Level 5
Thank you for sharing the code. I am getting the expected result for Group A, B,C and D now. For group E, expecting below result. I need to consider transaction date also. Group E having jan10 and jan11 as transaction date. Based on the date also the 7400 and 4320/7470 transaction matching should happen

E 500 7400 jan10
E 502 4320 jan10
E 503 7400 jan11
E 504 7470 jan11
Kurt_Bremser
Super User

So the transaction date (which was not recognized as its own variable in your initial post, where the data step read only three columns) does play a role.

This only means we have to use two by variables in the crucial step:

data have;
input group :$1. sequence transaction transaction_date yymmdd10.;
format transaction_date yymmddd10.;
datalines;
A 100 1200 2019-01-10
A 101 7400 2019-01-10
A 102 1000 2019-01-10
A 103 7470 2019-01-10
B 201 7400 2019-01-10
B 202 4044 2019-01-10
B 203 4600 2019-01-10
B 204 7470 2019-01-10
B 205 7400 2019-01-10
B 206 7470 2019-01-10
C 301 7400 2019-01-10
C 302 1000 2019-01-10
C 303 7400 2019-01-10
C 304 7470 2019-01-10
D 500 4060 2019-01-10
D 401 7400 2019-01-10
D 403 4320 2019-01-10
D 404 7400 2019-01-10
D 405 7415 2019-01-10
D 406 7420 2019-01-10
D 407 7400 2019-01-10
D 408 7470 2019-01-10
E 500 7400 2019-01-10
E 501 1000 2019-01-10
E 502 4320 2019-01-10
E 503 7400 2019-01-11
E 504 7470 2019-01-11
;

data want;
firstflag = 0;
lastcount = 0;
counter = 0;
do until (last.transaction_date);
  set have;
  by group transaction_date;
  counter + 1;
  if not firstflag and transaction = '7400'
  then do;
    output;
    firstflag = 1;
  end;
  if transaction in (7470,4320) then lastcount = counter;
end;
counter = 0;
do until (last.transaction_date);
  set have;
  by group transaction_date;
  counter + 1;
  if counter = lastcount then output;
end;
drop firstflag lastcount counter;
run;

proc print data=want noobs;
run;

Result:

                                    transaction_
group    sequence    transaction        date

  A         101          7400        2019-01-10 
  A         103          7470        2019-01-10 
  B         201          7400        2019-01-10 
  B         206          7470        2019-01-10 
  C         301          7400        2019-01-10 
  C         304          7470        2019-01-10 
  D         401          7400        2019-01-10 
  D         408          7470        2019-01-10 
  E         500          7400        2019-01-10 
  E         502          4320        2019-01-10 
  E         503          7400        2019-01-11 
  E         504          7470        2019-01-11 

 

Divya8
Calcite | Level 5
Thank you so much. The code is working😀😀 I ran a testing with actual data that is given in the project. I found one more scenario where group not having matching 7470/4320 transaction for 7400 transaction. I should not consider this group. For eg
Group sequence Transaction date
F 701 7400 2019-01-10
F 702 1000 2019-01-10
F 703. 3240 2019-01-10
Divya8
Calcite | Level 5
I tried the below option. After extracting the data using the above suggested method, I used the matching check the one I posted it initially. This gives the result

data WANT;
set HAVE;
by GROUP TRANSACTION notsorted;
where TRANSACTION in (7400,7470,4320);
if (last.TRANSACTION and TRANSACTION=7400 and not last.GROUP)
or (first.TRANSACTION and TRANSACTION in(7470,4320) and not first.GROUP);
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 823 views
  • 0 likes
  • 3 in conversation