About pulpfiction123

pulpfiction123 · 12-09-2023

Hi Cristina, thank you for the response and the suggestion. The source dataset (before I transformed it) does have only one ID column, with rows keyed by different ID values plus transaction dates, amounts, etc., as you would expect. I did try this approach earlier, using the first.ID and last.ID method and looping through each transaction to compare them. However, my issue is that each ID has a different number of transactions, and I have to compare each transaction to all the preceding transactions. E.g., if ID1 has 10 transactions, I need to compare transaction 1 to transactions 2-10, then transaction 2 to transactions 3-10, and so forth. I could not find a way to make the LAG function handle this dynamically within each loop. My other attempt was to loop through all the transactions regardless of ID and skip every comparison where the IDs differ, but this took too long to run. Hope this clarifies things, and happy to hear if there are better ways of going about this!
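
For illustration, here is a minimal sketch of how the count could be kept on the untransposed data in a single pass, without LAG. It assumes a long dataset with one row per transaction; the names long, ID, AMOUNT, n_seen and n_before, and the ID values ID1/ID2, are hypothetical, and the sample rows are taken from the data posted in the thread. The idea is that, once the data is sorted by ID and date, the number of preceding records with a different date is simply the number of records for that ID seen before the current date started.

```sas
/* Hypothetical long-format input: one row per transaction */
data long;
  input ID $ DATE2 :ddmmyy10. AMOUNT;
  format DATE2 ddmmyy10.;
  datalines;
ID1 01/03/2019 630
ID1 01/04/2019 90
ID1 01/05/2019 2
ID1 01/05/2019 112
ID2 01/01/2023 20
ID2 01/01/2023 100
ID2 01/01/2023 2
ID2 01/01/2023 10
ID2 01/01/2023 420
ID2 01/02/2023 98
;
run;

proc sort data=long;
  by ID DATE2;
run;

data want_long;
  set long;
  by ID DATE2;
  retain n_seen n_before;
  /* Reset the running totals at the start of each ID */
  if first.ID then do;
    n_seen = 0;
    n_before = 0;
  end;
  /* When a new date starts, every record seen so far for this ID
     has an earlier (different) date */
  if first.DATE2 then n_before = n_seen;
  Counter = n_before;
  n_seen = n_seen + 1;
  drop n_seen n_before;
run;
```

Because each row is visited once, the run time grows with the number of rows rather than with the number of pairs of transactions, so an ID with many transactions costs no more per row than an ID with few.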

pulpfiction123 · 12-09-2023

Hi Tom, thanks for your questions. Happy to clarify:

1) The data I provided is not the source data. The source data contains only one ID column with different ID values, as you might expect. Each ID can have multiple transaction records with different dates and transaction values. The dataset I provided in the post is what I got after transposing the data by Date, turning each ID into its own column (ID1, ID2), with the values of those columns being the transaction values. The 'Counter' columns are all missing initially as I have just created them; there is one column per ID. The intention is to add +1 to each ID's 'Counter' column for each row based on the number of preceding records that have a different date. E.g., if person 1 has 10 transactions, I'll need to compare transaction 1 to transactions 2-10, transaction 2 to transactions 3-10 and so on, and add the number to that ID's counter column at transaction level.

2) The data was transposed because it was too difficult to loop through each ID and count how many transactions matched the criteria, as each ID can have a different number of transactions. My initial solution had not finished after an hour on the full dataset (which contains a lot more IDs and records), whereas after transposing it ran in little time (albeit with inaccurate results).

3) As the data is transposed by date, the values for the first and second ID variables are the transaction values for each ID, and the amount is not as relevant as whether it is missing or not. If the value is populated, it means that this ID has a transaction of this amount on this date. If it is missing, then this ID does not have a transaction on this date. In the 2 sample cases I provided, ID1 and ID2 do not have any transactions on the same day, hence only one of the two columns is populated for each observation. ID1 has 4 transactions in total: one on 1st March, one on 1st April and 2 on 1st May. ID2 has 6 transactions in total: 5 on the same day, and 1 on 1st Feb.

Hence, in the expected results, ID1's Counter column has a value of 2 for the third record, as there are 2 preceding transactions (the dataset is sorted by date) for ID1 that have a different date. The fourth record should have a value of 2 as well, as the 3rd and 4th records have the same date and thus do not count against each other. Similarly, for ID2's counter column, the 5 at the end is because ID2 has 5 preceding transactions with a different date, and the previous rows are missing in the counter column because they have no preceding transactions with a different date. As to why the values are missing instead of 0, it's an oversight on my part: I should initialize the counter columns to 0. Hope that clarifies it, and apologies if the initial post was not very clear!

pulpfiction123 · 12-08-2023

Hello everyone, I have a transposed dataset with the IDs as columns, with each ID having an additional 'counter' column, and a Date column. Here's a sample of my dataset:

Data have;
  Input DATE2 :ddmmyy10. ID_1 ID_2 Counter_ID_1 Counter_ID_2;
  Format DATE2 ddmmyy10.;
  datalines;
01/03/2019 630 . . .
01/04/2019 90 . . .
01/05/2019 2 . . .
01/05/2019 112 . . .
01/01/2023 . 20 . .
01/01/2023 . 100 . .
01/01/2023 . 2 . .
01/01/2023 . 10 . .
01/01/2023 . 420 . .
01/02/2023 . 98 . .
;
Run;

I want to loop through every row: for each record for every ID, add +1 to its associated counter column for each previous record (sorted by date) that has a different date compared to it. E.g., if a record has 2 preceding records for the same ID that have a different date, and 1 with the same date, the value of the counter for that record should be 2. Here's what it should look like:

Data want;
  Input DATE2 :ddmmyy10. ID_1 ID_2 Counter_ID_1 Counter_ID_2;
  Format DATE2 ddmmyy10.;
  datalines;
01/03/2019 630 . . .
01/04/2019 90 . 1 .
01/05/2019 2 . 2 .
01/05/2019 112 . 2 .
01/01/2023 . 20 . .
01/01/2023 . 100 . .
01/01/2023 . 2 . .
01/01/2023 . 10 . .
01/01/2023 . 420 . .
01/02/2023 . 98 . 5
;
Run;

I have the below code, which creates 2 arrays of the same dimensions (1 storing each ID, and 1 storing the associated counter columns). My code is supposed to loop through each ID and compare each row to all the previous rows using the lag function, adding +1 to the counter column if it's the same ID and a different date. However, the result is not entirely accurate: I am getting wrong results in cases where there are preceding records with both the same and different dates. Below is a sample of my code:

Data _NULL_;
  Call symput ('nrows', 10);
Run;

data test;
  set have;
  array IDS{*} ID_:;
  array IDS2{*} Counter_:;
  length tracker $100;
  %macro mymacro;
  Do i=1 to dim(IDS);
    %do x=1 %to &nrows.;
      /*Only getting observations for the same ID*/
      If IDS(i) ne . AND lag&x.(IDS(i)) ne . AND DATE2 ne lag&x.(DATE2) Then do;
        IDS2(i)+1;
      End;
    %end;
  End;
  %mend;
  %mymacro;
  Drop i;
Run;

Can anyone tell me where I'm going wrong, or suggest an alternative way to do this? Thanks!
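
One possible contributor to the inaccurate results, offered as a hedged observation rather than a diagnosis: each LAGn call in a data step keeps its own queue, and that queue is updated every time the call executes, not once per row. Inside a DO loop over the array, the same call runs once per ID column on every row, so the value it returns may come from a different column instead of from x rows back. The sketch below avoids LAG altogether by keeping per-ID running totals in temporary arrays. It assumes the have dataset shown above is sorted by date; the array names CNT, n_seen, n_before and last_date are hypothetical, and the bound of 1000 is an assumed upper limit on the number of ID columns.

```sas
data want;
  set have;
  array IDS{*} ID_:;
  array CNT{*} Counter_:;
  /* Per-ID running state. _temporary_ arrays are retained across rows.
     1000 is an assumed upper bound on the number of ID columns. */
  array n_seen{1000}    _temporary_ (1000*0);  /* records seen so far          */
  array n_before{1000}  _temporary_ (1000*0);  /* records seen before this date */
  array last_date{1000} _temporary_;           /* date of the latest record    */
  do i = 1 to dim(IDS);
    if IDS{i} ne . then do;
      /* A new date for this ID: everything seen so far has a different date */
      if DATE2 ne last_date{i} then do;
        n_before{i} = n_seen{i};
        last_date{i} = DATE2;
      end;
      CNT{i} = n_before{i};
      n_seen{i} = n_seen{i} + 1;
    end;
  end;
  drop i;
run;
```

On the sample data this should give 0, 1, 2, 2 for Counter_ID_1 on the four ID_1 rows, and 0 on the five 01/01/2023 rows followed by 5 on the last row for Counter_ID_2. Populated rows get 0 instead of missing when there is nothing to count, while rows where the ID has no transaction keep a missing counter.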

Loop through all rows and create a counter based on conditions