Hello,
This should be easy but I can't get my head around it. I have the below data (first 2 columns) and I need the 3rd column. Essentially, I need to group the TMP values and increase the counter each time TMP becomes 1 for each account, and then do the same for all accounts.
Account | TMP | Col_Want |
12345 | 1 | 1 |
12345 | 2 | 1 |
12345 | 3 | 1 |
12345 | 4 | 1 |
12345 | 1 | 2 |
12345 | 2 | 2 |
12345 | 1 | 3 |
56789 | 1 | 1 |
56789 | 2 | 1 |
56789 | 3 | 1 |
56789 | 4 | 1 |
56789 | 5 | 1 |
56789 | 1 | 2 |
data file1;
input Account TMP;
datalines;
12345 1
12345 2
12345 3
12345 4
12345 1
12345 2
12345 1
56789 1
56789 2
56789 3
56789 4
56789 5
56789 1
;
run;
Thanks!
data want;
set file1;
by account;
prev_tmp=lag(tmp);
if first.account then col_want=1;
if prev_tmp>tmp then col_want+1;
drop prev_tmp;
run;
data want;
set file1;
by account;
prev_tmp=lag(tmp);
if first.account then col_want=1;
if prev_tmp>tmp then col_want+1;
drop prev_tmp;
run;
Thank you so much PaigeMiller! This is amazing! I spent so many hours on this 😛
I had to modify your code slightly as it didn't work with more accounts. For example, if the data I had provided was:
data file1;
input Account TMP;
datalines;
12345 1
12345 2
12345 3
12345 4
12345 1
12345 2
12345 1
56789 1
56789 2
56789 3
56789 4
56789 5
56789 1
12367 1
23456 1
23456 2
;
run;
proc sort data=file1;
by account;run;
your code would give 2 as the first value for the last group. So I amended it to do this:
data want;
set file1;
by account;
prev_tmp=lag(tmp);
if account~=lag(account) then prev_tmp=tmp;
if first.account then col_want=1;
if prev_tmp>tmp then col_want+1;
drop prev_tmp;
run;
You've helped me so much! Thanks again!
Hello @Stats10 It's cool that your amended code works. I really wonder whether you even need a lag. Well up to you.
Hi @Stats10
Using sum statement want+1
data have;
input Account TMP ;*Col_Want;
cards;
12345 1 1
12345 2 1
12345 3 1
12345 4 1
12345 1 2
12345 2 2
12345 1 3
56789 1 1
56789 2 1
56789 3 1
56789 4 1
56789 5 1
56789 1 2
;
data want;
set have;
by account;
if first.account then want=1;
else if tmp=1 then want+1;
run;
Can you post the expected output for the amended sample plz
Apologies @novinosrin when I re-ran it for the amended example, your code worked:
Account | TMP | col_want | novinosrin |
12345 | 1 | 1 | 1 |
12345 | 2 | 1 | 1 |
12345 | 3 | 1 | 1 |
12345 | 4 | 1 | 1 |
12345 | 1 | 2 | 2 |
12345 | 2 | 2 | 2 |
12345 | 1 | 3 | 3 |
12367 | 1 | 1 | 1 |
23456 | 1 | 1 | 1 |
23456 | 2 | 1 | 1 |
56789 | 1 | 1 | 1 |
56789 | 2 | 1 | 1 |
56789 | 3 | 1 | 1 |
56789 | 4 | 1 | 1 |
56789 | 5 | 1 | 1 |
56789 | 1 | 2 | 2 |
However, for some reason it doesn't work for my real data. But it worked for the sample, so thank you very much!
Ok, I am just wondering whether your sample former or after is a solid representative of your real. It's not that I am trying to prove a point against your code by any means, I am just really keen on simple solutions if possible.
The probable reasons could be
1. Sort order: This assumes By account tmp
2. Tmp is assumed as it looks in the sample -continuous unit increment
3. If the above 2 is true, the code is very straight forward
So if the sort order isn't what I see in your sample, that needs some adjustment by a proc sort or perhaps a notsorted as deemed appropriate.
Nevertheless, I get the vibe it's trivial in this thread/exercise to dig in when you have a working solution. 🙂
You are absolutely right @novinosrin ! Neither of the two conditions you mentioned is true for my real data. It's not sorted by TMP and TMP can be taking any integer value, sometimes the same one, for many observations. Thank you very much for your help! 🙂
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.