BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Stats10
Fluorite | Level 6

Hello,

 

This should be easy but I can't get my head around it. I have the below data (first 2 columns) and I need the 3rd column. Essentially, I need to group the TMP values and increase the counter each time TMP becomes 1 for each account, and then do the same for all accounts.

 

AccountTMPCol_Want
1234511
1234521
1234531
1234541
1234512
1234522
1234513
5678911
5678921
5678931
5678941
5678951
5678912

 

 

data file1;
input Account TMP;
datalines;
12345 1
12345 2
12345 3
12345 4
12345 1
12345 2
12345 1
56789 1
56789 2
56789 3
56789 4
56789 5
56789 1
;
run;

 

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26
data want;
	set file1;
	by account;
	prev_tmp=lag(tmp);
	if first.account then col_want=1;
	if prev_tmp>tmp then col_want+1;
	drop prev_tmp;
run;
--
Paige Miller

View solution in original post

9 REPLIES 9
PaigeMiller
Diamond | Level 26
data want;
	set file1;
	by account;
	prev_tmp=lag(tmp);
	if first.account then col_want=1;
	if prev_tmp>tmp then col_want+1;
	drop prev_tmp;
run;
--
Paige Miller
Stats10
Fluorite | Level 6

Thank you so much PaigeMiller! This is amazing! I spent so many hours on this 😛
I had to modify your code slightly as it didn't work with more accounts. For example, if the data I had provided was:

 

data file1;
input Account TMP;
datalines;
12345 1
12345 2
12345 3
12345 4
12345 1
12345 2
12345 1
56789 1
56789 2
56789 3
56789 4
56789 5
56789 1
12367 1
23456 1
23456 2
;
run;

proc sort data=file1;
by account;run;

 

your code would give 2 as the first value for the last group. So I amended it to do this:

 

data want;
set file1;
by account;
prev_tmp=lag(tmp);
if account~=lag(account) then prev_tmp=tmp;
if first.account then col_want=1;
if prev_tmp>tmp then col_want+1;
drop prev_tmp;
run;

 

You've helped me so much! Thanks again!

novinosrin
Tourmaline | Level 20

Hello @Stats10   It's cool that your amended code works. I really wonder whether you even need a lag. Well up to you. 

novinosrin
Tourmaline | Level 20

Hi @Stats10 

 

Using sum statement want+1

 


data have;
input Account	TMP	;*Col_Want;
cards;
12345	1	1
12345	2	1
12345	3	1
12345	4	1
12345	1	2
12345	2	2
12345	1	3
56789	1	1
56789	2	1
56789	3	1
56789	4	1
56789	5	1
56789	1	2
;

data want;
set have;
by account;
if first.account then want=1;
else if tmp=1 then want+1;
run;
Stats10
Fluorite | Level 6
Hi @novinosrin,

thanks for your code. Although it works for the example I provided initially, it doesn't seem to work for my real data nor for the amended data example I posted above. But it's very good and simple. Thank you!
novinosrin
Tourmaline | Level 20

Can you post the expected output for the amended sample plz

Stats10
Fluorite | Level 6

Apologies @novinosrin when I re-ran it for the amended example, your code worked:

 

AccountTMPcol_wantnovinosrin
12345111
12345211
12345311
12345411
12345122
12345222
12345133
12367111
23456111
23456211
56789111
56789211
56789311
56789411
56789511
56789122

 

However, for some reason it doesn't work for my real data. But it worked for the sample, so thank you very much!

novinosrin
Tourmaline | Level 20

Ok, I am just wondering whether your sample former or after is a solid representative of your real. It's not that I am trying to prove a point against your code by any means, I am just really keen on simple solutions if possible.

 

The probable reasons could be

1. Sort order: This assumes By account tmp

2. Tmp is assumed as it looks in the sample -continuous unit increment

3.  If the above 2 is true, the code is very straight forward

 

So if the sort order isn't what I see in your sample, that needs some adjustment by a proc sort or  perhaps a notsorted as deemed appropriate. 

 

Nevertheless, I get the vibe it's trivial in this thread/exercise to dig in when you have a working solution. 🙂

 

Stats10
Fluorite | Level 6

You are absolutely right @novinosrin ! Neither of the two conditions you mentioned is true for my real data. It's not sorted by TMP and TMP can be taking any integer value, sometimes the same one, for many observations. Thank you very much for your help! 🙂

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1253 views
  • 0 likes
  • 3 in conversation