Re: Row Labels Grouped By....

paulrockliffe · Posted 07-17-2017 08:37 AM

I've got a data table which consists of sequental returns from various sources. I want to add a label to the table that identifies where each row sits in the sequence of returns from it's source. For example, I want to label the latest return for each source as "1", the one before that as "2", etc.

I'm using Enterprise Guide, so something that works in the Query Builder would be great.

I'm running a test on the data that compares the latest figure with the previous figures, so ned to be able to know where in the sequence each sits.

Thanks

ChrisBrooks · Posted 07-17-2017 07:35 PM

Could you supply some sample data in the form of a data step with a cards statement so we can give you a better solution?

paulrockliffe · Posted 07-18-2017 04:59 AM

Hello Chris,

I don't have data to hand, but it's along these lines:

Old Fruit:

Apples 2016 £1.50

Apples 2015 £3

Pears 2015 £1.30

Pears 2014 £1.20

New Fruit:

Apples 2017 £1.10

Pears 2016 £1.90

New fruit is the data that's arrived since the process flow was last run and I have this separated out by creating a static copy of the data as it was last time the process was run and then comparing the two. I want to be able to highlight fruit where the price has doubled since the last data, or where it's fallen since two data points ago for example. I can do all of that, except the bit that labels each row as follows:

Old Fruit:

Apples 2016 £1.50 1

Apples 2015 £3 2

Pears 2015 £1.30 1

Pears 2014 £1.20 2

New Fruit:

Apples 2017 £1.10

Pears 2016 £1.90

In the simplified version of the data above you could do something that counts back from the Year in increments, but this doesn't work for the real data as the gaps in dates aren't consistent, what I need is to be able to compare all of the results for Apples and label them 1 to x in date order.

Does that make more sense?

Thanks

Paul.

ChrisBrooks · Posted 07-18-2017 06:22 AM

I think something like this might help - in a code node try

data have;
	infile datalines dlm=",";
	length fruit $6;
	input fruit $ year cost;
	datalines;
Apples,2016,1.5
Apples,2015,3
Pears,2015,1.3
Pears,2014,1.2
Pears,2000,1.1
;
quit;

proc sort data=have out=have_sorted;
by fruit descending year;
quit;



data want;
	set have_sorted;
	by fruit;
	year_rank=(lag(year)-year)+1;
	if first.fruit then year_rank=1;
run;

Note that in the data step you need to calculate the year_rank first and then reset it if it's the first observation in the By group - you shouldn't use lag inside a conditional statement or you get incorrect results

Row Labels Grouped By....