BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
gibsonsg
Calcite | Level 5

Hi everyone.

 

I have a problem about vertical summation by some conditions.

 

Main dataset is below;

 

 

ID

COHORT

ALLOCATE

USAGE

1

1001

201601

15.049

6.766

2

1001

201602

13.507

5.088

3

1001

201603

11.872

9.352

4

1001

201604

13.791

7.404

5

1001

201605

13.618

8.866

6

1001

201606

16.906

8.897

7

1001

201607

13.706

6.988

8

1001

201608

11.543

9.294

9

1001

201609

12.457

7.308

10

1001

201610

14.250

8.934

11

1001

201611

11.148

7.726

12

1001

201612

19.648

6.331

 

Conditions;

  • ID1 = ID6 and
  • Allocate1 > 0 and
  • (Usage1 + Usage2 + Usage3 + Usage4 + Usage5 + Usage6) > 0

If these conditions are met then 1, else 0 in FINAL column.

 

The table I want to create is below;

 

 

ID

COHORT

ALLOCATE

USAGE

FINAL

1

1001

201601

15.049

6.766

1

2

1001

201602

13.507

5.088

1

3

1001

201603

11.872

9.352

1

4

1001

201604

13.791

7.404

1

5

1001

201605

0

8.866

0

6

1001

201606

16.906

8.897

1

7

1001

201607

13.706

0

0

8

1001

201608

11.543

0

0

9

1001

201609

12.457

0

0

10

1001

201610

14.250

0

0

11

1001

201611

11.148

0

0

12

1001

201612

19.648

0

0

 

How can I do that in datastep procedure?

 

Thanks a lot in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

Instead of looking forward to a late row, you can sort the data by descending cohort (seems to be a date of yyyymm format)

then use lag function to peak at a previous rows (lag1 or lag2 or lag3 ...).

Try it and post your code in case of any issue.

View solution in original post

5 REPLIES 5
hashman
Ammonite | Level 13

@gibsonsg: I have no idea how, given the conditions you describe and your sample input, you arrive at zeros anywhere in the sample output. Before you logic can be translated into a DATA step (or any other program), it must be clearly understood. I admit that I totally fail to understand it.

 

Paul D.

gibsonsg
Calcite | Level 5

Hi hasman,

You are right, I have to correct the conditions. I want to check the ID with the following sixth ID.

Thanks for warning.

 

For first row;

ID1 = ID6 and

Allocate1 > 0 and

(Usage1 + Usage2 + Usage3 + Usage4 + Usage5 + Usage6) > 0

 

For second row;

ID2 = ID7 and

Allocate2 > 0 and

(Usage2 + Usage3 + Usage4 + Usage5 + Usage6 + Usage7) > 0

 

For thirth row;

ID3 = ID8 and

Allocate3 > 0 and

(Usage3 + Usage4 + Usage5 + Usage6 + Usage7 + Usage8) > 0

 

and so on.

hashman
Ammonite | Level 13

@gibsonsg

 

Thank you, it's much clearer. Yet it still doesn't explain your proposed output because all of these conditions are met for every row:

 

  • Your ID values are identical in all rows, so for any two rows i and j ID[i]=ID[j] is always true.
  • The Allocate variable is positive in every row, hence for every starting row i, Allocate[i]>0 is always true.
  • The values of Usage are positive for all rows, thus for every range between row i and j>i inclusive, the sum of Usage from Usage[i] to Usage[j] is > 0. 

 

For starters, could you explain what combination of your conditions causes FINAL to become 0 in output row 5?

 

Paul D.

Shmuel
Garnet | Level 18

Instead of looking forward to a late row, you can sort the data by descending cohort (seems to be a date of yyyymm format)

then use lag function to peak at a previous rows (lag1 or lag2 or lag3 ...).

Try it and post your code in case of any issue.

gibsonsg
Calcite | Level 5

Hi again.

After some research, I found that intnx function is the another handy method to solve the problem.

I hope it would be useful.

 

Input Table

 

ID

COHORT

ALLOCATE

USAGE

1

1001

201601

15.049

6.766

2

1001

201602

13.507

5.088

3

1001

201603

11.872

9.352

4

1001

201604

13.791

7.404

5

1001

201605

0

8.866

6

1001

201606

16.906

8.897

7

1001

201607

13.706

0

8

1001

201608

11.543

0

9

1001

201609

12.457

0

10

1001

201610

14.250

0

11

1001

201611

11.148

0

12

1001

201612

19.648

0

 

Codes

For the solution, DATE column was created for each observation in Cohort column.

(201601=31.01.2016, 201602=29.02.2016 etc.)

 

proc sql;

create table work.vertical_sum as

select

ID,

COHORT,

DATE,

ALLOCATE,

SUM(USAGE) as TOTAL_USAGE

from

(select t1.ID, t1.COHORT, t1.DATE, t1.ALLOCATE, t2.USAGE from work.vertical as t1

left join work.vertical as t2

on (t1.ID=t2.ID

and

t2.DATE between intnx('month', t1.DATE, 0)  and  intnx('month', t1.DATE, 6)))

group by ID, COHORT

order by ID, COHORT;

 

create table work.final_sum as

select

t1.ID,

t1.COHORT,

t1.ALLOCATE,

t2.TOTAL_USAGE,

Case When t1.ALLOCATE > 0 and t2.TOTAL_USAGE > 0 Then 1 Else 0 End as FINAL

from work.vertical_sum as t1

inner join work.vertical_sum as t2

on t1.ID=t2.ID and t1.COHORT=t2.COHORT

order by t1.ID, t1.COHORT;

quit;

 

proc sort data=work.final_sum out=work.final_sum nodupkey;

by cohort; 

run;

 

Result Table

 

ID

COHORT

DATE

ALLOCATE

TOTAL_USAGE

FINAL

1

1001

201601

31.01.2016

15.049

46.373

1

2

1001

201602

29.02.2016

13.507

39.607

1

3

1001

201603

31.03.2016

11.872

34.519

1

4

1001

201604

30.04.2016

13.791

25.167

1

5

1001

201605

31.05.2016

0

17.763

0

6

1001

201606

30.06.2016

16.906

8.897

1

7

1001

201607

31.07.2016

13.706

0

0

8

1001

201608

31.08.2016

11.543

0

0

9

1001

201609

30.09.2016

12.457

0

0

10

1001

201610

31.10.2016

14.25

0

0

11

1001

201611

30.11.2016

11.148

0

0

12

1001

201612

31.12.2016

19.648

0

0

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1490 views
  • 3 likes
  • 3 in conversation