BookmarkSubscribeRSS Feed
BayzidurRahman
Obsidian | Level 7

 I have two variables id and dif. I need to create sum of variable dif by id until the value of dif is 1. If dif>1 then counting will restart from there and continue until dif>1 or id is different.

id

dif

Sum

1

1

7

1

1

7

1

1

7

1

1

7

1

1

7

1

1

7

1

1

7

1

2

6

1

1

6

1

1

6

1

1

6

1

1

6

1

1

6

2

1

3

2

1

3

2

1

3

2

3

3

2

1

3

2

1

3

2

2

2

2

1

2

 

11 REPLIES 11
PaigeMiller
Diamond | Level 26

 I need to create sum of variable dif by id until the value of dif is 1

 

Should that really say "... until the value of dif is not equal to 1" ??

--
Paige Miller
BayzidurRahman
Obsidian | Level 7

No, it should say dif>1 because in some situations the dif can be 0.
Secondly, the sum variable should take to sum of rows.

PaigeMiller
Diamond | Level 26

Why is the value of variable SUM of the second group equal to 6? It seems to me that the sum of variable DIF in the second group is 7.

 

2 + 1 + 1 + 1+ 1 + 1 = 7

 

Similar question applies to later groups as well

--
Paige Miller
BayzidurRahman
Obsidian | Level 7

Sorry- I clarified it in my previous reply. The sum should be total number of rows.

PaigeMiller
Diamond | Level 26

This code is untested. If you want tested code, please provide the data as working SAS data step code (examples and instructions).

 

/* UNTESTED CODE */
data intermediate;
    set have;
    by id;
    if first.id or dif>1 then group+1;
run;
proc summary nway data=intermediate;
    class group;
    var dif;
    output out=counts n=sum;
run;
data want;
    merge intermediate counts(keep=group sum);
    by id;
run;

 

This assumes the original data set is sorted by ID.

 

Suggestion: if the value you want is a count and not a sum, do not name the variable SUM

--
Paige Miller
Tom
Super User Tom
Super User

You seem to be making GROUPS, but you don't actually create a variable to indicate the group number.

data want;
  set have;
  by id;
  if first.id then group=1;
  else if dif > 1 then group+1;
run;

Once you have the group then counting how may observations are in each group is easy.

proc sql;
create table want2 as 
  select *, count(*) as count
  from want
  group by id,group
;
quit;

Result

Obs    id    dif    Sum    group    count

  1     1     1      7       1        7
  2     1     1      7       1        7
  3     1     1      7       1        7
  4     1     1      7       1        7
  5     1     1      7       1        7
  6     1     1      7       1        7
  7     1     1      7       1        7
  8     1     1      6       2        6
  9     1     1      6       2        6
 10     1     1      6       2        6
 11     1     1      6       2        6
 12     1     1      6       2        6
 13     1     2      6       2        6
 14     2     1      3       1        3
 15     2     1      3       1        3
 16     2     1      3       1        3
 17     2     1      3       2        3
 18     2     1      3       2        3
 19     2     3      3       2        3
 20     2     1      2       3        2
 21     2     2      2       3        2
BayzidurRahman
Obsidian | Level 7

Thanks Tom.

I have another condition- if dif =0 or missing then that row should not be counted in the sum.

id

dif

Sum

1

1

6

1

1

6

1

0

6

1

1

6

1

1

6

1

1

6

1

1

6

1

2

6

1

1

6

1

1

6

1

1

6

1

1

6

1

1

6

2

1

2

2

.

2

2

1

2

2

3

3

2

1

3

2

1

3

2

2

2

2

1

2

Tom
Super User Tom
Super User

Just change the counting step.  Instead of COUNT(*) you could use SUM( dif not in (0 .) )

BayzidurRahman
Obsidian | Level 7

Thanks Tom. That gave me required clue. I modified the code to ignore 0 and missing in the dif variable.

 

data want;
*retain id group dif count;
  set have (drop= sum);
  by id;
  if first.id then gr=1;
  else if (dif > 2 & dif ne .) then gr+1;
 count=1;
   if dif = . or dif = 0 then count=.;
run;

proc sql;
create table want2 as 
  select *, SUM( count ) as dur
  from want
  group by id,gr
;
quit;

But one thing I noticed that in the want2 dataset it is changing the order of observations in the dif variable. Number 3 is appearing on the last row in dif. Is there any way to maintain the order as in the want dataset?

 

 

BayzidurRahman_0-1693297367276.png

 

 

mkeintz
PROC Star

Pass through each GROUP twice, once to get the group size, and once to write output with the new COUNT variable.  GROUP is defined as a sequence of obs ending with either the last obs for a given ID, or when the next observation has DIF>1:

 

data have;
  input id dif sum;
datalines;
1	1	7
1	1	7
1	1	7
1	1	7
1	1	7
1	1	7
1	1	7
1	2	6
1	1	6
1	1	6
1	1	6
1	1	6
1	1	6
2	1	3
2	1	3
2	1	3
2	3	3
2	1	3
2	1	3
2	2	2
2	1	2
run;
data want (drop=_:); 
  do count=1 by 1 until (last.id or _next_dif>1);
    set have;
    by id;
    merge have have (firstobs=2 keep=dif rename=(dif=_next_dif));
  end;
  do _n_=1 to count;
    set have;
    output;
  end;
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
BayzidurRahman
Obsidian | Level 7

Thanks.

How can I exclude counting any rows where the value of dif is either 0 or missing?

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 883 views
  • 0 likes
  • 4 in conversation