BookmarkSubscribeRSS Feed
thanikondharish
Fluorite | Level 6

data sample;
input id type $ date mark $ stagen class $;
informat date yymmdd10.;
format date yymmdd10.;
cards;
1001 aaa 2020-11-01 . 1 class1
1001 aaa 2020-11-01 . 2 class1
1001 aaa 2020-11-01 . 3 class1
1001 aaa 2020-11-01 first 4 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 last 8 class1

1001 aaa 2020-11-01 . 1 class2
1001 aaa 2020-11-01 . 2 class2
1001 aaa 2020-11-01 . 3 class2
1001 aaa 2020-11-01 first 4 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 last 8 class2
;
run;

how to create sequence number between first and last values in mark variable and how to count stagen='9'

for your reference see below outupt

  id     type   date    mark  stagen class    seq      state9(count)

1001 aaa 2020-11-01 .      1 class1
1001 aaa 2020-11-01 .      2 class1
1001 aaa 2020-11-01 .     3 class1
1001 aaa 2020-11-01 first 4 class1        4
1001 aaa 2020-11-01 .     9 class1         5              2
1001 aaa 2020-11-01 .     9 class1         6              2
1001 aaa 2020-11-01 last 8 class1         7

1001 aaa 2020-11-01 .      1 class2
1001 aaa 2020-11-01 .      2 class2
1001 aaa 2020-11-01 .     3 class2
1001 aaa 2020-11-01 first 4 class2        4
1001 aaa 2020-11-01 .     9 class2        5              2
1001 aaa 2020-11-01 .     9 class2        6              2
1001 aaa 2020-11-01 last 8 class2        7

in above output between seq 4 and 7 there are state=9 two records are there these count reflect next

column count=2

by variables(id,type,class)

12 REPLIES 12
PaigeMiller
Diamond | Level 26

how to count state='9'

 

There is no variable named STATE in your input data.

 

Please include code in a code box by clicking on the "little running man" icon and pasting your code into that box. That will help all of us.

--
Paige Miller
thanikondharish
Fluorite | Level 6
state=9 is nothing but stagen=9
spelling mistake
PaigeMiller
Diamond | Level 26

Please include code in a code box by clicking on the "little running man" icon and pasting your code into that box. That will help all of us.

--
Paige Miller
Kurt_Bremser
Super User

What is your group?

  • id
  • id and type
  • id, type and date

?

You want the count of 9s in those observations that have stagen = 9; do you want the whole count of 9s within a group (see above), or count only successive 9s?

 

Oh yes, and please post your code in proper manner, it won't make your head explode. Promised.

Kurt_Bremser
Super User

This code creates your expected result:

data have;
input id type $ date mark $ stagen class $;
informat date yymmdd10.;
format date yymmdd10.;
cards;
1001 aaa 2020-11-01 . 1 class1
1001 aaa 2020-11-01 . 2 class1
1001 aaa 2020-11-01 . 3 class1
1001 aaa 2020-11-01 first 4 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 last 8 class1
;

data want;
do until (last.type);
  set have;
  by id type;
  count9 = sum(count9,(stagen = 9));
end;
do until (last.type);
  set have;
  by id type;
  if mark = "first"
  then seq = stagen;
  else if seq ne . then seq = seq + 1;
  state9 = ifn(stagen = 9,count9,.);
  output;
end;
drop count9;
run;
thanikondharish
Fluorite | Level 6
data have;
input id type $ date mark $ stagen class $;
informat date yymmdd10.;
format date yymmdd10.;
cards;
1001 aaa 2020-11-01 . 1 class1
1001 aaa 2020-11-01 . 2 class1
1001 aaa 2020-11-01 . 3 class1
1001 aaa 2020-11-01 first 4 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 last 8 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 . 1 class2 1001 aaa 2020-11-01 . 2 class2 1001 aaa 2020-11-01 . 3 class2 1001 aaa 2020-11-01 first 4 class2 1001 aaa 2020-11-01 . 9 class2 1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 last 8 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
;

I have one dataset like above how to get output with count column like below dataset?
Your ref:

id type date mark stagen class count
1001 aaa 2020-11-01 . 1 class1 1001 aaa 2020-11-01 . 2 class1 1001 aaa 2020-11-01 . 3 class1 1001 aaa 2020-11-01 first 4 class1 1001 aaa 2020-11-01 . 9 class1 2 1001 aaa 2020-11-01 . 9 class1 2 1001 aaa 2020-11-01 last 8 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 . 1 class2 1001 aaa 2020-11-01 . 2 class2 1001 aaa 2020-11-01 . 3 class2 1001 aaa 2020-11-01 first 4 class2 1001 aaa 2020-11-01 . 9 class2 3 1001 aaa 2020-11-01 . 9 class2 3
1001 aaa 2020-11-01 . 9 class2 3 1001 aaa 2020-11-01 last 8 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2

Note: between mark='first'  and mark='last' there are stagen=9 related records there.
How to take those count like see above output dataset
by variables are id,type,class

 

mkeintz
PROC Star

Your WANT dataset does not match your HAVE dataset.  The second pair of   "first" ... "last" records has 3 intermediate records in WANT, but only 2 in HAVE.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
thanikondharish
Fluorite | Level 6
I updated raw data.
thanikondharish
Fluorite | Level 6
In 15 row entire record is space
We can remove
mkeintz
PROC Star

Thank you for providing sample data in a working DATA step:

 

data have;
  input id type $ date mark $ stagen class $;
  informat date yymmdd10.;
  format date yymmdd10.;
cards;
1001 aaa 2020-11-01 . 1 class1
1001 aaa 2020-11-01 . 2 class1
1001 aaa 2020-11-01 . 3 class1
1001 aaa 2020-11-01 first 4 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 last 8 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 . 1 class2
1001 aaa 2020-11-01 . 2 class2
1001 aaa 2020-11-01 . 3 class2
1001 aaa 2020-11-01 first 4 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 last 8 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
data want (drop=_: i);
  /*read and count every record until "last" */
  do _n=1 by 1 until (mark='last    ' or last.type);
    set have;
    by id type;
    if mark='first' then _nfirst=_n; /*Note position of "first"*/
  end;

  /* Reread, recount, and output the same records */
  do i=1 to _n;
    set have;
    if _nfirst=. or i<=_nfirst or i=_n then count=. ;
    else count=(_n-_nfirst-1);
    output;
  end;
run;
  
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 2735 views
  • 0 likes
  • 4 in conversation