BookmarkSubscribeRSS Feed
thanikondharish
Calcite | Level 5

data sample;
input id type $ date mark $ stagen class $;
informat date yymmdd10.;
format date yymmdd10.;
cards;
1001 aaa 2020-11-01 . 1 class1
1001 aaa 2020-11-01 . 2 class1
1001 aaa 2020-11-01 . 3 class1
1001 aaa 2020-11-01 first 4 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 last 8 class1

1001 aaa 2020-11-01 . 1 class2
1001 aaa 2020-11-01 . 2 class2
1001 aaa 2020-11-01 . 3 class2
1001 aaa 2020-11-01 first 4 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 last 8 class2
;
run;

how to create sequence number between first and last values in mark variable and how to count stagen='9'

for your reference see below outupt

  id     type   date    mark  stagen class    seq      state9(count)

1001 aaa 2020-11-01 .      1 class1
1001 aaa 2020-11-01 .      2 class1
1001 aaa 2020-11-01 .     3 class1
1001 aaa 2020-11-01 first 4 class1        4
1001 aaa 2020-11-01 .     9 class1         5              2
1001 aaa 2020-11-01 .     9 class1         6              2
1001 aaa 2020-11-01 last 8 class1         7

1001 aaa 2020-11-01 .      1 class2
1001 aaa 2020-11-01 .      2 class2
1001 aaa 2020-11-01 .     3 class2
1001 aaa 2020-11-01 first 4 class2        4
1001 aaa 2020-11-01 .     9 class2        5              2
1001 aaa 2020-11-01 .     9 class2        6              2
1001 aaa 2020-11-01 last 8 class2        7

in above output between seq 4 and 7 there are state=9 two records are there these count reflect next

column count=2

by variables(id,type,class)

12 REPLIES 12
PaigeMiller
Diamond | Level 26

how to count state='9'

 

There is no variable named STATE in your input data.

 

Please include code in a code box by clicking on the "little running man" icon and pasting your code into that box. That will help all of us.

--
Paige Miller
thanikondharish
Calcite | Level 5
state=9 is nothing but stagen=9
spelling mistake
PaigeMiller
Diamond | Level 26

Please include code in a code box by clicking on the "little running man" icon and pasting your code into that box. That will help all of us.

--
Paige Miller
Kurt_Bremser
Super User

What is your group?

  • id
  • id and type
  • id, type and date

?

You want the count of 9s in those observations that have stagen = 9; do you want the whole count of 9s within a group (see above), or count only successive 9s?

 

Oh yes, and please post your code in proper manner, it won't make your head explode. Promised.

Kurt_Bremser
Super User

This code creates your expected result:

data have;
input id type $ date mark $ stagen class $;
informat date yymmdd10.;
format date yymmdd10.;
cards;
1001 aaa 2020-11-01 . 1 class1
1001 aaa 2020-11-01 . 2 class1
1001 aaa 2020-11-01 . 3 class1
1001 aaa 2020-11-01 first 4 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 last 8 class1
;

data want;
do until (last.type);
  set have;
  by id type;
  count9 = sum(count9,(stagen = 9));
end;
do until (last.type);
  set have;
  by id type;
  if mark = "first"
  then seq = stagen;
  else if seq ne . then seq = seq + 1;
  state9 = ifn(stagen = 9,count9,.);
  output;
end;
drop count9;
run;
thanikondharish
Calcite | Level 5
data have;
input id type $ date mark $ stagen class $;
informat date yymmdd10.;
format date yymmdd10.;
cards;
1001 aaa 2020-11-01 . 1 class1
1001 aaa 2020-11-01 . 2 class1
1001 aaa 2020-11-01 . 3 class1
1001 aaa 2020-11-01 first 4 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 last 8 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 . 1 class2 1001 aaa 2020-11-01 . 2 class2 1001 aaa 2020-11-01 . 3 class2 1001 aaa 2020-11-01 first 4 class2 1001 aaa 2020-11-01 . 9 class2 1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 last 8 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
;

I have one dataset like above how to get output with count column like below dataset?
Your ref:

id type date mark stagen class count
1001 aaa 2020-11-01 . 1 class1 1001 aaa 2020-11-01 . 2 class1 1001 aaa 2020-11-01 . 3 class1 1001 aaa 2020-11-01 first 4 class1 1001 aaa 2020-11-01 . 9 class1 2 1001 aaa 2020-11-01 . 9 class1 2 1001 aaa 2020-11-01 last 8 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 . 1 class2 1001 aaa 2020-11-01 . 2 class2 1001 aaa 2020-11-01 . 3 class2 1001 aaa 2020-11-01 first 4 class2 1001 aaa 2020-11-01 . 9 class2 3 1001 aaa 2020-11-01 . 9 class2 3
1001 aaa 2020-11-01 . 9 class2 3 1001 aaa 2020-11-01 last 8 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2

Note: between mark='first'  and mark='last' there are stagen=9 related records there.
How to take those count like see above output dataset
by variables are id,type,class

 

mkeintz
PROC Star

Your WANT dataset does not match your HAVE dataset.  The second pair of   "first" ... "last" records has 3 intermediate records in WANT, but only 2 in HAVE.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
thanikondharish
Calcite | Level 5
I updated raw data.
thanikondharish
Calcite | Level 5
In 15 row entire record is space
We can remove
mkeintz
PROC Star

Thank you for providing sample data in a working DATA step:

 

data have;
  input id type $ date mark $ stagen class $;
  informat date yymmdd10.;
  format date yymmdd10.;
cards;
1001 aaa 2020-11-01 . 1 class1
1001 aaa 2020-11-01 . 2 class1
1001 aaa 2020-11-01 . 3 class1
1001 aaa 2020-11-01 first 4 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 last 8 class1
1001 aaa 2020-11-01 . 9 class1
1001 aaa 2020-11-01 . 1 class2
1001 aaa 2020-11-01 . 2 class2
1001 aaa 2020-11-01 . 3 class2
1001 aaa 2020-11-01 first 4 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 last 8 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
1001 aaa 2020-11-01 . 9 class2
data want (drop=_: i);
  /*read and count every record until "last" */
  do _n=1 by 1 until (mark='last    ' or last.type);
    set have;
    by id type;
    if mark='first' then _nfirst=_n; /*Note position of "first"*/
  end;

  /* Reread, recount, and output the same records */
  do i=1 to _n;
    set have;
    if _nfirst=. or i<=_nfirst or i=_n then count=. ;
    else count=(_n-_nfirst-1);
    output;
  end;
run;
  
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 956 views
  • 0 likes
  • 4 in conversation