BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
thummala
Obsidian | Level 7

Hi,

The following is my data set and I am trying to get the output as shown below using ONLY data step.

data email;

input custid $ date MMDDYY8. response $;

format date MMDDYY10.;

datalines;

A 12-22-10 Y

B 09-18-07 N

C 08-11-11 Y

A 03-19-03 Y

B 09-01-01 N

C 08-03-12 N

D 03-01-04 Y

A 11-08-08 Y

B 03-09-09 Y

C 09-06-09 Y

E 09-08-09 Y

A 09-09-09 Y

C 09-06-09 Y

B 09-01-06 N

E 09-03-12 Y

D 09-23-11 Y

A 01-01-00 Y

C 09-08-02 Y

D 09-07-00 Y

F 09-09-09 Y

B 09-01-00 N

C 01-01-00 N

;

run;

The out put should be like:

Custid    count

A            5

B            5

C            6

D            3

E            2

F            1

1 ACCEPTED SOLUTION

Accepted Solutions
esjackso
Quartz | Level 8

Try this:

data cnt2;

  keep custid count;

  set email;

  select (custid);

  when ("A") A + 1;

  when ("B") B + 1;

  when ("C") C + 1;

  when ("D") D + 1;

  when ("E") E + 1;

  when ("F") F + 1;

  otherwise;

  end;

  if _n_ = 22 then do;

  custid = "A";

  count = A;

  output;

  custid = "B";

  count = B;

  output;

  custid = "C";

  count = C;

  output;

  custid = "D";

  count = D;

  output;

  custid = "E";

  count = E;

  output;

  custid = "F";

  count = F;

  output;

  end;

run;

View solution in original post

9 REPLIES 9
VD
Calcite | Level 5 VD
Calcite | Level 5

Easiest is to use PROC FREQ

proc freq data=email ;

table custid/NOPERCENT NOCUM out=data1;

run;

allurai0412
Fluorite | Level 6

hi ,

just provdiding and idea .,..not code

you can use count (response ) ...and followed by group ,,....in  Proc SQL;

regards

ALLU

esjackso
Quartz | Level 8

Not sure why you need just a data step, but below requires the data to be sorted before the data step:

proc sort data=email; by custid; run;

data cnt;

  drop date response;

  set email;

  by custid;

  if first.custid then count = 1;

  else count + 1;

  if last.custid then output;

run;

thummala
Obsidian | Level 7

Hi ESJACKSO

I am a new user to SAS and my instructor asked me to solve it using data steps only. Looking for some hints to solve it.

Thank You.

esjackso
Quartz | Level 8

Try this:

data cnt2;

  keep custid count;

  set email;

  select (custid);

  when ("A") A + 1;

  when ("B") B + 1;

  when ("C") C + 1;

  when ("D") D + 1;

  when ("E") E + 1;

  when ("F") F + 1;

  otherwise;

  end;

  if _n_ = 22 then do;

  custid = "A";

  count = A;

  output;

  custid = "B";

  count = B;

  output;

  custid = "C";

  count = C;

  output;

  custid = "D";

  count = D;

  output;

  custid = "E";

  count = E;

  output;

  custid = "F";

  count = F;

  output;

  end;

run;

esjackso
Quartz | Level 8

Now the above code isnt very flexible but does provide the output using only a datastep. In fact the outputs could be coded using macro code to shorten the code a little. But for this to work you have to know all the possible custid values going in .... the other solutions posted (freq, sql, and using a sort with a by statement) all offer better solutions for a counting situation.

hope this helps!

EJ

AncaTilea
Pyrite | Level 9

Very nice!

Thank you.

I never knew of select...when...otherwise...

!!!!

Excited!

Thanks!

Anca.

AncaTilea
Pyrite | Level 9

Here is a piece of code that (almost) accomplishes what you want (I can't understand why you are not allowed to use proc sort?)

The code is based on the data you provided, and it is very 'manual' (or lots of hard-coded values) (so, again, there are quicker and better ways to accomplish this):

data want;

    set email;

    

*create a flag for each ID occurrence;

    tempA = (custId = "A");

    tempB = (custId = "B");

    tempC = (custId = "C");

    tempD = (custId = "D");

    tempE = (custId = "E");

    tempF = (custId = "F");

*the retain will help 'retain' a value until a new 'instance/value' is met;

    retain total_A 0 total_B 0 total_C 0 total_D 0 total_E 0 total_F 0;

  *sum down the rows;

    total_A=sum(total_A,tempA);

    total_B=sum(total_B,tempB);

    total_C=sum(total_C,tempC);

    total_D=sum(total_D,tempD);

    total_E=sum(total_E,tempE);

    total_F=sum(total_F,tempF);

    drop temp:;  

by_var= 1;*to use in the next data step;

run;

data may_want(Keep = total:);

    set want;

    by by_var;

    if last.temp;

run;

So the final data is not really in the same 'shape' as what you really want.

But these are my thoughts.

Good luck!

Anca.

data_null__
Jade | Level 19

The HASH object allows you to create a dynamic array indexed on CUSTID and accumulate the counts.

data email;
   infile datalines eof=eof;
   input custid $ date MMDDYY8. response $;
   length count 8;
  
if _n_ eq 1 then do;
     
declare hash h(ordered:'Y');
      h.definekey('custid');
      h.definedata('custid','count');
      h.definedone();
     
end;
  
call missing(count);
   rc=h.find();
   count +
1;
   rc=h.replace();
  
return;
eof:
   h.output(dataset:
'count');
   stop;
  
format date MMDDYY10.;
  
drop count rc;
   datalines;
A 12-22-10 Y
B 09-18-07 N
C 08-11-11 Y
A 03-19-03 Y
B 09-01-01 N
C 08-03-12 N
D 03-01-04 Y
A 11-08-08 Y
B 03-09-09 Y
C 09-06-09 Y
E 09-08-09 Y
A 09-09-09 Y
C 09-06-09 Y
B 09-01-06 N
E 09-03-12 Y
D 09-23-11 Y
A 01-01-00 Y
C 09-08-02 Y
D 09-07-00 Y
F 09-09-09 Y
B 09-01-00 N
C 01-01-00 N
;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1335 views
  • 7 likes
  • 6 in conversation