BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
esvenson
Fluorite | Level 6

Hello,

I have data arranged like this and I'm trying to count the number of non-zero occurrences for each ID value.  I'm trying to do this without using a PROC step and to only use a DATA step.  Please forgive me if this question has been solved elsewhere - I couldn't find this in the community.

 

ID                   A        B

1000               0        1

1000               0        1

1000               1        0

1111               1        0

1111               1        0

1111               1        0

1111               0        1

1111               0        1

2222               0        1

2222               0        1

2222               1        0

2222               1        0

2222               0        1

 

The final result set should look this:

ID                   A        B

1000               1        2

1111               3        2

2222               2        3

 

Your help is greatly appreciated - thank you in advance!

~E

1 ACCEPTED SOLUTION

Accepted Solutions
slchen
Lapis Lazuli | Level 10

Method one:
proc sql;

create table want as
select distinct(ID),sum(A) as A, sum(B) as B from have group by ID;
quit;

 

Method two:

proc summary data=have;
class id;
var A B;
ways 1;
output out=want sum(A B)=A B;
run;

 

Method three:

data want;
set have;
by id;
retain sumA sumB;
if first.id then call missing(sumA,sumB);
sumA+A; sumB+B;
if last.id then output;
run;

View solution in original post

10 REPLIES 10
LinusH
Tourmaline | Level 20
Why not a PROC if that's easier/better?
Data never sleeps
esvenson
Fluorite | Level 6

It needs to be a data step because the results will later be combined into another data step.  There are other variables that will need to be in the dataset, I was just trying to keep it simple showing only those three generic ones.  I don't really want to use a PROC and unsure if it's possible to use the results of a PROC step into another DATA step.

 

Thanks,

E

PaigeMiller
Diamond | Level 26

@esvenson wrote:

It needs to be a data step because the results will later be combined into another data step. 


Well, that's not the way SAS works. You can run a PROC, have the results in a SAS data set, and then that data set combined with other data in a data step.

 

<blockquote>There are other variables that will need to be in the dataset</blockquote>

 

Yep. PROCs work here too. See my paragraph above. There's nothing you have said that forces you to use DATA step only.

--
Paige Miller
esvenson
Fluorite | Level 6

Ok, good to know.  Do you have an idea on how to do it?

 

Thanks for your help,

E

PaigeMiller
Diamond | Level 26

You could certainly do it in a data step if it was sorted first ... but that requires PROC SORT ... I guess you can't do this without a PROC.

--
Paige Miller
esvenson
Fluorite | Level 6

I have no problem using a PROC SORT first - I only meant that the final result would be unusable for what I later need as a PROC MEANS or PROC FREQ.  Hope this clarifies.

 

E

slchen
Lapis Lazuli | Level 10

Method one:
proc sql;

create table want as
select distinct(ID),sum(A) as A, sum(B) as B from have group by ID;
quit;

 

Method two:

proc summary data=have;
class id;
var A B;
ways 1;
output out=want sum(A B)=A B;
run;

 

Method three:

data want;
set have;
by id;
retain sumA sumB;
if first.id then call missing(sumA,sumB);
sumA+A; sumB+B;
if last.id then output;
run;

esvenson
Fluorite | Level 6
Dear slchen,
Thank you so much for taking the time to provide different ways of doing this. I tried all three and they all worked great. Thanks again.

Have a great day,
~E
SuryaKiran
Meteorite | Level 14

You can use RETAIN statement and FIRST. and LAST. , Something like this....

 

DATA WANT;
INFILE DATALINES;
INPUT ID A B;
DATALINES;
1000 0 1
1000 0 1
1000 1 0
1111 1 0
1111 1 0
1111 1 0
1111 0 1
1111 0 1
2222 0 1
2222 0 1
2222 1 0
2222 1 0
2222 0 1
;
RUN;
PROC SORT DATA=WANT;
BY ID;
RUN;

DATA WANT1(DROP=A B);
SET WANT;
RETAIN Count_A Count_B;
BY ID ;
IF FIRST.ID THEN DO;count_A=A;Count_B=B;END;
ELSE DO;Count_A=Count_A+A;Count_B=Count_B+B;END;
IF LAST.ID THEN OUTPUT;
RUN;
Thanks,
Suryakiran
Astounding
PROC Star

Just in case this opens new avenues of thinking for you, here is one way to use a PROC to create a SAS data set:

 

proc summary data=have nway;

class ID;

var A B;

output out=want (drop=_type_ _freq_) sum=;

run;

 

The WANT data set should match what you are seeking here, with one issue to consider.  You asked for a count of the non-zero values.  This is actually something a little bit different ... the sum of all the values.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 22071 views
  • 2 likes
  • 6 in conversation