BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
esvenson
Fluorite | Level 6

Hello,

I have data arranged like this and I'm trying to count the number of non-zero occurrences for each ID value.  I'm trying to do this without using a PROC step and to only use a DATA step.  Please forgive me if this question has been solved elsewhere - I couldn't find this in the community.

 

ID                   A        B

1000               0        1

1000               0        1

1000               1        0

1111               1        0

1111               1        0

1111               1        0

1111               0        1

1111               0        1

2222               0        1

2222               0        1

2222               1        0

2222               1        0

2222               0        1

 

The final result set should look this:

ID                   A        B

1000               1        2

1111               3        2

2222               2        3

 

Your help is greatly appreciated - thank you in advance!

~E

1 ACCEPTED SOLUTION

Accepted Solutions
slchen
Lapis Lazuli | Level 10

Method one:
proc sql;

create table want as
select distinct(ID),sum(A) as A, sum(B) as B from have group by ID;
quit;

 

Method two:

proc summary data=have;
class id;
var A B;
ways 1;
output out=want sum(A B)=A B;
run;

 

Method three:

data want;
set have;
by id;
retain sumA sumB;
if first.id then call missing(sumA,sumB);
sumA+A; sumB+B;
if last.id then output;
run;

View solution in original post

10 REPLIES 10
LinusH
Tourmaline | Level 20
Why not a PROC if that's easier/better?
Data never sleeps
esvenson
Fluorite | Level 6

It needs to be a data step because the results will later be combined into another data step.  There are other variables that will need to be in the dataset, I was just trying to keep it simple showing only those three generic ones.  I don't really want to use a PROC and unsure if it's possible to use the results of a PROC step into another DATA step.

 

Thanks,

E

PaigeMiller
Diamond | Level 26

@esvenson wrote:

It needs to be a data step because the results will later be combined into another data step. 


Well, that's not the way SAS works. You can run a PROC, have the results in a SAS data set, and then that data set combined with other data in a data step.

 

<blockquote>There are other variables that will need to be in the dataset</blockquote>

 

Yep. PROCs work here too. See my paragraph above. There's nothing you have said that forces you to use DATA step only.

--
Paige Miller
esvenson
Fluorite | Level 6

Ok, good to know.  Do you have an idea on how to do it?

 

Thanks for your help,

E

PaigeMiller
Diamond | Level 26

You could certainly do it in a data step if it was sorted first ... but that requires PROC SORT ... I guess you can't do this without a PROC.

--
Paige Miller
esvenson
Fluorite | Level 6

I have no problem using a PROC SORT first - I only meant that the final result would be unusable for what I later need as a PROC MEANS or PROC FREQ.  Hope this clarifies.

 

E

slchen
Lapis Lazuli | Level 10

Method one:
proc sql;

create table want as
select distinct(ID),sum(A) as A, sum(B) as B from have group by ID;
quit;

 

Method two:

proc summary data=have;
class id;
var A B;
ways 1;
output out=want sum(A B)=A B;
run;

 

Method three:

data want;
set have;
by id;
retain sumA sumB;
if first.id then call missing(sumA,sumB);
sumA+A; sumB+B;
if last.id then output;
run;

esvenson
Fluorite | Level 6
Dear slchen,
Thank you so much for taking the time to provide different ways of doing this. I tried all three and they all worked great. Thanks again.

Have a great day,
~E
SuryaKiran
Meteorite | Level 14

You can use RETAIN statement and FIRST. and LAST. , Something like this....

 

DATA WANT;
INFILE DATALINES;
INPUT ID A B;
DATALINES;
1000 0 1
1000 0 1
1000 1 0
1111 1 0
1111 1 0
1111 1 0
1111 0 1
1111 0 1
2222 0 1
2222 0 1
2222 1 0
2222 1 0
2222 0 1
;
RUN;
PROC SORT DATA=WANT;
BY ID;
RUN;

DATA WANT1(DROP=A B);
SET WANT;
RETAIN Count_A Count_B;
BY ID ;
IF FIRST.ID THEN DO;count_A=A;Count_B=B;END;
ELSE DO;Count_A=Count_A+A;Count_B=Count_B+B;END;
IF LAST.ID THEN OUTPUT;
RUN;
Thanks,
Suryakiran
Astounding
PROC Star

Just in case this opens new avenues of thinking for you, here is one way to use a PROC to create a SAS data set:

 

proc summary data=have nway;

class ID;

var A B;

output out=want (drop=_type_ _freq_) sum=;

run;

 

The WANT data set should match what you are seeking here, with one issue to consider.  You asked for a count of the non-zero values.  This is actually something a little bit different ... the sum of all the values.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 20206 views
  • 2 likes
  • 6 in conversation