BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
esvenson
Fluorite | Level 6

Hello,

I have data arranged like this and I'm trying to count the number of non-zero occurrences for each ID value.  I'm trying to do this without using a PROC step and to only use a DATA step.  Please forgive me if this question has been solved elsewhere - I couldn't find this in the community.

 

ID                   A        B

1000               0        1

1000               0        1

1000               1        0

1111               1        0

1111               1        0

1111               1        0

1111               0        1

1111               0        1

2222               0        1

2222               0        1

2222               1        0

2222               1        0

2222               0        1

 

The final result set should look this:

ID                   A        B

1000               1        2

1111               3        2

2222               2        3

 

Your help is greatly appreciated - thank you in advance!

~E

1 ACCEPTED SOLUTION

Accepted Solutions
slchen
Lapis Lazuli | Level 10

Method one:
proc sql;

create table want as
select distinct(ID),sum(A) as A, sum(B) as B from have group by ID;
quit;

 

Method two:

proc summary data=have;
class id;
var A B;
ways 1;
output out=want sum(A B)=A B;
run;

 

Method three:

data want;
set have;
by id;
retain sumA sumB;
if first.id then call missing(sumA,sumB);
sumA+A; sumB+B;
if last.id then output;
run;

View solution in original post

10 REPLIES 10
LinusH
Tourmaline | Level 20
Why not a PROC if that's easier/better?
Data never sleeps
esvenson
Fluorite | Level 6

It needs to be a data step because the results will later be combined into another data step.  There are other variables that will need to be in the dataset, I was just trying to keep it simple showing only those three generic ones.  I don't really want to use a PROC and unsure if it's possible to use the results of a PROC step into another DATA step.

 

Thanks,

E

PaigeMiller
Diamond | Level 26

@esvenson wrote:

It needs to be a data step because the results will later be combined into another data step. 


Well, that's not the way SAS works. You can run a PROC, have the results in a SAS data set, and then that data set combined with other data in a data step.

 

<blockquote>There are other variables that will need to be in the dataset</blockquote>

 

Yep. PROCs work here too. See my paragraph above. There's nothing you have said that forces you to use DATA step only.

--
Paige Miller
esvenson
Fluorite | Level 6

Ok, good to know.  Do you have an idea on how to do it?

 

Thanks for your help,

E

PaigeMiller
Diamond | Level 26

You could certainly do it in a data step if it was sorted first ... but that requires PROC SORT ... I guess you can't do this without a PROC.

--
Paige Miller
esvenson
Fluorite | Level 6

I have no problem using a PROC SORT first - I only meant that the final result would be unusable for what I later need as a PROC MEANS or PROC FREQ.  Hope this clarifies.

 

E

slchen
Lapis Lazuli | Level 10

Method one:
proc sql;

create table want as
select distinct(ID),sum(A) as A, sum(B) as B from have group by ID;
quit;

 

Method two:

proc summary data=have;
class id;
var A B;
ways 1;
output out=want sum(A B)=A B;
run;

 

Method three:

data want;
set have;
by id;
retain sumA sumB;
if first.id then call missing(sumA,sumB);
sumA+A; sumB+B;
if last.id then output;
run;

esvenson
Fluorite | Level 6
Dear slchen,
Thank you so much for taking the time to provide different ways of doing this. I tried all three and they all worked great. Thanks again.

Have a great day,
~E
SuryaKiran
Meteorite | Level 14

You can use RETAIN statement and FIRST. and LAST. , Something like this....

 

DATA WANT;
INFILE DATALINES;
INPUT ID A B;
DATALINES;
1000 0 1
1000 0 1
1000 1 0
1111 1 0
1111 1 0
1111 1 0
1111 0 1
1111 0 1
2222 0 1
2222 0 1
2222 1 0
2222 1 0
2222 0 1
;
RUN;
PROC SORT DATA=WANT;
BY ID;
RUN;

DATA WANT1(DROP=A B);
SET WANT;
RETAIN Count_A Count_B;
BY ID ;
IF FIRST.ID THEN DO;count_A=A;Count_B=B;END;
ELSE DO;Count_A=Count_A+A;Count_B=Count_B+B;END;
IF LAST.ID THEN OUTPUT;
RUN;
Thanks,
Suryakiran
Astounding
PROC Star

Just in case this opens new avenues of thinking for you, here is one way to use a PROC to create a SAS data set:

 

proc summary data=have nway;

class ID;

var A B;

output out=want (drop=_type_ _freq_) sum=;

run;

 

The WANT data set should match what you are seeking here, with one issue to consider.  You asked for a count of the non-zero values.  This is actually something a little bit different ... the sum of all the values.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 18566 views
  • 2 likes
  • 6 in conversation