DATA Step, Macro, Functions and more

Reduce datasteps in count and sum

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 10
Accepted Solution

Reduce datasteps in count and sum

[ Edited ]

Hi

 

I hope you can help me. This is the output from the code below and I was hoping there is a smart way to reduce the datasteps. There

are other variable in the original dataset but the variables below are the ones important for the coding.

 

Thank you in advance.

 

OBJECT_ID Case_ID Case_count Case_count_sum

10104PROD21C4CE 1 1 2
10104PROD21C4CE 1 2 2
10201PROD230948 2 1 2
10201PROD230948 2 2 2
10207PROD2315A8 3 1 3
10207PROD2315A8 3 2 3
10207PROD2315A8 3 3 3
10280PROD23FEB7 4 1 2
10280PROD23FEB7 4 2 2
10294PROD242336 5 1 2
10294PROD242336 5 2 2
10300PROD242B33 6 1 2
10300PROD242B33 6 2 2
10300PROD242CE7 7 1 2
10300PROD242CE7 7 2 2
10302PROD243380 8 1 3
10302PROD243380 8 2 3
10302PROD243380 8 3 3
10347PROD2495CD 9 1 1
10347PROD2495E8 10 1 2
10347PROD2495E8 10 2 2
10349PROD249971 11 1 1
10356PROD24A468 12 1 2
10356PROD24A468 12 2 2
11005PROD24DB55 13 1 2
11005PROD24DB55 13 2 2
11006PROD24DEA6 14 1 3
11006PROD24DEA6 14 2 3
11006PROD24DEA6 14 3 3
11012PROD24ED64 15 1 2
11012PROD24ED64 15 2 2
11013PROD24F0D4 16 1 1
11013PROD24F0DA 17 1 2
11013PROD24F0DA 17 2 2
11017PROD24F6D6 18 1 2
11017PROD24F6D6 18 2 2
11024PROD250F13 19 1 1
11027PROD251616 20 1 2
11027PROD251616 20 2 2
11028PROD2518E3 21 1 2

 

 

 

data testm;

set X;

by OBJECT_ID;

retain OBJECT_ID;

if first.OBJECT_ID then Case_ID+1;

run;

 

data testm2;

set testm;

by Case_ID;

Case_count+1;

if first.Case_ID then Case_count=1;

run;

 

proc sort data=testm2;

by Case_ID descending Case_count;

run;

 

data testm3;

set testm2;

by Case_ID;

retain Case_count_sum;

if first.Case_ID then Case_count_sum=Case_count;

output;

run;


Accepted Solutions
Solution
‎05-02-2016 09:50 AM
Trusted Advisor
Posts: 1,115

Re: Reduce datasteps in count and sum

Here is a solution using the so called DOW loop technique:

data want;
do until(last.object_id);
  set x;
  by object_id;
  Case_count_sum=sum(case_count_sum,1);
end;
do until(last.object_id);
  set x;
  by object_id;
  Case_ID+first.object_id;
  Case_count=sum(case_count,1);
  output;
end;
run;

If you don't need Case_ID or Case_count, you can simply delete the respective line in the code where these are calculated.

View solution in original post


All Replies
Trusted Advisor
Posts: 1,115

Re: Reduce datasteps in count and sum

Hi @piddy, welcome to the forum,

 

Yes, your code could be simplified. For a specific suggestion it would be good to know how your final dataset should look like. Is the goal exactly the "output" you've posted? Or are, for example, Case_ID and Case_count only means to the end of creating Case_count_sum and you don't actually need them?

Solution
‎05-02-2016 09:50 AM
Trusted Advisor
Posts: 1,115

Re: Reduce datasteps in count and sum

Here is a solution using the so called DOW loop technique:

data want;
do until(last.object_id);
  set x;
  by object_id;
  Case_count_sum=sum(case_count_sum,1);
end;
do until(last.object_id);
  set x;
  by object_id;
  Case_ID+first.object_id;
  Case_count=sum(case_count,1);
  output;
end;
run;

If you don't need Case_ID or Case_count, you can simply delete the respective line in the code where these are calculated.

Occasional Contributor
Posts: 10

Re: Reduce datasteps in count and sum

HI freelanceReinhard

 

Thank you very much for your reply.

 

When I run your code there seems to be a problem with case_ID is doesn't count properly.

Is it possible to rewrite it as a macro?

 

 


case_id.JPG
Trusted Advisor
Posts: 1,115

Re: Reduce datasteps in count and sum

Hi @piddy,

 

I cannot replicate your issue. When I run my code with the following test data, I obtain the result shown further below:

data x;
input OBJECT_ID $15.;
cards;
10104PROD21C4CE
10104PROD21C4CE
10201PROD230948
10201PROD230948
10207PROD2315A8
10207PROD2315A8
10207PROD2315A8
;

Result:

want.png

 

If you still encounter issues, please post suitable test data in the form of a data step (as I did above) and the program code (unless you left my code unchanged).

Occasional Contributor
Posts: 10

Re: Reduce datasteps in count and sum

Hi again

 

Your solution was correct. Thank you so much again. I made a mistake in dropping some variabels. I'm not so good at macros yet but how could I rewrite your code to a macro?

Trusted Advisor
Posts: 1,115

Re: Reduce datasteps in count and sum


piddy wrote:

I'm not so good at macros yet but how could I rewrite your code to a macro?


First, you should clarify if you really need a macro. I've seen all too many posts in this forum with (failed) attempts to use macro language where other techniques (mostly data and proc steps) would have been more appropriate. Maybe your case justifies writing a macro. If so, you have to decide which parts of the existing data step code are to be made flexible by using macro parameters.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 256 views
  • 0 likes
  • 2 in conversation