BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cho16
Obsidian | Level 7

Hi ,

 

I have a data set that has the ID, Business,LOB,ACTSIZ. like below example

 

data DATA;  

length id $ 5 Business $ 2 LOB $ 3;

input id $ Business LOB $ ACTSIZ;

datalines;

12340 01 RTY 23

12341 20 TYR 0

12341 30 1UY 8

12343 30 CBP 8

12343 30 MTS 8

12343 30 HPL 8

12343 45 HFI 6

12343 45 HFS 6

45673 01 RTY 40

45673 30 IUY 24

45673 30 CBP 24

45673 30 IUY 24

45673 30 CBP 24

;

run;

 

I was summing ACTSIZ column group by id to get the totalsize and output looks like below ,which is wrong (seprate each id by line for easy readability)::

id Business LOB ACTSIZ Totalsize

12340 01 RTY 23 23

12341 20 TYR 0 8

12341 30 1UY 8 8

12343 30 CBP 8 36

12343 30 MTS 8 36

12343 30 HPL 8 36

12343 45 HFI 6 36

12343 45 HFS 6 36

45673 01 RTY 40 136

45673 30 IUY 24 136

45673 30 CBP 24 136

45673 30 IUY 24 136

45673 30 CBP 24 136

 

If the business id and actual size repeat more than once (LOB will be different) then count the actual size one time toward total size. Business want the output like below (seprate each id by line for easy readability):

id Business LOB ACTSIZ Totalsize 

12340 01 RTY 23 23

12341 20 TYR 0 8

12341 30 1UY 8 8

12343 30 CBP 8 14

12343 30 MTS 8 14

12343 30 HPL 8 14

12343 45 HFI 6 14

12343 45 HFS 6 14

45673 01 RTY 40 64

45673 30 IUY 24 64

45673 30 CBP 24 64

45673 30 IUY 24 64

45673 30 CBP 24 64

 

Note: actual dataset has 150 columns and we need all those column in the output, so we can't use nodupkey option.

 

Thank you so much for your help.

 

-Sudha.

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Like this?

 

data HAVE;  
length ID $ 5 BUSINESS $ 2 LOB $ 3;
input ID $ BUSINESS LOB $ ACTSIZ;
datalines;
12340 01 RTY 23
12341 20 TYR 0
12341 30 1UY 8
12343 30 CBP 8
12343 30 MTS 8
12343 30 HPL 8
12343 45 HFI 6
12343 45 HFS 6
45673 01 RTY 40
45673 30 IUY 24
45673 30 CBP 24
45673 30 IUY 24
45673 30 CBP 24
run;
data _V/view=_V;
  set HAVE;
  by ID BUSINESS;
  VALUE=( first.BUSINESS * ACTSIZ);
run;
proc sql;
   select a.*, SUM
   from _V                                                   a
       ,(select ID, sum(VALUE) as SUM from _V group by 1)    b
   where a.ID=b.ID ;
run;

 

ID BUSINESS LOB ACTSIZ ACTSIZ
12340 01 RTY 23 23
12341 20 TYR 0 8
12341 30 1UY 8 8
12343 30 CBP 8 14
12343 30 MTS 8 14
12343 30 HPL 8 14
12343 45 HFI 6 14
12343 45 HFS 6 14
45673 01 RTY 40 64
45673 30 IUY 24 64
45673 30 CBP 24 64
45673 30 IUY 24 64
45673 30 CBP 24 64

View solution in original post

3 REPLIES 3
ShiroAmada
Lapis Lazuli | Level 10

Try this....

 

data WANT;
  set HAVE;
group=catx('-',ID, BUSINESS, ACTSIZ);

grouplag=lag(group);

if (group ^= grouplag) or _n_=1 then do;
  totalsize=actsiz;
 is_this_what_you_want='Y';
end;

run;

Hope this helps.

ChrisNZ
Tourmaline | Level 20

Like this?

 

data HAVE;  
length ID $ 5 BUSINESS $ 2 LOB $ 3;
input ID $ BUSINESS LOB $ ACTSIZ;
datalines;
12340 01 RTY 23
12341 20 TYR 0
12341 30 1UY 8
12343 30 CBP 8
12343 30 MTS 8
12343 30 HPL 8
12343 45 HFI 6
12343 45 HFS 6
45673 01 RTY 40
45673 30 IUY 24
45673 30 CBP 24
45673 30 IUY 24
45673 30 CBP 24
run;
data _V/view=_V;
  set HAVE;
  by ID BUSINESS;
  VALUE=( first.BUSINESS * ACTSIZ);
run;
proc sql;
   select a.*, SUM
   from _V                                                   a
       ,(select ID, sum(VALUE) as SUM from _V group by 1)    b
   where a.ID=b.ID ;
run;

 

ID BUSINESS LOB ACTSIZ ACTSIZ
12340 01 RTY 23 23
12341 20 TYR 0 8
12341 30 1UY 8 8
12343 30 CBP 8 14
12343 30 MTS 8 14
12343 30 HPL 8 14
12343 45 HFI 6 14
12343 45 HFS 6 14
45673 01 RTY 40 64
45673 30 IUY 24 64
45673 30 CBP 24 64
45673 30 IUY 24 64
45673 30 CBP 24 64
Ksharp
Super User

data HAVE;  
length ID $ 5 BUSINESS $ 2 LOB $ 3;
input ID $ BUSINESS LOB $ ACTSIZ;
datalines;
12340 01 RTY 23
12341 20 TYR 0
12341 30 1UY 8
12343 30 CBP 8
12343 30 MTS 8
12343 30 HPL 8
12343 45 HFI 6
12343 45 HFS 6
45673 01 RTY 40
45673 30 IUY 24
45673 30 CBP 24
45673 30 IUY 24
45673 30 CBP 24
;
run;

proc sql;
select *
 from (select * from have ) as a left join
      (select id,sum(ACTSIZ) as sum_ACTSIZ from (select distinct id,ACTSIZ from have) group by id) as b
  on a.id=b.id;
quit;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 892 views
  • 0 likes
  • 4 in conversation