BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cho16
Obsidian | Level 7

Hi ,

 

I have a data set that has the ID, Business,LOB,ACTSIZ. like below example

 

data DATA;  

length id $ 5 Business $ 2 LOB $ 3;

input id $ Business LOB $ ACTSIZ;

datalines;

12340 01 RTY 23

12341 20 TYR 0

12341 30 1UY 8

12343 30 CBP 8

12343 30 MTS 8

12343 30 HPL 8

12343 45 HFI 6

12343 45 HFS 6

45673 01 RTY 40

45673 30 IUY 24

45673 30 CBP 24

45673 30 IUY 24

45673 30 CBP 24

;

run;

 

I was summing ACTSIZ column group by id to get the totalsize and output looks like below ,which is wrong (seprate each id by line for easy readability)::

id Business LOB ACTSIZ Totalsize

12340 01 RTY 23 23

12341 20 TYR 0 8

12341 30 1UY 8 8

12343 30 CBP 8 36

12343 30 MTS 8 36

12343 30 HPL 8 36

12343 45 HFI 6 36

12343 45 HFS 6 36

45673 01 RTY 40 136

45673 30 IUY 24 136

45673 30 CBP 24 136

45673 30 IUY 24 136

45673 30 CBP 24 136

 

If the business id and actual size repeat more than once (LOB will be different) then count the actual size one time toward total size. Business want the output like below (seprate each id by line for easy readability):

id Business LOB ACTSIZ Totalsize 

12340 01 RTY 23 23

12341 20 TYR 0 8

12341 30 1UY 8 8

12343 30 CBP 8 14

12343 30 MTS 8 14

12343 30 HPL 8 14

12343 45 HFI 6 14

12343 45 HFS 6 14

45673 01 RTY 40 64

45673 30 IUY 24 64

45673 30 CBP 24 64

45673 30 IUY 24 64

45673 30 CBP 24 64

 

Note: actual dataset has 150 columns and we need all those column in the output, so we can't use nodupkey option.

 

Thank you so much for your help.

 

-Sudha.

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Like this?

 

data HAVE;  
length ID $ 5 BUSINESS $ 2 LOB $ 3;
input ID $ BUSINESS LOB $ ACTSIZ;
datalines;
12340 01 RTY 23
12341 20 TYR 0
12341 30 1UY 8
12343 30 CBP 8
12343 30 MTS 8
12343 30 HPL 8
12343 45 HFI 6
12343 45 HFS 6
45673 01 RTY 40
45673 30 IUY 24
45673 30 CBP 24
45673 30 IUY 24
45673 30 CBP 24
run;
data _V/view=_V;
  set HAVE;
  by ID BUSINESS;
  VALUE=( first.BUSINESS * ACTSIZ);
run;
proc sql;
   select a.*, SUM
   from _V                                                   a
       ,(select ID, sum(VALUE) as SUM from _V group by 1)    b
   where a.ID=b.ID ;
run;

 

ID BUSINESS LOB ACTSIZ ACTSIZ
12340 01 RTY 23 23
12341 20 TYR 0 8
12341 30 1UY 8 8
12343 30 CBP 8 14
12343 30 MTS 8 14
12343 30 HPL 8 14
12343 45 HFI 6 14
12343 45 HFS 6 14
45673 01 RTY 40 64
45673 30 IUY 24 64
45673 30 CBP 24 64
45673 30 IUY 24 64
45673 30 CBP 24 64

View solution in original post

3 REPLIES 3
ShiroAmada
Lapis Lazuli | Level 10

Try this....

 

data WANT;
  set HAVE;
group=catx('-',ID, BUSINESS, ACTSIZ);

grouplag=lag(group);

if (group ^= grouplag) or _n_=1 then do;
  totalsize=actsiz;
 is_this_what_you_want='Y';
end;

run;

Hope this helps.

ChrisNZ
Tourmaline | Level 20

Like this?

 

data HAVE;  
length ID $ 5 BUSINESS $ 2 LOB $ 3;
input ID $ BUSINESS LOB $ ACTSIZ;
datalines;
12340 01 RTY 23
12341 20 TYR 0
12341 30 1UY 8
12343 30 CBP 8
12343 30 MTS 8
12343 30 HPL 8
12343 45 HFI 6
12343 45 HFS 6
45673 01 RTY 40
45673 30 IUY 24
45673 30 CBP 24
45673 30 IUY 24
45673 30 CBP 24
run;
data _V/view=_V;
  set HAVE;
  by ID BUSINESS;
  VALUE=( first.BUSINESS * ACTSIZ);
run;
proc sql;
   select a.*, SUM
   from _V                                                   a
       ,(select ID, sum(VALUE) as SUM from _V group by 1)    b
   where a.ID=b.ID ;
run;

 

ID BUSINESS LOB ACTSIZ ACTSIZ
12340 01 RTY 23 23
12341 20 TYR 0 8
12341 30 1UY 8 8
12343 30 CBP 8 14
12343 30 MTS 8 14
12343 30 HPL 8 14
12343 45 HFI 6 14
12343 45 HFS 6 14
45673 01 RTY 40 64
45673 30 IUY 24 64
45673 30 CBP 24 64
45673 30 IUY 24 64
45673 30 CBP 24 64
Ksharp
Super User

data HAVE;  
length ID $ 5 BUSINESS $ 2 LOB $ 3;
input ID $ BUSINESS LOB $ ACTSIZ;
datalines;
12340 01 RTY 23
12341 20 TYR 0
12341 30 1UY 8
12343 30 CBP 8
12343 30 MTS 8
12343 30 HPL 8
12343 45 HFI 6
12343 45 HFS 6
45673 01 RTY 40
45673 30 IUY 24
45673 30 CBP 24
45673 30 IUY 24
45673 30 CBP 24
;
run;

proc sql;
select *
 from (select * from have ) as a left join
      (select id,sum(ACTSIZ) as sum_ACTSIZ from (select distinct id,ACTSIZ from have) group by id) as b
  on a.id=b.id;
quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 880 views
  • 0 likes
  • 4 in conversation