BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Rock3t4u
Calcite | Level 5

I have a dataset as below. I have the condition that if [(given<4) and (taken>=1)] then I have to sum up(collapse cells) for given and taken variables by grouping into company, area, division .

companyareadivisionstategiventaken
xxxMWMW_1MP37
xxxMWMW_1UP83
xxxES_1MN54
xxxNENE_1AP21
xxxNENE_1TN11
xxxNENE_2TL62
yyyMWMW_1MP34
yyyMWMW_1OH27

 

The result table should be as below for collapsed 1. I have no clue how to code this in SAS. 

companyareadivisiongeographygiventaken
xxxMWMW_1MP_UP1110
xxxES_1MN54
xxxNENE_1AP_TN32
xxxNENE_2TL62
yyyMWMW_1MP_OH5

11

1 ACCEPTED SOLUTION

Accepted Solutions
AhmedAl_Attar
Ammonite | Level 13

Hi @Rock3t4u 

Note: Based on your collapsing condition, you should end up with 6 records in the output data set based on your supplied sample!

xxx NE NE_1 AP 2 1
xxx NE NE_1 TN 1 1

These two records do not meet the [(given<4) and (taken>1)] condition, therefore it shouldn't collapse!

 

I've attached a modified version of my collapse_data.sas macro, you can download it and save it, then run the following code

 

/* Create sample data */
data have;
infile datalines dlm="09"x dsd truncover;
input company $ area $ division $ state $ given taken;
datalines;
xxx	MW	MW_1	MP	3	7
xxx	MW	MW_1	UP	8	3
xxx	E	S_1	MN	5	4
xxx	NE	NE_1	AP	2	1
xxx	NE	NE_1	TN	1	1
xxx	NE	NE_2	TL	6	2
yyy	MW	MW_1	MP	3	4
yyy	MW	MW_1	OH	2	7
;
run;

/* Ensure the Input data set is sorted by the desired classification variables */
PROC SORT DATA=work.HAVE;
	BY company area division state;
RUN;

%include "<SomeDirectory>/collapse_data.sas";

/* Sample macro execution call */
%collapse_data (p_inDsName=work.have
, p_outDsName=work.want
, p_classVars=%STR(company area division state)
, p_prntVar=division
, p_sumedVar=state
, p_groupedVar=Geography
, p_minLimitCondition=%str(given LT 4 AND taken GT 1));

The code should produce the following want data set

 

company area division given taken Geography
xxx E S_1 5 4 MN
xxx MW MW_1 11 10 UP_MP
xxx NE NE_1 2 1 AP
xxx NE NE_1 1 1 TN
xxx NE NE_2 6 2 TL
yyy MW MW_1 5 11 OH_MP

 

Hope this helps,

Ahmed

View solution in original post

9 REPLIES 9
AhmedAl_Attar
Ammonite | Level 13

Hi @Rock3t4u 

Few years ago I developed a SAS macro to do exactly that 😉

Check it out here Using a SAS® Hash Object to Speed and Simplify the Survey Cell Collapsing Process

 

Hope this helps,

Ahmed

Rock3t4u
Calcite | Level 5
Hi Ahmed, Thank you for the reply. I updated the condition.
AhmedAl_Attar
Ammonite | Level 13
@Rock3t4u
I'm interested in getting your feedback on how the macro performs when you use it with your huge dataset. Just make sure you set the correct -MEMSIZE SAS invocation option, in order to provide enough memory to load the data into the Hash Object.

Ahmed
Kurt_Bremser
Super User
data have;
infile datalines dlm="09"x dsd truncover;
input company $ area $ division $ state $ given taken;
datalines;
xxx	MW	MW_1	MP	3	7
xxx	MW	MW_1	UP	8	3
xxx	E	S_1	MN	5	4
xxx	NE	NE_1	AP	2	1
xxx	NE	NE_1	TN	1	1
xxx	NE	NE_2	TL	6	2
yyy	MW	MW_1	MP	3	4
yyy	MW	MW_1	OH	2	7
;

data want;
set have (rename=(state=_state given=_given taken=_taken));
by company area notsorted division;
length state $20;
retain state;
if first.division
then do;
  state = _state;
  given = _given;
  taken = _taken;
end;
else do;
  state = catx("_",state,_state);
  given + _given;
  taken + _taken;
end;
if last.division;
drop _:;
run;

proc print data=want noobs;
run;

Result:

company	area	division	state	given	taken
xxx	MW	MW_1	MP_UP	11	10
xxx	E	S_1	MN	5	4
xxx	NE	NE_1	AP_TN	3	2
xxx	NE	NE_2	TL	6	2
yyy	MW	MW_1	MP_OH	5	11
Rock3t4u
Calcite | Level 5

Thanks for the reply. But the collapsing should be based on the condition[(given<4) and (taken>1)]. I will be dealing with huge dataset and collapsing should be only based on condition. 

Kurt_Bremser
Super User

My code creates your expected result from the data you posted.

If your intended result was not correct, correct it. If additional data is needed to illustrate your issue, post it.

AhmedAl_Attar
Ammonite | Level 13

Hi @Rock3t4u 

Note: Based on your collapsing condition, you should end up with 6 records in the output data set based on your supplied sample!

xxx NE NE_1 AP 2 1
xxx NE NE_1 TN 1 1

These two records do not meet the [(given<4) and (taken>1)] condition, therefore it shouldn't collapse!

 

I've attached a modified version of my collapse_data.sas macro, you can download it and save it, then run the following code

 

/* Create sample data */
data have;
infile datalines dlm="09"x dsd truncover;
input company $ area $ division $ state $ given taken;
datalines;
xxx	MW	MW_1	MP	3	7
xxx	MW	MW_1	UP	8	3
xxx	E	S_1	MN	5	4
xxx	NE	NE_1	AP	2	1
xxx	NE	NE_1	TN	1	1
xxx	NE	NE_2	TL	6	2
yyy	MW	MW_1	MP	3	4
yyy	MW	MW_1	OH	2	7
;
run;

/* Ensure the Input data set is sorted by the desired classification variables */
PROC SORT DATA=work.HAVE;
	BY company area division state;
RUN;

%include "<SomeDirectory>/collapse_data.sas";

/* Sample macro execution call */
%collapse_data (p_inDsName=work.have
, p_outDsName=work.want
, p_classVars=%STR(company area division state)
, p_prntVar=division
, p_sumedVar=state
, p_groupedVar=Geography
, p_minLimitCondition=%str(given LT 4 AND taken GT 1));

The code should produce the following want data set

 

company area division given taken Geography
xxx E S_1 5 4 MN
xxx MW MW_1 11 10 UP_MP
xxx NE NE_1 2 1 AP
xxx NE NE_1 1 1 TN
xxx NE NE_2 6 2 TL
yyy MW MW_1 5 11 OH_MP

 

Hope this helps,

Ahmed

Rock3t4u
Calcite | Level 5

In the macro mentioned https://support.sas.com/resources/papers/proceedings16/7820-2016.pdf, I do not see the parameters p_sumedvar and p_groupedvar  included. Can you post the macro collapse_data you used to get the below result. Thanks.

collapse_data

 

AhmedAl_Attar
Ammonite | Level 13
I've already included it as as an attachment in my reply. The version I attached was a modified/customized to meet your data conditions and layout

sas-innovate-white.png

🚨 Early Bird Rate Extended!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.

 

Lock in the best rate now before the price increases on April 1.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2090 views
  • 0 likes
  • 3 in conversation