My apologies. Let me see if I'm able to state my question a little better.
A subset of my long dataset looks like this:
data have;
length name $20;
infile datalines dsd missover delimiter=",";
input age_grp $ gender $ name $ item $ count;
datalines;
0-13,FEMALE,FEB_2019_NEW ,FOOD,12
0-13,FEMALE,FEB_2019_TOTAL ,FOOD,23
0-13,FEMALE,FEB_2019_NEW ,GAS,45
0-13,FEMALE,FEB_2019_TOTAL ,GAS,76
0-13,FEMALE,FEB_2019_NEW ,TELE,4
0-13,FEMALE,FEB_2019_TOTAL ,TELE,5
0-13,MALE,FEB_2019_NEW ,TELE,3
0-13,MALE,FEB_2019_TOTAL ,TELE,8
0-13,FEMALE,MAR_2019_NEW ,TELE,0
0-13,FEMALE,MAR_2019_TOTAL ,TELE,0
;
run;
I want to turn the dataset into something like this:
data have;
length name $20;
infile datalines dsd missover delimiter=",";
input age_grp $ gender $ name $ item $ count want;
datalines;
0-13,FEMALE,FEB_2019_NEW ,FOOD,12,.5219
0-13,FEMALE,FEB_2019_TOTAL ,FOOD,23,
0-13,FEMALE,FEB_2019_NEW ,GAS,45,.592
0-13,FEMALE,FEB_2019_TOTAL ,GAS,76,
0-13,FEMALE,FEB_2019_NEW ,TELE,4,.80
0-13,FEMALE,FEB_2019_TOTAL ,TELE,5,
0-13,MALE,FEB_2019_NEW ,TELE,3,.375
0-13,MALE,FEB_2019_TOTAL ,TELE,8,
0-13,FEMALE,MAR_2019_NEW ,TELE,0,.
0-13,FEMALE,MAR_2019_TOTAL ,TELE,0,
;
run;
My BY variables include: Age_grp, Gender, Name, and Item.
So, for each month, I want to report the percent of NEW/TOTAL for each age group, gender, and item combination. So, for food purchases among female 0-13 in February 2019, it was 23 total purchases but the percent of those that were NEW were 52.19% (.5219).
I think a lag function might work but wasn't entirely sure how to do the grouping using your example. As you mentioned, I can guarantee that each row will alternate between New then Total.
Then to take it one step further, I wanted to concatenate the values for TOTAL and NEW to produce a value for each age group, gender, and item combination to read out the
TOTAL value (NEW pct). So, in the example above, it would be 23(52.2%).
... View more