Good Evening, I need some help.
I have a table which has customer, flag and count variables. Below is an example of this table.
customer | flag | count |
12345 | NON_COE | 2 |
12345 | COE | 5 |
For any given customer I might or might not have counts for ‘COE’ and ‘NON_COE’.
It could be any of below three scenarios.
Scenarios 1: Has both NON_COE and COE rows
customer | flag | count |
12345 | NON_COE | 2 |
12345 | COE | 5 |
Scenarios 2: Has a row only for NON_COE
customer | flag | count |
12345 | NON_COE | 2 |
Scenarios 3: Has a row only for COE
customer | flag | count |
12345 | COE | 5 |
I use above table to transpose data like below.
customer | Non_COE | COE |
12345 | 2 | 5 |
But in scenarios 2 or 3 where I get count for only either COE or NON_COE (not both), I get only one column, COE/NON_COE when I transpose data. I need to have both COE and NON_COE columns, even if there is no data. For example, in below scenario, where I get the counts only for COE,
customer | flag | count |
12345 | COE | 5 |
I would like to see transpose data as below.
customer | Non_COE | COE |
12345 | 0 | 5 |
I need help to get the output like above.
Thank you in advance
I would use PROC TRANSPOSE.
proc transpose data=have out=want(drop=_name_);
by customer;
id flag;
var count;
run;
Make sure the data is sorted by CUSTOMER and that the values of FLAG are valid SAS variable names.
If you are worried that some of the possible values of FLAG do not exist (and hence will not become variables in the new dataset) then add a step to make sure the variables are created. This will also allow you to force the order of the variables. And perhaps create the variable names using mixed case letters like your desired result.
data want;
length customer Non_COE COE 8;
set want;
run;
If your data include one customer has both NON_COE and COE rows, that would not be a problem, sas would generate two columns as you wanted.
But if your data include only NON_COE or COE row in each customer, that is another story, you need to pad both NON_COE and COE rows in it.
data have;
input customer flag $ count;
cards;
12345 NON_COE 2
12346 COE 5
;
data flag;
input flag $;
cards;
NON_COE
COE
;
proc sql;
create table have2 as
select a.*,coalesce(b.count,0) as count
from
(
select * from
(select distinct customer from have),
(select distinct flag from flag)
) as a natural left join have as b
;
quit;
proc transpose data=have2 out=want;
by customer;
var count;
id flag;
run;
Since you know the names of the target variables, a simple DATA step will do it:
data have;
input customer $ flag $ count;
datalines;
12345 COE 5
;
data want;
set have;
length coe non_coe 8;
coe = 0;
non_coe = 0;
select (flag);
when ("COE") coe = count;
when ("NON_COE") non_coe = count;
otherwise putlog "ERROR: non-expected flag!";
end;
drop flag count;
run;
But do you really want that wide dataset, or is this for reporting purposes? What is your final goal, and what is your initial input data?
I ask this to avoid having an XY problem.
Nearly 200 sessions are now available on demand in the Innovate Hub.
Watch Now →Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.