Good day community .
i want to concatenate contents from the same column.
dataset (test
) looks like:
Customer_Num Product_Category_Desc
7293 | MUTUAL FUND INVESTMENTS |
7293 | NOTICE INVESTMENTS |
7293 | SERVICE PRODUCT |
7293 | UNCATEGORISED |
53916 | CARD BASED SAVINGS |
53916 | DEMAND DEPOSITS |
53916 | DEMAND INVESTMENTS |
53916 | FIXED INVESTMENTS |
53916 | SERVICE PRODUCT |
The output table (combined
) will be:
Customer_Num prod_id_combined
7293 | MUTUAL FUND INVESTMENTS, NOTICE INVESTMENTS, SERVICE PRODUCT, UNCATEGORISED |
53916 | CARD BASED SAVINGS, DEMAND DEPOSITS, DEMAND INVESTMENTS, FIXED INVESTMENTS, SERVICE PRODUCTcode to produce the expected outcome. proc sql; create table combined as select Customer_Num, catx(', ', Product_Category_Desc) as prod_id_combined from (select distinct Customer_Num, Product_Category_Desc from test) group by Customer_Num; quit; |
Try this:
proc sort data=have; by customer_num product_category_desc; run;
data want;
set have;
by customer_num product_category_desc;
length all_descrip $1000;
array T {100} $150 _temporary_;
if first.customer_num then do;
call missing(of T[*]);
recnum=0;
end;
if first.product_category_desc then do;
recnum+1;
T[recnum]=product_category_desc;
end;
if last.customer_num then all_descrip=catx(',', of T[*]);
keep customer_num all_descrip;
run;
If any customer has more than 100 unique products, then update the {100} part above to whatever that maximum number is. And if any of the product descriptions are more than 150 characters in length, update the $150 to that maximum length. Otherwise, you can safely leave as-is.
May I ask why you need to do this? What is wrong with leaving the data as it is and working with it in the long arrangement?
i want to have a view of a distinct customer based on all the products in one columns , i hope i have answered your question
Is that it, you just all the text concatenated for a customer, and that's the end of the project? Or are you planning to do analysis once you have that? If so, what types of analysis?
As @PaigeMiller seems to be saying, though, the result of this is likely to be hard to use for any analysis. If you really want to collapse this to one row per person, then I would suggest you start by making a variable for each of your unique product descriptions (e.g., make variables like "mutual_fund_investments", etc.), then populate each of those variables with a 0 or 1 depending on whether the customer had that product.
Try this:
proc sort data=have; by customer_num product_category_desc; run;
data want;
set have;
by customer_num product_category_desc;
length all_descrip $1000;
array T {100} $150 _temporary_;
if first.customer_num then do;
call missing(of T[*]);
recnum=0;
end;
if first.product_category_desc then do;
recnum+1;
T[recnum]=product_category_desc;
end;
if last.customer_num then all_descrip=catx(',', of T[*]);
keep customer_num all_descrip;
run;
If any customer has more than 100 unique products, then update the {100} part above to whatever that maximum number is. And if any of the product descriptions are more than 150 characters in length, update the $150 to that maximum length. Otherwise, you can safely leave as-is.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.