- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Good day community .
i want to concatenate contents from the same column.
dataset (test
) looks like:
Customer_Num Product_Category_Desc
7293 | MUTUAL FUND INVESTMENTS |
7293 | NOTICE INVESTMENTS |
7293 | SERVICE PRODUCT |
7293 | UNCATEGORISED |
53916 | CARD BASED SAVINGS |
53916 | DEMAND DEPOSITS |
53916 | DEMAND INVESTMENTS |
53916 | FIXED INVESTMENTS |
53916 | SERVICE PRODUCT |
The output table (combined
) will be:
Customer_Num prod_id_combined
7293 | MUTUAL FUND INVESTMENTS, NOTICE INVESTMENTS, SERVICE PRODUCT, UNCATEGORISED |
53916 | CARD BASED SAVINGS, DEMAND DEPOSITS, DEMAND INVESTMENTS, FIXED INVESTMENTS, SERVICE PRODUCTcode to produce the expected outcome. proc sql; create table combined as select Customer_Num, catx(', ', Product_Category_Desc) as prod_id_combined from (select distinct Customer_Num, Product_Category_Desc from test) group by Customer_Num; quit; |
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Try this:
proc sort data=have; by customer_num product_category_desc; run;
data want;
set have;
by customer_num product_category_desc;
length all_descrip $1000;
array T {100} $150 _temporary_;
if first.customer_num then do;
call missing(of T[*]);
recnum=0;
end;
if first.product_category_desc then do;
recnum+1;
T[recnum]=product_category_desc;
end;
if last.customer_num then all_descrip=catx(',', of T[*]);
keep customer_num all_descrip;
run;
If any customer has more than 100 unique products, then update the {100} part above to whatever that maximum number is. And if any of the product descriptions are more than 150 characters in length, update the $150 to that maximum length. Otherwise, you can safely leave as-is.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
May I ask why you need to do this? What is wrong with leaving the data as it is and working with it in the long arrangement?
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
i want to have a view of a distinct customer based on all the products in one columns , i hope i have answered your question
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Is that it, you just all the text concatenated for a customer, and that's the end of the project? Or are you planning to do analysis once you have that? If so, what types of analysis?
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
As @PaigeMiller seems to be saying, though, the result of this is likely to be hard to use for any analysis. If you really want to collapse this to one row per person, then I would suggest you start by making a variable for each of your unique product descriptions (e.g., make variables like "mutual_fund_investments", etc.), then populate each of those variables with a 0 or 1 depending on whether the customer had that product.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Try this:
proc sort data=have; by customer_num product_category_desc; run;
data want;
set have;
by customer_num product_category_desc;
length all_descrip $1000;
array T {100} $150 _temporary_;
if first.customer_num then do;
call missing(of T[*]);
recnum=0;
end;
if first.product_category_desc then do;
recnum+1;
T[recnum]=product_category_desc;
end;
if last.customer_num then all_descrip=catx(',', of T[*]);
keep customer_num all_descrip;
run;
If any customer has more than 100 unique products, then update the {100} part above to whatever that maximum number is. And if any of the product descriptions are more than 150 characters in length, update the $150 to that maximum length. Otherwise, you can safely leave as-is.