Hello
I have a data set where a customer can be in multiple rows.
The task is to create a new data set where each customer appear in one row .
I want to create a column with distinct values of version for each customer.
The desired data set will contain 2 fields: ID ,distinct_version
for customer 1 we will get value : 999,888
for customer 2 we will get value : 444
for customer 3 we will get value : 222,777
for customer4 we will get value : 888
Data rawdata;
input ID version y;
cards;
1 999 10
1 888 20
1 999 30
2 444 15
3 222 30
3 777 45
4 888 20
;
run;
You really want text strings like 999,888 with a comma separating the two original numbers?
What possible use could there be for data in this form?
Would it not be better to put 999 into a numeric column (let's name it col1) and then the 888 into another numeric column (let's name it col2)?
Data rawdata;
input ID version y;
cards;
1 999 10
1 888 20
1 999 30
2 444 15
3 222 30
3 777 45
4 888 20
;
run;
data want;
set rawdata;
by id;
length want $50;
retain want;
if first.id then want=put(version,8. -l);
else if find(want,strip(put(version,8.)))=0 then want=catx(',',want,version);
if last.id;
run;
Why on earth do you want data structured like that? 🙂
One of the reasons you are getting comments about combining values into a single variable relates to "how many" of these values are going to get stuffed into a single variable. And how will they be processed when a later data set comes up with yet another value.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.