Hello
I have a data set where a customer can be in multiple rows.
The task is to create a new data set where each customer appear in one row .
I want to create a column with distinct values of version for each customer.
The desired data set will contain 2 fields: ID ,distinct_version
for customer 1 we will get value : 999,888
for customer 2 we will get value : 444
for customer 3 we will get value : 222,777
for customer4 we will get value : 888
Data rawdata;
input ID version y;
cards;
1 999 10
1 888 20
1 999 30
2 444 15
3 222 30
3 777 45
4 888 20
;
run;
You really want text strings like 999,888 with a comma separating the two original numbers?
What possible use could there be for data in this form?
Would it not be better to put 999 into a numeric column (let's name it col1) and then the 888 into another numeric column (let's name it col2)?
Data rawdata;
input ID version y;
cards;
1 999 10
1 888 20
1 999 30
2 444 15
3 222 30
3 777 45
4 888 20
;
run;
data want;
set rawdata;
by id;
length want $50;
retain want;
if first.id then want=put(version,8. -l);
else if find(want,strip(put(version,8.)))=0 then want=catx(',',want,version);
if last.id;
run;
Why on earth do you want data structured like that? 🙂
One of the reasons you are getting comments about combining values into a single variable relates to "how many" of these values are going to get stuffed into a single variable. And how will they be processed when a later data set comes up with yet another value.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.