Hi
I'm having a trouble with combining two or more rows with the same id into one. I have something like this:
ID | V1 | V2 |... |V10|
1 | str | |... | |
1 | |str2|... | |
But I'd like:
ID | V1 | V2|...|V10
1 | str |str2|...|
To sum up, I want to combine those two rows into single one like this one. But that's just the beginning. Imagine I have:
ID | V1 | V2 |... |V10|
1 | str | |... |str3 |
1 | |str2|... |str4 |
What I would need is:
ID | V1 | V2 |... |V10|V10v2|
1 | str |str2|... |str3|str4 |
And if there are more filled rows for a certain column, there should be version 3, 4, 5 and so on.
I'd be grateful for you help!
Thank you im advance!
Make it long first, so that you have a unique key for each row, which is a combination of ID and V1, V10V2 or however you want to count that.
i.e.:
ID Version Value
1 1 str1
1 2 str2
1 10 str3
1 10 str4
Then add a second identifier:
ID Version Counter Value
1 1 1 str1
1 2 1 str2
1 10 1 str3
1 10 2 str4
Then you can transpose it to a wide format again.
This is the most dynamic approach that I can think of and can be handled with a PROC TRANSPOSE, a DATA STEP to add the COUNTER variable and a second PROC TRANSPOSE to a wide format again. You may want to consider storing the data in a long format overall.
@Dontik, in your example, when same variable have more than one observation per ID,
as for: ID=1 V10 in (str3 str4) you want output in two variables: v10, v10v2.
Is the varaible name important or can it be v10, v11, etc. ?
How about input like:
ID | V1 | V2 |... |V10|
1 | str | |... |str3 |
1 | |str2|... |str4 |
1 | |str5|... |str6 |
should the varaibles be: v1, v2, v2v2, ...,v10, v10v2, v10v3 etc.
OR may I call them: v1, v2, v3, ... v10, v11, v12, ... Vn
- as much variables as need to include all values per ID ?
- is the order of values important? should it be:
str, str2, str5, strt3, str4, str6
OR str, str3, str2, str4, str5, str6
Can you give some more information what kind of data is it?
It may be a clue to give you a better solution.
Would you like next solution ?
data test;
input id version address $;
addr = cat('(',strip(version),') ',strip(address));
keep id addr;
datalines;
1 1 str1
1 2 str2
1 5 str3
1 5 str4
; run;
proc sort data=test;
by id addr;
run;
proc transpose data=test
out=test1(drop=_name_);
by id;
var addr;
run;
within this method you know what address kind is out of its value instead out of its label.
data want;
update have(obs=0) have;
by id;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.