Hi
I'm having a trouble with combining two or more rows with the same id into one. I have something like this:
ID | V1 | V2 |... |V10|
1 | str | |... | |
1 | |str2|... | |
But I'd like:
ID | V1 | V2|...|V10
1 | str |str2|...|
To sum up, I want to combine those two rows into single one like this one. But that's just the beginning. Imagine I have:
ID | V1 | V2 |... |V10|
1 | str | |... |str3 |
1 | |str2|... |str4 |
What I would need is:
ID | V1 | V2 |... |V10|V10v2|
1 | str |str2|... |str3|str4 |
And if there are more filled rows for a certain column, there should be version 3, 4, 5 and so on.
I'd be grateful for you help!
Thank you im advance!
Make it long first, so that you have a unique key for each row, which is a combination of ID and V1, V10V2 or however you want to count that.
i.e.:
ID Version Value
1 1 str1
1 2 str2
1 10 str3
1 10 str4
Then add a second identifier:
ID Version Counter Value
1 1 1 str1
1 2 1 str2
1 10 1 str3
1 10 2 str4
Then you can transpose it to a wide format again.
This is the most dynamic approach that I can think of and can be handled with a PROC TRANSPOSE, a DATA STEP to add the COUNTER variable and a second PROC TRANSPOSE to a wide format again. You may want to consider storing the data in a long format overall.
@Dontik, in your example, when same variable have more than one observation per ID,
as for: ID=1 V10 in (str3 str4) you want output in two variables: v10, v10v2.
Is the varaible name important or can it be v10, v11, etc. ?
How about input like:
ID | V1 | V2 |... |V10|
1 | str | |... |str3 |
1 | |str2|... |str4 |
1 | |str5|... |str6 |
should the varaibles be: v1, v2, v2v2, ...,v10, v10v2, v10v3 etc.
OR may I call them: v1, v2, v3, ... v10, v11, v12, ... Vn
- as much variables as need to include all values per ID ?
- is the order of values important? should it be:
str, str2, str5, strt3, str4, str6
OR str, str3, str2, str4, str5, str6
Can you give some more information what kind of data is it?
It may be a clue to give you a better solution.
Would you like next solution ?
data test;
input id version address $;
addr = cat('(',strip(version),') ',strip(address));
keep id addr;
datalines;
1 1 str1
1 2 str2
1 5 str3
1 5 str4
; run;
proc sort data=test;
by id addr;
run;
proc transpose data=test
out=test1(drop=_name_);
by id;
var addr;
run;
within this method you know what address kind is out of its value instead out of its label.
data want;
update have(obs=0) have;
by id;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.