Can some one help me to get below ouput ?
left side iam expecting
right side is the input
trying to shift cells up and keep id multiple rows when multiple cells
Input | output | ||||||||||||
id | col1 | col2 | col3 | col4 | id | col1 | col2 | col3 | col4 | ||||
1 | abc | 1 | abc | abc2 | abc3 | ||||||||
1 | abc2 | abc3 | 2 | abz | xyz | hjds | |||||||
2 | abz | 3 | wer | sdsf | ere | rtr | |||||||
2 | xyz | hjds | 3 | erew | fgt | ||||||||
3 | wer | rtr | 4 | erer | bgng | tyt | rgrgr | ||||||
3 | ere | fgt | rtr | ||||||||||
3 | sdsf | erew | rgrgr | ||||||||||
4 | erer | bgng | tyt | rtr |
Are you sure the left side is the OUTPUT and the right side was the INPUT? Why would you want to expand the number of observations? How did you decide that isolated value in the column of the last tow belonged with some of the other values instead of being left as its own row. It does not have an ID value associated with it.
What are the rules that lets you know when to change the number of observations?
Can you explain them in terms of the values of the variables?
Sorry right side is the output anf left side is the input
How many col: variables do you have in your real dataset?
One possible approach:
proc transpose data=have out=long1 (rename=(col1=value) where=(value ne ""));
by id;
var col:;
run;
proc sort data=long1;
by id _name_;
run;
data long2;
set long1;
by id _name_;
if firs._name
then count = 1;
else count + 1;
run;
proc sort data=long2;
by id count;
run;
proc transpose data=long2 out=want (drop=_name_ count);
by id count;
var value;
id _name_;
run;
Untested, posted from my tablet.
You will have a better chance of getting a good answer if you post the input data as data step code, e.g.:
data have;
infile cards4 dsd delimiter=';';
input id col1 $ col2 $ col3 $ col4 $;
cards4;
1;abc; ; ; ;
1; ;abc2;abc3; ; ;
2;abz; ; ; ;
2; ;xyz;hjds;
3;wer; ; ;rtr;
3; ; ;ere;fgt;
3; ;sdsf;erew;rgrgr;
4;erer;bgng;tyt;rtr;
;;;;run;
I think you can get the output you want like this:
data want;
if 0 then set have;
call missing(of col1-col4);
merge
have(where=(col1>' ') keep=id col1)
have(where=(col2>' ') keep=id col2)
have(where=(col3>' ') keep=id col3)
have(where=(col4>' ') keep=id col4)
;
by id;
run;
- the CALL MISSING statement is necessary because the variables are retained within the BY group in the MERGE statement - the first statement is just to initialize variable types before that.
@mmkr , it looks to me like @s_lassen 's lovely code is correct for your intent.
But in the desired output shown in your question, it looks like there is a mistake for the last two rows. Søren's code returns:
Obs id col1 col2 col3 col4 1 1 abc abc2 abc3 2 2 abz xyz hjds 3 3 wer sdsf ere rtr 4 3 erew fgt 5 3 rgrg 6 4 erer bgng tyt rtr
Can you confirm that is the correct output you want?
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.