I posted my question earlier, but maybe I didn't state my question in a neat way, so I post it again.
The data I have is like the following,
data have;
input item_id $ resp;
datalines;
pp1111 1
pp1111 2
pp1111 3
pp2222 1
pp2222 2
pp2222 3
pp2222 4
pp3333 1
pp3333 3
pp3333 4
pp4444 1
pp4444 2
pp5555 1
;
I want each item_id have four rows, the reformatted data I want should be look like the following
data want;
input item_id $ resp;
datalines;
pp1111 1
pp1111 2
pp1111 3
pp1111
pp2222 1
pp2222 2
pp2222 3
pp2222 4
pp3333 1
pp3333
pp3333 3
pp3333 4
pp4444 1
pp4444 2
pp4444
pp4444
pp5555 1
pp5555
pp5555
pp5555
;
Any help would be much appreciated!
Hi @superbug
data have;
input item_id $ resp;
datalines;
pp1111 1
pp1111 2
pp1111 3
pp2222 1
pp2222 2
pp2222 3
pp2222 4
pp3333 1
pp3333 3
pp3333 4
pp4444 1
pp4444 2
pp5555 1
;
data want;
if _n_=1 then do;
dcl hash H () ;
h.definekey ("resp") ;
h.definedone () ;
end;
do until(last.item_id);
set have;
by item_id;
if resp then h.add();
end;
do _n_=1 to 4;
if h.find(key:_n_) then call missing(resp);
output;
end;
h.clear();
run;
Hi @superbug
data have;
input item_id $ resp;
datalines;
pp1111 1
pp1111 2
pp1111 3
pp2222 1
pp2222 2
pp2222 3
pp2222 4
pp3333 1
pp3333 3
pp3333 4
pp4444 1
pp4444 2
pp5555 1
;
data want;
if _n_=1 then do;
dcl hash H () ;
h.definekey ("resp") ;
h.definedone () ;
end;
do until(last.item_id);
set have;
by item_id;
if resp then h.add();
end;
do _n_=1 to 4;
if h.find(key:_n_) then call missing(resp);
output;
end;
h.clear();
run;
It seems I was happy too early. In my data I also have "freq" and"pct" columns, using your code, it turned out, for each item, the "freq" and "pct" columns picked the value in last row of each item. I tried to modify your code, but there is error. Could you please modify your code to fit my purpose?
my data have is like the following
data have;
input item_id $ resp freq pct;
datalines;
pp1111 1 2 0.3
pp1111 2 5 0.5
pp1111 3 6 0.7
pp2222 1 3 0.4
pp2222 2 2 0.2
pp2222 3 4 0.5
pp2222 4 3 0.4
pp3333 1 6 0.8
pp3333 3 4 0.5
pp3333 4 10 0.7
pp4444 1 12 0.6
pp4444 2 9 0.5
pp5555 1 7 0.3
;
I want it to be like this
data want;
input item_id $ resp freq pct;
datalines;
pp1111 1 2 0.3
pp1111 2 5 0.5
pp1111 3 6 0.7
pp1111
pp2222 1 3 0.4
pp2222 2 2 0.2
pp2222 3 4 0.5
pp2222 4 3 0.4
pp3333 1 6 0.8
pp3333
pp3333 3 4 0.5
pp3333 4 10 0.7
pp4444 1 12 0.6
pp4444 2 9 0.5
pp4444
pp4444
pp5555 1 7 0.3
pp5555
pp5555
pp5555
;
thanks a bunch!
Hi @superbug Just a minor tweak. Please see if this helps-
data have;
input item_id $ resp freq pct;
datalines;
pp1111 1 2 0.3
pp1111 2 5 0.5
pp1111 3 6 0.7
pp2222 1 3 0.4
pp2222 2 2 0.2
pp2222 3 4 0.5
pp2222 4 3 0.4
pp3333 1 6 0.8
pp3333 3 4 0.5
pp3333 4 10 0.7
pp4444 1 12 0.6
pp4444 2 9 0.5
pp5555 1 7 0.3
;
data want;
if _n_=1 then do;
dcl hash H () ;
h.definekey ("resp") ;
h.definedata("resp","freq","pct");
h.definedone () ;
end;
do until(last.item_id);
set have;
by item_id;
if resp then h.add();
end;
do _n_=1 to 4;
if h.find(key:_n_) then call missing(resp,freq,pct);
output;
end;
h.clear();
run;
Yes, it worked! Any difficulty problem for me is just a piece of cake for you.
As always, thank you so much for the help!
proc freq data=have noprint;
table item_id * resp / sparse out=temp;
run;
data want;
set temp;
*removes resp from records that have a 0 count;
if count = 0 then call missing(resp);
drop count percent;
run;
Another common way is to create a skeleton table with all your possible combinations and merge your input data.
/*SQL approach*/
*build skeleton table;
proc sql;
create table skeletonTable as
select distinct t1.item_id, t2.resp
from have as t1, have as t2;
quit;
*merge with table (prior step combined);
proc sql;
create table want as
select t4.item_id, t3.resp
from have as t3 right join
(select distinct t1.item_id, t2.resp
from have as t1, have as t2) as t4
on t3.item_id=t4.item_id and t3.resp=t4.resp;
quit;
@superbug wrote:
I posted my question earlier, but maybe I didn't state my question in a neat way, so I post it again.
The data I have is like the following,
data have; input item_id $ resp; datalines; pp1111 1 pp1111 2 pp1111 3 pp2222 1 pp2222 2 pp2222 3 pp2222 4 pp3333 1 pp3333 3 pp3333 4 pp4444 1 pp4444 2 pp5555 1 ;I want each item_id have four rows, the reformatted data I want should be look like the following
data want; input item_id $ resp; datalines; pp1111 1 pp1111 2 pp1111 3 pp1111 pp2222 1 pp2222 2 pp2222 3 pp2222 4 pp3333 1 pp3333 pp3333 3 pp3333 4 pp4444 1 pp4444 2 pp4444 pp4444 pp5555 1 pp5555 pp5555 pp5555 ;Any help would be much appreciated!
One way that assumes your data is sorted by Item_id and resp. If the data is not sorted by item_id but only grouped then you would add the NOTSORTED option to the BY item_id statement.
data want;
set have;
by item_id;
output;
if last.item_id and resp< 4 then do;
counter=resp;
resp=.;
do until (counter=4) ;
output;
counter=counter+1;
end;
end;
drop counter;
run;
You should consider changing the use of the word "format" to discuss data set contents or structure. In SAS a Format specifically refers to the display characteristics for a variable and can lead to some confusion when used in some contexts.
Thank you all fantastic SAS experts! You saved my day. A big thumb up to you all!
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.