Hi All,
Please help, i am stuck with my code.
Source table:
Id | Gender | Marks |
1 | F | Eng lish |
1 | F | |
2 | M | Maths |
2 | M | English |
2 | M | Science |
2 | M | |
3 | F | Maths |
4 | M | Eng lish |
4 | M | Maths |
4 | M | |
4 | M | Social |
Required output:
Id | Gender | Eng_lish | Maths | Science | Social |
1 | Y | ||||
2 | Y | Y | Y | ||
3 | Y | ||||
4 | Y | Y | Y |
I am using transpose and sas macros to get the above output.
ID | NAME OF FORMER | COL1 | COL2 | COL3 | COL4 | COL5 |
VARIABLE | ||||||
1 | Marks_new | Eng lish | ||||
2 | Marks_new | Maths | Eng lish | Science | ||
3 | Marks_new | Maths | ||||
4 | Marks_new | Maths | Eng lish | Maths | Social |
After transposing, counting COL1 - COL..n to do the loop. I am using COL count and variable count to do the loop. Since variable names are having space i am using tranwrd to replace space with '_' and assigning original value as Label.
My SAS code:
Data t;
input ID Gender $ Marks $50.;
cards;
1 F Eng lish
1 F
2 M Maths
2 M Eng lish
2 M Science
2 M
3 F Maths
4 M Maths
4 M Eng lish
4 M Maths
4 M
4 M Social
;
data tt;
set t;
Marks_new=compress(Marks, "");
run;
PROC TRANSPOSE DATA=tt OUT=t1;
BY id;
VAR Marks_new;
RUN;
proc contents data=t1 out=list (keep=Name);
run;
proc sql;
select count(compress(name,,'kd')) into: ct from list where name ne"";
select count(distinct Marks) into: var_ct from tt where Marks ne "";
select distinct Marks_new into: var1-: var%eval(&var_ct) from tt where Marks ne "";
select distinct Marks into: label1-: label%eval(&var_ct) from tt where Marks ne "";
quit;
%put &lqbel1 &var1;
%macro m();
%if &ct. > &var_ct. %then %do;
data my (drop = Col1-&Col%eval(&ct);
set t1;
%DO i=1 %TO &ct.;
%IF COL&i= &&var&i.. %THEN %cmpres(&&var&i..)='Y';
&&var&i.. label = &&label&i..;
%end;
run;
%end;
%else &ct. < &var_ct. %then %do;
data my (drop = Col1-&Col%eval(&var_ct);
set t1;
%DO i=1 %TO &var_ct.;
%IF COL&i= &&var&i.. %THEN %cmpres(&&var&i..)='Y';
&&var&i.. label = &&label&i..;
%end;
run;
%end;
%mend;
%m();
Many thanks in advance.
Regards,
Raj
Slightly simplified:
Data t; input ID Gender $ Marks $50.; val='Y'; cards; 1 F English 1 F 2 M Maths 2 M English 2 M Science 2 M 3 F Maths 4 M Maths 4 M English 4 M Maths 4 M 4 M Social ; PROC TRANSPOSE DATA=t OUT=t1 let; BY id; id marks; VAR val; RUN;
though I would recommend using a numeric 1 instead of "Y" .
The above code will generate warning in the log about "ID value repeats for by groups" and observations dropped for missing ID values.
And are your kind of odd records of id and gender with no marks come from some external source that has that as a summary row or similar? If you don't need those records I recommend dropping them as soon as practical.
If it's a report you're after then below should do.
Data have;
input ID Gender $ Marks $50.;
Marks_new=compress(Marks);
cards;
1 F Eng lish
1 F
2 M Maths
2 M Eng lish
2 M Science
2 M
3 F Maths
4 M Maths
4 M Eng lish
4 M Maths
4 M
4 M Social
;
run;
proc format;
value mark_yn
low-high='Y'
other=' '
;
run;
proc tabulate data=have;
class id gender Marks_new;
keylabel n=' ';
table id*gender, marks_new=' '*f=mark_yn.;
run;
Thanks a lot Patrick. Output looks perfect. But, only problem is I need to save this output and do further analysis. Proc tabulate output is not in desired shape with out option.
Kindly suggest.
You need to aggregate the data first, and then use Proc TRANSPOSE, see example below. I used the original "Eng lish" value, with the System option VALIDVARNAME=V7 you can ensure, that any special chars for varibales names are translated into and underscore.
Is there a reason to have lines where marks has no value?
Data have;
input ID Gender $ Marks $50.;
Marks_new=compress(Marks);
cards;
1 F Eng lish
1 F
2 M Maths
2 M Eng lish
2 M Science
2 M
3 F Maths
4 M Maths
4 M Eng lish
4 M Maths
4 M
4 M Social
;
run;
proc sql;
create table have2 as
select
id
, gender
, marks
, case
when count(*) >= 1 then "Y"
else " "
end as indicator
from
have
group by
id
, gender
, marks
;
quit;
options validvarname=v7;
proc transpose
data=have2
out=want(
drop=_name_
)
;
by id gender;
id marks;
var indicator;
run;
What kind of analysis you plan to do with the transposed data? The original format seems to be better suited for any kind of analysis i can think of.
Slightly simplified:
Data t; input ID Gender $ Marks $50.; val='Y'; cards; 1 F English 1 F 2 M Maths 2 M English 2 M Science 2 M 3 F Maths 4 M Maths 4 M English 4 M Maths 4 M 4 M Social ; PROC TRANSPOSE DATA=t OUT=t1 let; BY id; id marks; VAR val; RUN;
though I would recommend using a numeric 1 instead of "Y" .
The above code will generate warning in the log about "ID value repeats for by groups" and observations dropped for missing ID values.
And are your kind of odd records of id and gender with no marks come from some external source that has that as a summary row or similar? If you don't need those records I recommend dropping them as soon as practical.
@ballardw: Sure, as you suggested I will remove duplicate values before transposing.
Thanks a lot to all 🙂
It was my first, your inputs are very helpful. I have learnt different approaches for a problem.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.