Hey guys,
I have a dataset like as follows:
data work.class1;
input date $ cusip_hld $ value1 $ value2 $ value3$ value4 $;
datalines;
01DEC05 A 3 5 3 2 3
01DEC05 B 2 4 2 1 2
01JAN06 A 1 2 3 4 1
01JAN06 C 4 4 4 0 1
;
run;
Is there an opportunity to add a column "classValue" which is calculated as the mode for each row, i.e. the mode for each
date-cusip_hld-combination? (With mode I mean the value which appears most often, for example, the mode of 01JAN06 and Cusip_hld = C would be 4 in my table)
I found some general solutions here, but I have 2 specific issues which makes it hard for me to reproduce those.
First, it is possible (and often), that there are missing values in the value_i Columns. I only want to calculate the mode of the non-missing values in the row.
Second, the code is in a macro. In each Iteration, one column value_i is added. The number of value_i columns is saved in the macro variable &fundNumber (if that helps).
I´m really thankful of any suggestions or solutions.
Kind regards,
Zlatan
data work.class1;
input date $ cusip_hld $ value1 $ value2 $ value3$ value4 $;
datalines;
01DEC05 A 3 5 3 2 3
01DEC05 B 2 4 2 1 2
01JAN06 A 1 2 3 4 1
01JAN06 C 4 4 4 0 1
;
run;
data fin_mode (drop=k count i max ran);
if _n_ eq 1 then do;
declare hash ha();
declare hiter hi('ha');
ha.definekey('k');
ha.definedata('k','count');
ha.definedone();
declare hash random(ordered:'a');
declare hiter hir('random');
random.definekey('ran');
random.definedata('k','count');
random.definedone();
end;
set class1;
array x{*} value: ;
do i=1 to dim(x);
if not missing(x{i}) then do;
k=x{i};
if ha.find()=0 then do;count+1;ha.replace();end;
else do;count=1;ha.add();end;
end;
end;
do while(hi.next()=0);
ran=ranuni(-1);random.add();
end;
max=0;
do while(hir.next()=0);
if count gt max then do;max=count;mod=k;end;
end;
ha.clear(); random.clear();
run;
@mrzlatan91 hello mate!, long time. Please also post a sample of your wanted output for responders to know the exact requirement.
Anyways, I am going home now. Have a great weekend
data work.class1;
input date $ cusip_hld $ value1 $ value2 $ value3$ value4 $ value5 $;
datalines;
01DEC05 A 3 5 3 2 3
01DEC05 B 2 4 2 1 2
01JAN06 A 1 2 3 4 1
01JAN06 C 4 4 4 0 1
;
run;
proc transpose data= class1 out=t;
by date cusip_hld notsorted;
var value:;
run;
proc freq data=t order=freq noprint;
by date cusip_hld notsorted;
tables col1/out=t1(drop=percent) ;
run;
data want;
set t1;
by date cusip_hld notsorted;
if first.cusip_hld;
mode=col1;
run;
Hey mate,
first, I want to thank you so much for your solution.
Sorry that I forgot to post my wanted dataset, but I guess this is exactly what i wanted (if column mode contains the mode of the values)
Unfortunately, as I said, I have a lot of missing values, s.t. in column mode (and col1 as well) are only missing values, s.t. I can´t
proof if the result is correct.
Could you kindly tell me how to fix the problem with the missing values? Unfortunately, I´m an absolute SAS Noob.
To understand it: table T represents the transposed table, _name_ is like a counting variable and col1 contains the values.
Table T1 counts how often a specific value for a cusip-date combination exists.
And Table Want contains all date-cusip combinations, count represents how often the mode value exists for that
combination and mode represents the mode value, but what does column col1 then stand for?
Have a nice day mate.
@mrzlatan91 I'm glad you got a great solution by the champion @Ksharp whose code I tend to copy all the time for my needs. Going forward, please do post a wanted sample like your input sample to make it easy to understand.
data work.class1;
input date $ cusip_hld $ value1 $ value2 $ value3$ value4 $;
datalines;
01DEC05 A 3 5 3 2 3
01DEC05 B 2 4 2 1 2
01JAN06 A 1 2 3 4 1
01JAN06 C 4 4 4 0 1
;
run;
data fin_mode (drop=k count i max ran);
if _n_ eq 1 then do;
declare hash ha();
declare hiter hi('ha');
ha.definekey('k');
ha.definedata('k','count');
ha.definedone();
declare hash random(ordered:'a');
declare hiter hir('random');
random.definekey('ran');
random.definedata('k','count');
random.definedone();
end;
set class1;
array x{*} value: ;
do i=1 to dim(x);
if not missing(x{i}) then do;
k=x{i};
if ha.find()=0 then do;count+1;ha.replace();end;
else do;count=1;ha.add();end;
end;
end;
do while(hi.next()=0);
ran=ranuni(-1);random.add();
end;
max=0;
do while(hir.next()=0);
if count gt max then do;max=count;mod=k;end;
end;
ha.clear(); random.clear();
run;
Thank you very much. It works
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.