BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
_bhishek
Calcite | Level 5

I have 23 variable and i want all those values which are coming in all the variable ( it doesn't matter that all should be in the same row ).

For Example:

var1 var2 var3 var4 var5

23    87    94      59   54

36    23    19      23   57

35    38     93    51     23

68    49     23    78     99  

So  we can see that all the variable have 23 as a value so i want 23 as answer .

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

One way. The 5 is the variable count. I'll leave that as a nearly trivial exercise for the interested reader to get the value of the number of variables involved into a variable as needed.

data have;
input var1 var2 var3 var4 var5;
datalines;
23    87    94      59   54
36    23    19      23   57
87    63     87    51    99      
35    38     87    87     23
68    49     23    78     87
;
run;

data trans (keep= name value);
   set have;
   array v var1 var2 var3 var4 var5;
   length name $ 32;
   do j = 1 to dim(v);
      value = v;
      name =  vname(v);
      output;
   end;
run;

proc sql;
   select value
   from (
         select value, count(*) as vcount
         from (select distinct value,name from trans)
         group by value
        )
   where vcount=5;
   ;
quit;

View solution in original post

13 REPLIES 13
ballardw
Super User

Please show an example of what you expect the output to look like. What if you have 2 or more values with duplicate values? Do you want to know how many matches?

_bhishek
Calcite | Level 5

I want ALL the values which are present in ALL the variables. I do want how many matches are there.

For Example:

var1 var2 var3 var4 var5

23    87    94      59   54

36    23    19      23   57

87    63     87    51    99      

35    38     87    87     23

68    49     23    78     87

like in this example 23 and 87 are present in all the variable so i want output as 23 87.

I think it's clear now if not you can ask me again and thanks for the reply ballardw.

ballardw
Super User

One way. The 5 is the variable count. I'll leave that as a nearly trivial exercise for the interested reader to get the value of the number of variables involved into a variable as needed.

data have;
input var1 var2 var3 var4 var5;
datalines;
23    87    94      59   54
36    23    19      23   57
87    63     87    51    99      
35    38     87    87     23
68    49     23    78     87
;
run;

data trans (keep= name value);
   set have;
   array v var1 var2 var3 var4 var5;
   length name $ 32;
   do j = 1 to dim(v);
      value = v;
      name =  vname(v);
      output;
   end;
run;

proc sql;
   select value
   from (
         select value, count(*) as vcount
         from (select distinct value,name from trans)
         group by value
        )
   where vcount=5;
   ;
quit;

_bhishek
Calcite | Level 5

Thanks ballardw your answer was correct but is there any other way to do it without transposing it because in actual problem i have 25 variables and 1 lakh observations so when i will transpose it. then i will have 25 Lakh observation so will you suggest me to do the same way for my problem.

ballardw
Super User

Increase the variables from VAR1  to VAR5 to be the names of all 25 variables.

I don't know what a lakh may be, but the code should run. What is the concern about the number of observations?

ebills
Fluorite | Level 6

Are you wanting to know what number all 23 variables have in common? So, do you want your output to just show what number appears in EVERY column?

var1 var2 var3 ...

23     23     23  etc...

_bhishek
Calcite | Level 5

Yes ebills , I want my output to show all the numbers which are in every column.

evp000
Quartz | Level 8

data x1;

    length var1 var2 var3 var4 var5 8;

    input var1 var2 var3 var4 var5;

datalines;

23    87    94      59   54

36    23    19      23   57

87    63     87    51    99     

35    38     87    87     23

68    49     23    78     87

run;

%let nvars=5;

%macro check;

%do i = 01 %to 99;

    %let found = N;

    %global found&i;

    data found&i;

        set x1 end = at_end;

        retain flag1-flag5 0;  /* set to 0 for each record */

        array vars {*} var1-var&nvars;

        array flags {*} flag1-flag&nvars;

        checkvar = &i;

        do i = 1 to &nvars; /* for the first value in the first record, 23, check this against the var2, 3, 4 and 5 in the same record */

            if checkvar = vars{i} then flags{i} = 1;  /* do not sum, just flag if any */

        end;

        if at_end and sum(of flag1-flag5) = &nvars then  /* if found in all, then each flag will be 1 and the total will be 5 */

        do;

            put checkvar ' EXISTS IN ALL COLUMNS. ';

            in_all = checkvar;

            output found&i;

            call symputx("found&i", 'Y');  /* macro variable set to Y if found in all vars */

        end;

        else call symputx("found&i",'N');

    run;

    %put found&i=&&found&i;

%end;

%mend;

%check;

   

%macro put_together;

    data all_found;  /* create empty dataset */

        stop;

    run;

    %do j = 01 %to 99;

        %*put found&j=&&found&j;

        %if &&found&j = Y %then          /* append if there are any */

        %do;

            data all_found;

                set all_found found&j; 

            run;

        %end; 

    %end; 

%mend;

%put_together;  

data _null_;

    set all_found;

    if _n_ = 1 then put 'These values exist in all variables: ';

    put in_all;

run;

stat_sas
Ammonite | Level 13

data have;

input var1 var2 var3 var4 var5;

id=_n_;

datalines;

23    87    94    59   54

36    23    19    23   57

87    63    87    51   99

35    38    87    87   23

68    49    23    78   87

;

proc transpose data=have out=want;

by id;

run;

proc sql;

select col1 as value from want

group by col1

having count(distinct _name_)=(select max(id) from want);

quit;

_bhishek
Calcite | Level 5

Thanks stat@sas your answer was correct but is there any other way to do it without transposing it because in actual problem i have 25 variables and 1 lakh observations so when i will transpose it. then i will have 25 Lakh observation so will you suggest me to do the same way for my problem.

Ksharp
Super User

If you have 25 variable, you need change code as:

ha.definedata('k','v1','v2','v3','v4','v5');

----> ha.definedata('k','v1','v2','v3','v4','v5','v6',...........,'v25');

array x{*} var1-var5;

array y{*} v1-v5;

--->

array x{*} var1-var25;

array y{*} v1-v25;

Ksharp
Super User

Use Hash Table, If I understood what you mean.

Code: Program

data have;
input var1 var2 var3 var4 var5;
cards;
23 87 94 59 54
36 23 19 23 57
35 38 93 51 23
68 49 23 78 99 
;
run;
data want;
if _n_ eq 1 then do;
declare hash ha(hashexp:20);
declare hiter hi('ha');
ha.definekey('k');
ha.definedata('k','v1','v2','v3','v4','v5');
ha.definedone();
end;
set have end=last;
array x{*} var1-var5;
array y{*} v1-v5;
do i=1 to dim(x);
k=x{i};call missing(of y{*});
rc=ha.find();
y{i}=1;
ha.replace();
end;
if last then do;
do while(hi.next()=0);
  if nmiss(of y{*})=0 then output;
end;
end;
keep k;
run;

Xia Keshan

Bryan
Obsidian | Level 7

Here is another way.

data have;

input var1 var2 var3 var4 var5;

id=_n_;

datalines;

23    87    94    59   54

36    23    19    23   57

87    63    87    51   99

35    38    87    87   23

68    49    23    78   87

;

run;

proc contents data = have

  out = haveVars(keep = varnum name)

  noprint;

run;

Proc Sql noprint;

  Select count(name)

  into :NmbrVrs separated by ''

  from haveVars

  where name contains "var";

Quit;

%put &NmbrVrs   var&NmbrVrs;

%macro fndDplcts();

  Data lst (drop=var1 - var&NmbrVrs);

  Set have;

  %do p = 1 %to &NmbrVrs;

  Nmbr=var&&p;

  varNm="var&&p";

  VarNmId=compress(varNm||"_Id"||_N_);

  Cntr=1;

  output;

  %end;

  Run;

%mend;

%fndDplcts;

PROC SORT DATA=LST OUT=LstSrt;

  BY Nmbr;

RUN;

PROC TRANSPOSE DATA=LstSrt OUT=LstTrnsps

  NAME=Source LABEL=Label;

  BY Nmbr;

  ID VarNmId;

  VAR Cntr;

RUN; QUIT;

Data LstTrnsps (drop= Source);

  Set LstTrnsps;

  array a(*) _numeric_;

  do i=1 to dim(a);

  if a(i) = . then a(i) = 0;

  end;

  drop i;

run;

proc contents data = LstTrnsps

  out = LstTrnspsVars(keep = varnum name)

  noprint;

run;

Proc Sql noprint;

  Select name

  into :VrsLst separated by ' '

  from LstTrnspsVars

  where name contains "var";

Quit;

%put &VrsLst;

Data LstCnt;

  retain Nmbr Ttl &VrsLst;

  Set LstTrnsps;

  array vrArry &VrsLst;

  Ttl=0;

  do over vrArry;

  Ttl=Ttl+vrArry;

  end;

run;

Proc sort data=LstCnt;

  by descending Ttl;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to choose a machine learning algorithm

Use this tutorial as a handy guide to weigh the pros and cons of these commonly used machine learning algorithms.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 13 replies
  • 2925 views
  • 4 likes
  • 7 in conversation