Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

how to find same values from two variables ?

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

how to find same values from two variables ?

I have 23 variable and i want all those values which are coming in all the variable ( it doesn't matter that all should be in the same row ).

For Example:

var1 var2 var3 var4 var5

23    87    94      59   54

36    23    19      23   57

35    38     93    51     23

68    49     23    78     99  

So  we can see that all the variable have 23 as a value so i want 23 as answer .


Accepted Solutions
Solution
‎07-15-2015 01:25 PM
Super User
Posts: 11,343

Re: how to find same values from two variables ?

One way. The 5 is the variable count. I'll leave that as a nearly trivial exercise for the interested reader to get the value of the number of variables involved into a variable as needed.

data have;
input var1 var2 var3 var4 var5;
datalines;
23    87    94      59   54
36    23    19      23   57
87    63     87    51    99      
35    38     87    87     23
68    49     23    78     87
;
run;

data trans (keep= name value);
   set have;
   array v var1 var2 var3 var4 var5;
   length name $ 32;
   do j = 1 to dim(v);
      value = v;
      name =  vname(v);
      output;
   end;
run;

proc sql;
   select value
   from (
         select value, count(*) as vcount
         from (select distinct value,name from trans)
         group by value
        )
   where vcount=5;
   ;
quit;

View solution in original post


All Replies
Super User
Posts: 11,343

Re: how to find same values from two variables ?

Please show an example of what you expect the output to look like. What if you have 2 or more values with duplicate values? Do you want to know how many matches?

Occasional Contributor
Posts: 5

Re: how to find same values from two variables ?

I want ALL the values which are present in ALL the variables. I do want how many matches are there.

For Example:

var1 var2 var3 var4 var5

23    87    94      59   54

36    23    19      23   57

87    63     87    51    99      

35    38     87    87     23

68    49     23    78     87

like in this example 23 and 87 are present in all the variable so i want output as 23 87.

I think it's clear now if not you can ask me again and thanks for the reply ballardw.

Solution
‎07-15-2015 01:25 PM
Super User
Posts: 11,343

Re: how to find same values from two variables ?

One way. The 5 is the variable count. I'll leave that as a nearly trivial exercise for the interested reader to get the value of the number of variables involved into a variable as needed.

data have;
input var1 var2 var3 var4 var5;
datalines;
23    87    94      59   54
36    23    19      23   57
87    63     87    51    99      
35    38     87    87     23
68    49     23    78     87
;
run;

data trans (keep= name value);
   set have;
   array v var1 var2 var3 var4 var5;
   length name $ 32;
   do j = 1 to dim(v);
      value = v;
      name =  vname(v);
      output;
   end;
run;

proc sql;
   select value
   from (
         select value, count(*) as vcount
         from (select distinct value,name from trans)
         group by value
        )
   where vcount=5;
   ;
quit;

Occasional Contributor
Posts: 5

Re: how to find same values from two variables ?

Thanks ballardw your answer was correct but is there any other way to do it without transposing it because in actual problem i have 25 variables and 1 lakh observations so when i will transpose it. then i will have 25 Lakh observation so will you suggest me to do the same way for my problem.

Super User
Posts: 11,343

Re: how to find same values from two variables ?

Increase the variables from VAR1  to VAR5 to be the names of all 25 variables.

I don't know what a lakh may be, but the code should run. What is the concern about the number of observations?

Occasional Contributor
Posts: 9

Re: how to find same values from two variables ?

Are you wanting to know what number all 23 variables have in common? So, do you want your output to just show what number appears in EVERY column?

var1 var2 var3 ...

23     23     23  etc...

Occasional Contributor
Posts: 5

Re: how to find same values from two variables ?

Yes ebills , I want my output to show all the numbers which are in every column.

Contributor
Posts: 74

Re: how to find same values from two variables ?

data x1;

    length var1 var2 var3 var4 var5 8;

    input var1 var2 var3 var4 var5;

datalines;

23    87    94      59   54

36    23    19      23   57

87    63     87    51    99     

35    38     87    87     23

68    49     23    78     87

run;

%let nvars=5;

%macro check;

%do i = 01 %to 99;

    %let found = N;

    %global found&i;

    data found&i;

        set x1 end = at_end;

        retain flag1-flag5 0;  /* set to 0 for each record */

        array vars {*} var1-var&nvars;

        array flags {*} flag1-flag&nvars;

        checkvar = &i;

        do i = 1 to &nvars; /* for the first value in the first record, 23, check this against the var2, 3, 4 and 5 in the same record */

            if checkvar = vars{i} then flags{i} = 1;  /* do not sum, just flag if any */

        end;

        if at_end and sum(of flag1-flag5) = &nvars then  /* if found in all, then each flag will be 1 and the total will be 5 */

        do;

            put checkvar ' EXISTS IN ALL COLUMNS. ';

            in_all = checkvar;

            output found&i;

            call symputx("found&i", 'Y');  /* macro variable set to Y if found in all vars */

        end;

        else call symputx("found&i",'N');

    run;

    %put found&i=&&found&i;

%end;

%mend;

%check;

   

%macro put_together;

    data all_found;  /* create empty dataset */

        stop;

    run;

    %do j = 01 %to 99;

        %*put found&j=&&found&j;

        %if &&found&j = Y %then          /* append if there are any */

        %do;

            data all_found;

                set all_found found&j; 

            run;

        %end; 

    %end; 

%mend;

%put_together;  

data _null_;

    set all_found;

    if _n_ = 1 then put 'These values exist in all variables: ';

    put in_all;

run;

Trusted Advisor
Posts: 1,228

Re: how to find same values from two variables ?

data have;

input var1 var2 var3 var4 var5;

id=_n_;

datalines;

23    87    94    59   54

36    23    19    23   57

87    63    87    51   99

35    38    87    87   23

68    49    23    78   87

;

proc transpose data=have out=want;

by id;

run;

proc sql;

select col1 as value from want

group by col1

having count(distinct _name_)=(select max(id) from want);

quit;

Occasional Contributor
Posts: 5

Re: how to find same values from two variables ?

Thanks stat@sas your answer was correct but is there any other way to do it without transposing it because in actual problem i have 25 variables and 1 lakh observations so when i will transpose it. then i will have 25 Lakh observation so will you suggest me to do the same way for my problem.

Super User
Posts: 10,041

Re: how to find same values from two variables ?

If you have 25 variable, you need change code as:

ha.definedata('k','v1','v2','v3','v4','v5');

----> ha.definedata('k','v1','v2','v3','v4','v5','v6',...........,'v25');

array x{*} var1-var5;

array y{*} v1-v5;

--->

array x{*} var1-var25;

array y{*} v1-v25;

Super User
Posts: 10,041

Re: how to find same values from two variables ?

Use Hash Table, If I understood what you mean.

Code: Program

data have;
input var1 var2 var3 var4 var5;
cards;
23 87 94 59 54
36 23 19 23 57
35 38 93 51 23
68 49 23 78 99 
;
run;
data want;
if _n_ eq 1 then do;
declare hash ha(hashexp:20);
declare hiter hi('ha');
ha.definekey('k');
ha.definedata('k','v1','v2','v3','v4','v5');
ha.definedone();
end;
set have end=last;
array x{*} var1-var5;
array y{*} v1-v5;
do i=1 to dim(x);
k=x{i};call missing(of y{*});
rc=ha.find();
y{i}=1;
ha.replace();
end;
if last then do;
do while(hi.next()=0);
  if nmiss(of y{*})=0 then output;
end;
end;
keep k;
run;

Xia Keshan

Contributor
Posts: 28

Re: how to find same values from two variables ?

Here is another way.

data have;

input var1 var2 var3 var4 var5;

id=_n_;

datalines;

23    87    94    59   54

36    23    19    23   57

87    63    87    51   99

35    38    87    87   23

68    49    23    78   87

;

run;

proc contents data = have

  out = haveVars(keep = varnum name)

  noprint;

run;

Proc Sql noprint;

  Select count(name)

  into :NmbrVrs separated by ''

  from haveVars

  where name contains "var";

Quit;

%put &NmbrVrs   var&NmbrVrs;

%macro fndDplcts();

  Data lst (drop=var1 - var&NmbrVrs);

  Set have;

  %do p = 1 %to &NmbrVrs;

  Nmbr=var&&p;

  varNm="var&&p";

  VarNmId=compress(varNm||"_Id"||_N_);

  Cntr=1;

  output;

  %end;

  Run;

%mend;

%fndDplcts;

PROC SORT DATA=LST OUT=LstSrt;

  BY Nmbr;

RUN;

PROC TRANSPOSE DATA=LstSrt OUT=LstTrnsps

  NAME=Source LABEL=Label;

  BY Nmbr;

  ID VarNmId;

  VAR Cntr;

RUN; QUIT;

Data LstTrnsps (drop= Source);

  Set LstTrnsps;

  array a(*) _numeric_;

  do i=1 to dim(a);

  if a(i) = . then a(i) = 0;

  end;

  drop i;

run;

proc contents data = LstTrnsps

  out = LstTrnspsVars(keep = varnum name)

  noprint;

run;

Proc Sql noprint;

  Select name

  into :VrsLst separated by ' '

  from LstTrnspsVars

  where name contains "var";

Quit;

%put &VrsLst;

Data LstCnt;

  retain Nmbr Ttl &VrsLst;

  Set LstTrnsps;

  array vrArry &VrsLst;

  Ttl=0;

  do over vrArry;

  Ttl=Ttl+vrArry;

  end;

run;

Proc sort data=LstCnt;

  by descending Ttl;

run;

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 13 replies
  • 676 views
  • 4 likes
  • 7 in conversation