apply function and creating matrix

mazouz · Posted 11-09-2020 08:23 AM

Hello,

I create this function:

%macro unicite(input, output, x1, x2);
PROC SQL INOBS=1;
    CREATE TABLE output AS 
           SELECT 
	 	   (COUNT(DISTINCT(&x1))) AS &x1_&x2
      	   FROM input
           GROUP BY &x2
           ORDER BY &x1_&x2 DESC;
QUIT;
%mend;

I want to create matrix for this list:

%let list= col1 col2 col3 col4 col5;

my final result by applying function unicite must be like that:

col1_col1	col1_col2	col1_col3	col1_col4	col1_col5
col2_col1	col2_col2	col2_col3	col2_col4	col2_col5
col3_col1	col3_col2	col3_col3	col3_col4	col3_col5
col4_col1	col4_col2	col4_col3	col4_col4	col4_col5
col5_col1	col5_col2	col5_col3	col5_col4	col5_col5

ballardw · Posted 11-09-2020 10:26 AM

Provide some actual example data in the form of a data step and what the actual result of the output should look like for that data.

I have no idea what a "col1_col2" would be and in general SQL is likely not the place do anything related to a "matrix".

mazouz · Posted 11-09-2020 10:57 AM

/* this  is my data set */
data have;
input col1 $ col2 col3 $ col4 col5;
cards;

Jac 10 s 1 0

Joe 31 s 1 0

Joe 22 l 2 0

Roe 33 l 1 1

Rex 44 s 1 0

Tim 24 s 1 0

Joe 31 s 2 0

Joe 22 l 1 1

Roe 33 l 2 1

Rex 44 s 1 1

Tim 24 s 1 1

;
run;

until now I have achieved this result bu it's not exactly what I want

the result is just for col1 I want for all columns (col1 col2 ...) you can see in the function %unicite(have,want,col1);

%macro unicite(entree,sortie,col);
%let list=col1 
col2 
col3
col4 
col5 
;
data TABLE_INSPECTION;
set _NULL_;
run;
%do index = 1 %to %sysfunc(countw(&list,%str( )));
%let I =%scan(&list,&index,%str( ));
PROC SQL OUTOBS=1 ;
CREATE TABLE &sortie AS 
SELECT 
(COUNT(DISTINCT(&I))) AS &I
FROM &entree 
GROUP BY &col
ORDER BY &I DESC;
QUIT;
data TABLE_INSPECTION;
set TABLE_INSPECTION &sortie;
run;
proc delete data=&sortie;
run;
%end;
%mend;
%unicite(have,want,col1);

ballardw · Posted 11-09-2020 11:26 AM

You need to show what the result for that data should be.

Showing code that does not do as required is

And does the result need to be data set that will processed further or a Report that people with read?

BTW, it is poor form to include blank lines in Cards data.

mazouz · Posted 11-09-2020 11:36 AM

the result should be data not report

ballardw · Posted 11-09-2020 11:47 AM

@mazouz wrote:

the result should be data not report

Again, what does the result look like?

There is a serious consideration involving variable type because SAS will only allow a variable to be of one type and the way you are mixing "columns" needs a very clear example of what you expect.

BTW, what will be done with this data set next?

Anything someone calls a "matrix" usually properly belongs in Proc IML and SQL is right out.

mazouz · Posted 11-09-2020 12:38 PM

It is within the framework of my end of studies project I want to check in a table which contains 5 variables, if each variable is unique for another variable.
I want to do this in an iterative fashion every time I set a variable in the group and count the values of the other variables one by one
at the end I will have cases when the value is 1 so it is unique and cases> 1 therefore not unique

ballardw · Posted 11-09-2020 12:58 PM

I do not understand what all to words are getting around.

It would be much easier, as asked, what does the result for your example data look like?

Just because you started a method does not mean that is the way to actually accomplish this.

But we need to know what you expect for output.

And you have now added things that were not in the initial question such as: "unique for another variable".

So, how do you expect to display the unique vs non-unique? What are the expected variable names in the output? Actual values for every row for each variable in your output? Answer these questions using the example data set you provided.

mazouz · Posted 11-09-2020 01:21 PM

My result should look like this table

	col1	col2	col3	col4	col5
col1	1	3	3	4	2
col2	2	1	2	2	2
col3	3	2	1	2	2
col4	2	2	1	1	2
col5	2	2	1	2	1

for exemple between col1 col1 we have 1 because we check for the same variable

PROC SQL INOBS=1;
    CREATE TABLE output AS 
           SELECT 
	 	   (COUNT(DISTINCT(col1))) AS col1_col1
      	   FROM input
           GROUP BY col1
           ORDER BY col1_col1 DESC;
QUIT;

ballardw · Posted 11-09-2020 01:46 PM

@mazouz wrote:

My result should look like this table

col1 col2 col3 col4 col5

col1 1 3 3 4 2

col2 2 1 2 2 2

col3 3 2 1 2 2

col4 2 2 1 1 2

col5 2 2 1 2 1

for exemple between col1 col1 we have 1 because we check for the same variable
PROC SQL INOBS=1;
    CREATE TABLE output AS 
           SELECT 
	 	   (COUNT(DISTINCT(col1))) AS col1_col1
      	   FROM input
           GROUP BY col1
           ORDER BY col1_col1 DESC;
QUIT;

Since that proc sql is restricted on one record I don't see anyway that you would ever get more than 1. So why the restriction INOBS=1????

Since your col1 variable has 5 unique values, with counts of 1, 2 and 4 when crossed with itself I don't really understand your output.

So step by step demonstrate, not with code but with words and by showing the explicit values used to show some of those counts.

mazouz · Posted 11-09-2020 02:46 PM

no I will not always have 1 as a result try this code and you will see.

data have;
input col1 $ col2 col3 $ col4 col5;
cards;

Jac 10 s 1 0

Joe 31 s 1 0

Joe 22 l 2 0

Roe 33 l 1 1

Rex 44 s 1 0

Tim 24 s 1 0

Joe 31 s 2 0

Joe 22 l 1 1

Roe 33 l 2 1

Rex 44 s 1 1

Tim 24 s 1 1

;
run;
PROC SQL outobs=1;
    CREATE TABLE output AS 
           SELECT 
	 	   (COUNT(DISTINCT(col3))) AS col1_col1
      	   FROM have
           GROUP BY col1
           ORDER BY col1_col1 DESC;
QUIT;

the reason why I do outobs = 1 because I have ranked the observations in descending order and I am only interested in the highest value if it's equal 1 then the variable is unique

ballardw · Posted 11-09-2020 02:51 PM

@mazouz wrote:

no I will not always have 1 as a result try this code and you will see.
data have;
input col1 $ col2 col3 $ col4 col5;
cards;

Jac 10 s 1 0

Joe 31 s 1 0

Joe 22 l 2 0

Roe 33 l 1 1

Rex 44 s 1 0

Tim 24 s 1 0

Joe 31 s 2 0

Joe 22 l 1 1

Roe 33 l 2 1

Rex 44 s 1 1

Tim 24 s 1 1

;
run;
PROC SQL outobs=1;
    CREATE TABLE output AS 
           SELECT 
	 	   (COUNT(DISTINCT(col3))) AS col1_col1
      	   FROM have
           GROUP BY col1
           ORDER BY col1_col1 DESC;
QUIT;
the reason why I do outobs = 1 because I have ranked the observations in descending order and I am only interested in the highest value if it's equal 1 then the variable is unique

You changed the SQL code. Previously you had INOBS=1. So without any further actual worked example of the process I am done here.

mazouz · Posted 11-09-2020 03:04 PM

it's mistake sorry

apply function and creating matrix

Re: apply function and creating matrix

Re: apply function and creating matrix

Re: apply function and creating matrix

Re: apply function and creating matrix

Re: apply function and creating matrix

Re: apply function and creating matrix

Re: apply function and creating matrix

Re: apply function and creating matrix

Re: apply function and creating matrix

Re: apply function and creating matrix

Re: apply function and creating matrix

Re: apply function and creating matrix

Registration is open

SAS Training: Just a Click Away