Hello,
I create this function:
%macro unicite(input, output, x1, x2); PROC SQL INOBS=1; CREATE TABLE output AS SELECT (COUNT(DISTINCT(&x1))) AS &x1_&x2 FROM input GROUP BY &x2 ORDER BY &x1_&x2 DESC; QUIT; %mend;
I want to create matrix for this list:
%let list= col1 col2 col3 col4 col5;
my final result by applying function unicite must be like that:
col1_col1 | col1_col2 | col1_col3 | col1_col4 | col1_col5 |
col2_col1 | col2_col2 | col2_col3 | col2_col4 | col2_col5 |
col3_col1 | col3_col2 | col3_col3 | col3_col4 | col3_col5 |
col4_col1 | col4_col2 | col4_col3 | col4_col4 | col4_col5 |
col5_col1 | col5_col2 | col5_col3 | col5_col4 | col5_col5 |
Provide some actual example data in the form of a data step and what the actual result of the output should look like for that data.
I have no idea what a "col1_col2" would be and in general SQL is likely not the place do anything related to a "matrix".
/* this is my data set */
data have; input col1 $ col2 col3 $ col4 col5; cards; Jac 10 s 1 0 Joe 31 s 1 0 Joe 22 l 2 0 Roe 33 l 1 1 Rex 44 s 1 0 Tim 24 s 1 0 Joe 31 s 2 0 Joe 22 l 1 1 Roe 33 l 2 1 Rex 44 s 1 1 Tim 24 s 1 1 ; run;
until now I have achieved this result bu it's not exactly what I want
the result is just for col1 I want for all columns (col1 col2 ...) you can see in the function %unicite(have,want,col1);
%macro unicite(entree,sortie,col);
%let list=col1
col2
col3
col4
col5
;
data TABLE_INSPECTION;
set _NULL_;
run;
%do index = 1 %to %sysfunc(countw(&list,%str( )));
%let I =%scan(&list,&index,%str( ));
PROC SQL OUTOBS=1 ;
CREATE TABLE &sortie AS
SELECT
(COUNT(DISTINCT(&I))) AS &I
FROM &entree
GROUP BY &col
ORDER BY &I DESC;
QUIT;
data TABLE_INSPECTION;
set TABLE_INSPECTION &sortie;
run;
proc delete data=&sortie;
run;
%end;
%mend;
%unicite(have,want,col1);
You need to show what the result for that data should be.
Showing code that does not do as required is
And does the result need to be data set that will processed further or a Report that people with read?
BTW, it is poor form to include blank lines in Cards data.
the result should be data not report
@mazouz wrote:
the result should be data not report
Again, what does the result look like?
There is a serious consideration involving variable type because SAS will only allow a variable to be of one type and the way you are mixing "columns" needs a very clear example of what you expect.
BTW, what will be done with this data set next?
Anything someone calls a "matrix" usually properly belongs in Proc IML and SQL is right out.
It is within the framework of my end of studies project I want to check in a table which contains 5 variables, if each variable is unique for another variable.
I want to do this in an iterative fashion every time I set a variable in the group and count the values of the other variables one by one
at the end I will have cases when the value is 1 so it is unique and cases> 1 therefore not unique
I do not understand what all to words are getting around.
It would be much easier, as asked, what does the result for your example data look like?
Just because you started a method does not mean that is the way to actually accomplish this.
But we need to know what you expect for output.
And you have now added things that were not in the initial question such as: "unique for another variable".
So, how do you expect to display the unique vs non-unique? What are the expected variable names in the output? Actual values for every row for each variable in your output? Answer these questions using the example data set you provided.
My result should look like this table
col1 | col2 | col3 | col4 | col5 | |
col1 | 1 | 3 | 3 | 4 | 2 |
col2 | 2 | 1 | 2 | 2 | 2 |
col3 | 3 | 2 | 1 | 2 | 2 |
col4 | 2 | 2 | 1 | 1 | 2 |
col5 | 2 | 2 | 1 | 2 | 1 |
for exemple between col1 col1 we have 1 because we check for the same variable
PROC SQL INOBS=1; CREATE TABLE output AS SELECT (COUNT(DISTINCT(col1))) AS col1_col1 FROM input GROUP BY col1 ORDER BY col1_col1 DESC; QUIT;
@mazouz wrote:
My result should look like this table
col1 col2 col3 col4 col5 col1 1 3 3 4 2 col2 2 1 2 2 2 col3 3 2 1 2 2 col4 2 2 1 1 2 col5 2 2 1 2 1
for exemple between col1 col1 we have 1 because we check for the same variablePROC SQL INOBS=1; CREATE TABLE output AS SELECT (COUNT(DISTINCT(col1))) AS col1_col1 FROM input GROUP BY col1 ORDER BY col1_col1 DESC; QUIT;
Since that proc sql is restricted on one record I don't see anyway that you would ever get more than 1. So why the restriction INOBS=1????
Since your col1 variable has 5 unique values, with counts of 1, 2 and 4 when crossed with itself I don't really understand your output.
So step by step demonstrate, not with code but with words and by showing the explicit values used to show some of those counts.
no I will not always have 1 as a result try this code and you will see.
data have; input col1 $ col2 col3 $ col4 col5; cards; Jac 10 s 1 0 Joe 31 s 1 0 Joe 22 l 2 0 Roe 33 l 1 1 Rex 44 s 1 0 Tim 24 s 1 0 Joe 31 s 2 0 Joe 22 l 1 1 Roe 33 l 2 1 Rex 44 s 1 1 Tim 24 s 1 1 ; run; PROC SQL outobs=1; CREATE TABLE output AS SELECT (COUNT(DISTINCT(col3))) AS col1_col1 FROM have GROUP BY col1 ORDER BY col1_col1 DESC; QUIT;
the reason why I do outobs = 1 because I have ranked the observations in descending order and I am only interested in the highest value if it's equal 1 then the variable is unique
@mazouz wrote:
no I will not always have 1 as a result try this code and you will see.
data have; input col1 $ col2 col3 $ col4 col5; cards; Jac 10 s 1 0 Joe 31 s 1 0 Joe 22 l 2 0 Roe 33 l 1 1 Rex 44 s 1 0 Tim 24 s 1 0 Joe 31 s 2 0 Joe 22 l 1 1 Roe 33 l 2 1 Rex 44 s 1 1 Tim 24 s 1 1 ; run; PROC SQL outobs=1; CREATE TABLE output AS SELECT (COUNT(DISTINCT(col3))) AS col1_col1 FROM have GROUP BY col1 ORDER BY col1_col1 DESC; QUIT;
the reason why I do outobs = 1 because I have ranked the observations in descending order and I am only interested in the highest value if it's equal 1 then the variable is unique
You changed the SQL code. Previously you had INOBS=1. So without any further actual worked example of the process I am done here.
it's mistake sorry
Calling all data scientists and open-source enthusiasts! Want to solve real problems that impact your company or the world? Register to hack by August 31st!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.