BookmarkSubscribeRSS Feed
mazouz
Calcite | Level 5

Hello,

I create this function: 

 

%macro unicite(input, output, x1, x2);
PROC SQL INOBS=1;
    CREATE TABLE output AS 
           SELECT 
	 	   (COUNT(DISTINCT(&x1))) AS &x1_&x2
      	   FROM input
           GROUP BY &x2
           ORDER BY &x1_&x2 DESC;
QUIT;
%mend;

 

I want to create matrix for this list:

%let list= col1 col2 col3 col4 col5;

my final result by applying function unicite must be like that:

 

col1_col1col1_col2col1_col3col1_col4col1_col5
col2_col1col2_col2col2_col3col2_col4col2_col5
col3_col1col3_col2col3_col3col3_col4col3_col5
col4_col1col4_col2col4_col3col4_col4col4_col5
col5_col1col5_col2col5_col3col5_col4col5_col5
12 REPLIES 12
ballardw
Super User

Provide some actual example data in the form of a data step and what the actual result of the output should look like for that data.

 

I have no idea what a "col1_col2" would be and in general SQL is likely not the place do anything related to a "matrix".

 

 

mazouz
Calcite | Level 5

 

/* this  is my data set */
data have; input col1 $ col2 col3 $ col4 col5; cards; Jac 10 s 1 0 Joe 31 s 1 0 Joe 22 l 2 0 Roe 33 l 1 1 Rex 44 s 1 0 Tim 24 s 1 0 Joe 31 s 2 0 Joe 22 l 1 1 Roe 33 l 2 1 Rex 44 s 1 1 Tim 24 s 1 1 ; run;

 

until now I have achieved this result bu it's not exactly what I want 

the result is just for col1 I want for all columns (col1 col2 ...) you can see in the function %unicite(have,want,col1);

 

 

%macro unicite(entree,sortie,col);
%let list=col1
col2
col3
col4
col5
;
data TABLE_INSPECTION;
set _NULL_;
run;
%do index = 1 %to %sysfunc(countw(&list,%str( )));
%let I =%scan(&list,&index,%str( ));
PROC SQL OUTOBS=1 ;
CREATE TABLE &sortie AS
SELECT
(COUNT(DISTINCT(&I))) AS &I
FROM &entree
GROUP BY &col
ORDER BY &I DESC;
QUIT;
data TABLE_INSPECTION;
set TABLE_INSPECTION &sortie;
run;
proc delete data=&sortie;
run;
%end;
%mend;
%unicite(have,want,col1);

 

 

ballardw
Super User

You need to show what the result for that data should be.

Showing code that does not do as required is

And does the result need to be data set that will processed further or a Report that people with read?

 

BTW, it is poor form to include blank lines in Cards data.

mazouz
Calcite | Level 5

the result should be data not report

ballardw
Super User

@mazouz wrote:

the result should be data not report


Again, what does the result look like?

 

There is a serious consideration involving variable type because SAS will only allow a variable to be of one type and the way you are mixing "columns" needs a very clear example of what you expect.

 

BTW, what will be done with this data set next?

Anything someone calls a "matrix" usually properly belongs in Proc IML and SQL is right out.

mazouz
Calcite | Level 5

It is within the framework of my end of studies project I want to check in a table which contains 5 variables, if each variable is unique for another variable.
I want to do this in an iterative fashion every time I set a variable in the group and count the values of the other variables one by one
at the end I will have cases when the value is 1 so it is unique and cases> 1 therefore not unique

ballardw
Super User

I do not understand what all to words are getting around.

It would be much easier, as asked, what does the result for your example data look like?

 

Just because you started a method does not mean that is the way to actually accomplish this.

But we need to know what you expect for output.

 

And you have now added things that were not in the initial question such as: "unique for another variable".

So, how do you expect to display the unique vs non-unique? What are the expected variable names in the output? Actual values for every row for each variable in your output? Answer these questions using the example data set you provided.

mazouz
Calcite | Level 5

My result should look like this table

 col1col2col3col4col5
col113342
col221222
col332122
col422112
col522121


for exemple between col1 col1 we have 1 because we check for the same variable 

PROC SQL INOBS=1;
    CREATE TABLE output AS 
           SELECT 
	 	   (COUNT(DISTINCT(col1))) AS col1_col1
      	   FROM input
           GROUP BY col1
           ORDER BY col1_col1 DESC;
QUIT;

 

ballardw
Super User

@mazouz wrote:

My result should look like this table

  col1 col2 col3 col4 col5
col1 1 3 3 4 2
col2 2 1 2 2 2
col3 3 2 1 2 2
col4 2 2 1 1 2
col5 2 2 1 2 1


for exemple between col1 col1 we have 1 because we check for the same variable 

PROC SQL INOBS=1;
    CREATE TABLE output AS 
           SELECT 
	 	   (COUNT(DISTINCT(col1))) AS col1_col1
      	   FROM input
           GROUP BY col1
           ORDER BY col1_col1 DESC;
QUIT;

 


Since that proc sql is restricted on one record I don't see anyway that you would ever get more than 1. So why the restriction INOBS=1????

 

Since your col1 variable has 5 unique values, with counts of 1, 2 and 4 when crossed with itself I don't really understand your output.

So step by step demonstrate, not with code but with words and by showing the explicit values used to show some of those counts.

 

 

mazouz
Calcite | Level 5

no I will not always have 1 as a result try this code and you will see.

 

data have;
input col1 $ col2 col3 $ col4 col5;
cards;

Jac 10 s 1 0

Joe 31 s 1 0

Joe 22 l 2 0

Roe 33 l 1 1

Rex 44 s 1 0

Tim 24 s 1 0

Joe 31 s 2 0

Joe 22 l 1 1

Roe 33 l 2 1

Rex 44 s 1 1

Tim 24 s 1 1

;
run;
PROC SQL outobs=1;
    CREATE TABLE output AS 
           SELECT 
	 	   (COUNT(DISTINCT(col3))) AS col1_col1
      	   FROM have
           GROUP BY col1
           ORDER BY col1_col1 DESC;
QUIT;

 

the reason why I do outobs = 1 because I have ranked the observations in descending order and I am only interested in the highest value if it's equal 1 then the variable is unique

 

ballardw
Super User

@mazouz wrote:

no I will not always have 1 as a result try this code and you will see.

 

data have;
input col1 $ col2 col3 $ col4 col5;
cards;

Jac 10 s 1 0

Joe 31 s 1 0

Joe 22 l 2 0

Roe 33 l 1 1

Rex 44 s 1 0

Tim 24 s 1 0

Joe 31 s 2 0

Joe 22 l 1 1

Roe 33 l 2 1

Rex 44 s 1 1

Tim 24 s 1 1

;
run;
PROC SQL outobs=1;
    CREATE TABLE output AS 
           SELECT 
	 	   (COUNT(DISTINCT(col3))) AS col1_col1
      	   FROM have
           GROUP BY col1
           ORDER BY col1_col1 DESC;
QUIT;

 

the reason why I do outobs = 1 because I have ranked the observations in descending order and I am only interested in the highest value if it's equal 1 then the variable is unique

 


You changed the SQL code. Previously you had INOBS=1. So without any further actual worked example of the process I am done here.

mazouz
Calcite | Level 5

it's mistake sorry

hackathon24-white-horiz.png

Join the 2025 SAS Hackathon!

Calling all data scientists and open-source enthusiasts! Want to solve real problems that impact your company or the world? Register to hack by August 31st!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 2058 views
  • 0 likes
  • 2 in conversation