matching

Contributor
Posts: 61

Hi,

I am using PROC OPTNET to match my cases with controls (held in two independent datasets); please see the code below. This works in the Work library; however, when I try the same thing with my permanent folder/libref, it fails. The log says: Work.Links.Data not available. I have the file as pe.Links in my permanent library. How can I specify a libref with PROC OPTNET? Thanks

proc optnet data_links=links graph_direction=directed;
data_links_var from=sampleNode to=controlNode weight=distance;
linear_assignment out=bestMatches;
run;

proc sql;
select 
    floor(sampleNode) as sampleId,
    controlNode as controlId
from bestMatches;
quit;

 

Trusted Advisor
Posts: 1,115

Re: matching

Hi @wajmsu,

 

So, you have a file links.sas7bdat in your permanent folder and the libref pe has been assigned to this folder?

Have you tried proc optnet data_links=pe.links ...?
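
A minimal sketch of that suggestion, assuming the libref pe points to the folder that contains links.sas7bdat. The path in the LIBNAME statement is a placeholder, not taken from the original post. A one-level name such as links is normally looked up in WORK, which would explain the "Work.Links.Data not available" message; a two-level name tells SAS which library to use.

```sas
/* Hypothetical path: replace with the folder that holds links.sas7bdat */
libname pe "/path/to/permanent/folder";

/* Two-level names (libref.dataset) for both the input and the output */
proc optnet data_links=pe.links graph_direction=directed;
    data_links_var from=sampleNode to=controlNode weight=distance;
    linear_assignment out=pe.bestMatches;
run;
```

Any later step that reads the result then also needs the two-level name, e.g. from pe.bestMatches in the PROC SQL step.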

Contributor
Posts: 61

Re: matching

Hi FreelanceReinhard,

Thanks, that worked in the sense that the log no longer shows that error. However, the next problem concerns the 'from' and 'to' variables, as shown below:

 

NOTE: ----------------------------------------------------------------------------------------------
ERROR: The FROM and TO variables in the DATA_LINKS= data set must have the same type.
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set PE.BESTMATCHES may be incomplete.  When this step was stopped there were 0
         observations and 0 variables.
WARNING: Data set PE.BESTMATCHES was not replaced because this step was stopped.
NOTE: PROCEDURE OPTNET used (Total process time):
      real time           0.23 seconds
      cpu time            0.21 seconds

36


37   proc sql;
38   select
39       floor(sampleNode) as sampleId,
40       controlNode as controlId
41   from pe.bestMatches;
ERROR: Table PE.BESTMATCHES doesn't have any columns. PROC SQL requires each of its tables to have
       at least 1 column.
ERROR: Function FLOOR requires a numeric expression as argument 1.
ERROR: The following columns were not found in the contributing tables: controlNode, sampleNode.
42   quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

Trusted Advisor
Posts: 1,115

Re: matching

I think the error message is fairly explicit: "The FROM and TO variables in the DATA_LINKS= data set must have the same type."

This makes sense, because both should contain elements of the same set: the set of nodes of a graph. These can be labeled with character strings (e.g. 'A', 'B', 'C', ...) or numbers (e.g. 1, 2, 3, ...). 

 

It appears that you specified a character variable and a numeric variable.
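
One way to confirm this is to look up the types of the two variables before running PROC OPTNET. The sketch below assumes the data set is pe.links, as mentioned earlier in the thread; DICTIONARY.COLUMNS stores libref and member names in uppercase.

```sas
/* List type, length, format and informat of the FROM and TO variables */
proc sql;
    select name, type, length, format, informat
    from dictionary.columns
    where libname = 'PE'
      and memname = 'LINKS'
      and upcase(name) in ('SAMPLENODE', 'CONTROLNODE');
quit;
```

If the type column shows char for one variable and num for the other, that is the mismatch the error message refers to.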

Contributor
Posts: 61

Re: matching

You are right; controlNode is character, with format and informat=$4.

sampleNode is numeric, with format=10.2 and informat=10.

 

controlNode has 4 digits and sampleNode looks like 0000.00. I want to convert controlNode to numeric. Is that OK? Please guide me: what should the format and informat be for controlNode (changing from character to numeric)?

Thanks

Trusted Advisor
Posts: 1,115

Re: matching

I'm not familiar with PROC OPTNET, but I assume that what matters most for the FROM and TO variables is the value of each variable, not its format or informat.

 

This being the case, it seems a bit odd to me that the values of sampleNode and controlNode look so different. Format 10.2 would make sense only if sampleNode contained values with non-zero decimal places, such as 1234.56. Obviously, numbers like this would not fit into a character variable of length 4. In your PROC SQL step you applied the FLOOR function to sampleNode, which again suggests that there are decimal places.

 

Of course, you can convert controlNode to a numeric variable:

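data links_new;
    set links;
    numcn = input(controlNode, 4.);  /* read the character value as a number */
    drop controlNode;                /* discard the original character variable */
    rename numcn = controlNode;      /* give the numeric variable the old name */
run;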

However, you should make sure that the numeric values in variable sampleNode are in fact integers. Please note that values of numeric variables in SAS sometimes look like integers (e.g. 3), but a close examination reveals that their true value is something like 3.0000000000000004. To avoid this, you could add a statement like sampleNode=round(sampleNode) or sampleNode=floor(sampleNode) to the above data step, but these are not equivalent. You have to know your data and whether non-integers should be rounded or truncated to integers in order to match controlNode values.
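
Before choosing between ROUND and FLOOR, it may help to count how many sampleNode values are not exact integers. This is only a sketch, assuming the data set is still called links as in the DATA step above; the column aliases n_rows and n_noninteger are made up for illustration.

```sas
/* Count non-missing rows and those whose sampleNode differs from its
   nearest integer (a comparison in PROC SQL evaluates to 1 or 0) */
proc sql;
    select count(*) as n_rows,
           sum(sampleNode ne round(sampleNode)) as n_noninteger
    from links
    where sampleNode is not missing;
quit;
```

If n_noninteger is greater than zero, the decimal part probably carries meaning in your data, and you would need to decide deliberately what to do with it rather than round it away.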

Contributor
Posts: 61

Re: matching

I have added a format for controlNode in the SAS code, as shown below. This format was not present in the SAS code previously. However, it does not change the variable's type in the data file, although the log says: 'converted from character to numeric'. When I run the PROC OPTNET step, the error message still appears.

Thanks for your time and support!

 
