BookmarkSubscribeRSS Feed
krazee_koder
Calcite | Level 5

Hi all-

 

I could really use some help with Base SAS programming.

 

I have a dataset with a unique identifier called clt_id. There are also 50 variables var1-var50 that have unique identifiers for property ownership.

 

So I have 51 collums and I would like to create an identifier that links the clt_id's together when there is a match between any of the 50 property identifiers.

 

So the collums look like this

clt_id   var1 ... var50

 

and I would like to have:

clt_id var1 ... var50 varm

where varm would be the same for two clt_ids if any var from one clt_id matched with any other var under another clt_id. There are a maximum of 2 occurances for each var so varm should only contain pairs or unique values.

 

I would like the identifier to be a concatanation of the two relevant clt_ids.

 

ie.

if first.clt_id > second.clt_id then ident = cats(of second.clt_id first.clt_id);
if first.clt_id < second.clt_id then ident = cats(of first.clt_id second.clt_id);

 

I tried stacking the collums into one called var, sort ascending var clt_id , then using the property ownership identifier on second.var as the identifier.

 

Then i sorted descending and did the same to apply the ownership identifier to the other match. This should work since I am only interested in matching in twos.

 

Heres what I have so far:

%macro stack();
	data stacked;
	set
	%do i=1 %to 50;
		have (keep=var&i clt_id rename=(var&i=var))
	%end;
;
	run;
%mend stack;
%stack();

proc sort data=stacked;
	by ascending var clt_id;
run;

data stacked;
	do _n_=1 by 1 until(last.var);
	set stacked;
	by var;
	if second.var then varm=var;
	end;
run;

data sort data=stacked;
	by descending var clt_id;
run;

data stacked;
	do _n_=1 by 1 until(last.var);
	set stacked;
	by var;
	if second.var then dinm=var;
run;

 

 I received the following error on my last data step:

 

ERROR 650-185: DATA STEP Component Object failure.
               Aborted during the COMPILATION phase.
ERROR 557-185: Variable second is not an object.

 

I thought I could refer to first and second when using by after the set. Anyone know why I am getting this error?

 

Thanks so much,

Krazee

4 REPLIES 4
novinosrin
Tourmaline | Level 20

Hi, What is second.var? That seems interesting. Has SAS introduced an automatic variable second.var recently?

Do you mind charting out a sample table of your HAVE dataset and a WANT dataset with a few records to help me visualise better please. Thank you!

Astounding
PROC Star

Some of the rules before this gets too krazy ...

 

There is no such thing as "second." in SAS.  You have "first." and "last." and that's it.

 

To be allowed to use "first." and "last." there must be a BY statement earlier in the DATA step, using the same variable name.  So if your BY statement reads "by din;" then you have first.din and last.din to work with, but there is no last.var to work with.

 

To be allowed to use a BY statement, your data must be sorted by the variable(s) in the BY statement.  There are exceptions to this if you specify the word NOTSORTED, but save that for another day.

 

If you want help fixing your application, that might be possible.  I would advise skipping macro language entirely, however, until you get a working version of your program.  Then you can begin to think about how to add macro language to generate the same working program.

Shmuel
Garnet | Level 18

It is pitty you havn't post the full log. It is difficult to know which line made the ERROR without the full log.

 

You cannot use SECOND.VAR - what did you mean by it?

You can use first.var meaning first row of a value when sorted by var.

You can get the previous row value of the variable VAR using LAG function. (like: prev = lag(var); )

 

I cannot guess what you meant by 

data stacked;
	do _n_=1 by 1 until(last.var);
	set stacked;
	by var;
	if second.var then dinm=var;
run;

What is DINM ?

Where do you check that vars match?

If you have both CLT_IDs in memory and they are numeric you can conctenate them by:

varm = min(clt_id1, clt_id2) || max(clt_id1, clt_id2);

and no need to sort twice (ascending and descending).

 

 

 

ballardw
Super User

Here's an idea.

Provide some example data. You need not provide all 50 variables but 3 or 4 should be sufficient to show the procees.

Then show what the final result should look like for that data.

 

I have a feeling that you are making this way more complicated than it should be.

 

NOTE that First. and Last. refer to the ROW order in the data set. Also the value of First. is either 1 or 0 for true (the record is the first fo the specified group) or false (the record is not the first of the group).

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 845 views
  • 0 likes
  • 5 in conversation