Solved: Re: Proper use of POINT option in my case

Doug____ · Posted 04-22-2020 09:27 AM

I two datasets, one I will call "Main" which contains the data to be updated and the transaction dataset. The update method is not the usual way in that the transaction dataset can update the main record multiple times. This is because the value of the updated variable can change over time but must be linked back to the original value. Example follows below:

Main

value

24

Transaction

Value1 Value2

24 32 <- first change to this value

32 44<<-this is the correct final value as transactions happen at two time points. Each line in the transaction set is unique.

Basically I need to hold the pointer in MAIN at record 1 while I read all the records in TRANSACTION, applying the updates as I go for as many times as needed.

How can I use the POINT = option to achieve this result?

Astounding · Posted 04-22-2020 11:23 AM

A suggested approach:

Create a temporary array with 999999999 elements (one for every conceivable SSN). Use the Transaction data set to load into the array all the changes.

If SSN 123456789 should map to 234567890, then the 123456789th element should have a value of 234567890.

The programming should be relatively simple, provided that you actually have the memory available to create the array.

View solution in original post

RichardDeVen · Posted 04-22-2020 09:48 AM

Have you written any code ?
Are you also using MODIFY ?
Does your Main have a unique index ? If so, have you tried MODIFY / INDEX= ?

Doug____ · Posted 04-22-2020 09:57 AM

The only unique value is SSN. The value to be updated can change multiple times but this method allows me to get to the last value. The system is based on heterogenous joins which is one update per single record in the master dataset it not correct. It is a many to many operation whereby the updated value has value x today, value y tomorrow and z the day after. But the system does not provide information for the path from X straight to Z is only has information for X - Y then Y - Z. Difficult to explain but that's how the system is designed.

Tom · Posted 04-22-2020 09:51 AM

I don't think you have provided enough information to understand what you are doing.

The UPDATE statement used in a DATA step (not the SQL update statement) can apply multiple transactions to the same master record. Do you have a transaction dataset in the sense of the UPDATE statement?

Or are you asking how to change the KEY variable(s) values?

Kurt_Bremser · Posted 04-22-2020 10:31 AM

Do you only have values, and no keys?

A multiple transaction is handled well by data step update, see here:

input id $ value;
datalines;
A 1
B 2
;

data trans;
input id $ value;
datalines;
A 3
A 4
B 5
;

data main;
update
  main
  trans
;
by id;
run;

The last value for A will end up in the main dataset.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Doug____ · Posted 04-22-2020 10:59 AM

If there were only one observation per SSN in the master dataset this would work. But there are many so warnings are generated. Not sure the update method will accomplish this which takes us back to my original question. Reading in each record from the master sequentially then applying the updates using the POINT option in some fashion.

Kurt_Bremser · Posted 04-22-2020 11:07 AM

And how do you identify the "main" observation that you need to apply a specific transaction to?

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Doug____ · Posted 04-22-2020 11:49 AM

This is where sequential access comes into play - by using the observation number in the main table - the code should sit on that observation while is rolls through the transaction dataset applying the changes as it goes. Once it applies all the changes - move to the next record and repeat and so forth until it encounters the end of the main dataset. POINT would indicate the observation number in main.

Kurt_Bremser · Posted 04-22-2020 11:54 AM

This is too complicated to describe in words. Please post examples for the main and transaction datasets (in data steps with datalines like I did), and the expected result from both.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Astounding · Posted 04-22-2020 11:23 AM

A suggested approach:

Create a temporary array with 999999999 elements (one for every conceivable SSN). Use the Transaction data set to load into the array all the changes.

If SSN 123456789 should map to 234567890, then the 123456789th element should have a value of 234567890.

The programming should be relatively simple, provided that you actually have the memory available to create the array.

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away