BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rajdeep
Pyrite | Level 9

Hi Team,

 

I am trying a code to execute with the below code, but the ACCOUNT_KEY column values getting plotted randomly from the ACCOUNT_DIM table to the output table.

 

Please explain how to freeze ACCOUNT_KEY values against the ACCOUNT_NUMBER column, because every time I run the below step, the ACCOUNT_KEY values are getting changed in the output table for the same Account_numbers.

 

DATA TEST(KEEP= FTF_IND ACCOUNT_NUMBER ACCOUNT_KEY APP_ID TABLET_ID IMAGE_ID ACQUISITIONTYPE);
SET ACCOUNT_DIM(IN=A KEEP= ACCOUNT_NUMBER ACCOUNT_KEY CHANGE_CURRENT_IND ACCOUNT_REGISTRATION_TYPE_DESC where=(CHANGE_CURRENT_IND='Y'));
SET ACQTYPE(IN=B RENAME=(MAST_ACCOUNT_ID=ACCOUNT_NUMBER));
BY ACCOUNT_NUMBER;
IF A AND B THEN OUTPUT;
RUN;

Please help if possible. 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@rajdeep wrote:

Hi KurtBremser,

 
Thanks a lot for your quick response. I also found the same idea that, it's giving the expected result with Merge, but can you please explain why it's random with multiple SET statement and if suppose we want to make it happen with SET statement then can we achieve the expected output.
 
Thanks in advance. 

Not random just complicated. As I said, multiple set statements without experience are difficult to work with. If you do not control the order of data then it can get more unpredictable.

View solution in original post

11 REPLIES 11
PaigeMiller
Diamond | Level 26

You talk about plots, but you don't show us the plot code or the plot output. It would also help if you showed us a portion of this data.

--
Paige Miller
ballardw
Super User

Really need to provide a small example of data from both sets and what you expect the result of the data step to be for that example.

 

Multiple set statements seldom do what you expect when using them without a LOT of experience.

 

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.

Tom
Super User Tom
Super User

The BY and IF statement are doing nothing in that data step (other than possible causing SAS to use an index when accessing the second dataset).

 

What is it you are trying to do?  Why not just merge them BY the variable?

Astounding
PROC Star

I suspect you are trying to match by account_number.  To do that, the two SET statements need to be replaced.  This is probably the correct replacement:

 

merge ACCOUNT_DIM (IN=A KEEP= ACCOUNT_NUMBER ACCOUNT_KEY CHANGE_CURRENT_IND ACCOUNT_REGISTRATION_TYPE_DESC where=(CHANGE_CURRENT_IND='Y'))
      ACQTYPE     (IN=B RENAME=(MAST_ACCOUNT_ID=ACCOUNT_NUMBER));

If that's incorrect, you will need to explain more about what the outcome should be.

RichardDeVen
Barite | Level 11
Is the data master/detail ? In other words, for each account_number in ACCOUNT_DIM do you have multiple associated records in ACQTYPE ?
rajdeep
Pyrite | Level 9

Hi Richard,

 

Please find the sample test data.

data ACCOUNT_DIM;
INPUT ACCOUNT_KEY 1-7 ACCOUNT_NUMBER $9-14 ACCOUNT_REGISTRATION_TYPE_DESC $ CHANGE_CURRENT_IND $1.;
datalines;
1067347	234057 FTF N
1035020	194500 FTF Y
1067350	234060 FTF Y
1067351	234061 FTF Y
1067352	234062 FTF Y
;
run;

data ACQTYPE;
INPUT ACCOUNT_NUMBER $1-6 ACQUISITIONTYPE $7-11 FTF_IND $13-16 APP_ID 18-23  TABLET_ID $25-26 IMAGE_ID $28-29;
datalines;
234061 FTFT FTF  268074    
234062 FTFT FTF  268074    
111111 NFTF NFTF 268074    
111123 NFTF NFTF 268074    
111134 NFTF NFTF 268074    
;
run;
proc sort data=ACQTYPE;
by account_number;
run;


DATA TEST(KEEP= FTF_IND ACCOUNT_NUMBER ACCOUNT_KEY APP_ID TABLET_ID IMAGE_ID ACQUISITIONTYPE);
set ACCOUNT_DIM(IN=A KEEP= ACCOUNT_NUMBER ACCOUNT_KEY CHANGE_CURRENT_IND ACCOUNT_REGISTRATION_TYPE_DESC where=(CHANGE_CURRENT_IND='Y')) ;
 set ACQTYPE(IN=B );
BY ACCOUNT_NUMBER;
IF A AND B THEN OUTPUT;
RUN;

I just want the 1067351 & 1067352 ACCOUNT_KEYs should be mapped against the 234061 & 234062 ACCOUNT_NUMBERs in the output dataset i.e TEST. If you try executing again and again you can see the random ACCOUNT_KEYs is coming ahead of 234061 & 234062 ACCOUNT_NUMBERS.

 

Thanks for your quick reply. please let me know if there's any disconnect.

 

@

PaigeMiller extremely sorry for the plot word confusion, I meant mapping actually.
@
ballardw Thanks for your reply. you are correct, experience matters a lot. I am still learning actually.
 
 
 
Kurt_Bremser
Super User

So it should be a simple MERGE by account_number?

data ACCOUNT_DIM;
input
  ACCOUNT_KEY 1-7
  ACCOUNT_NUMBER $9-14
  ACCOUNT_REGISTRATION_TYPE_DESC $
  CHANGE_CURRENT_IND $1.
;
datalines;
1067347 234057 FTF N
1035020 194500 FTF Y
1067350 234060 FTF Y
1067351 234061 FTF Y
1067352 234062 FTF Y
;

data ACQTYPE;
input
  ACCOUNT_NUMBER $1-6
  ACQUISITIONTYPE $7-11
  FTF_IND $13-16
  APP_ID 18-23
  TABLET_ID $25-26
  IMAGE_ID $28-29
;
datalines;
234061 FTFT FTF  268074    
234062 FTFT FTF  268074    
111111 NFTF NFTF 268074    
111123 NFTF NFTF 268074    
111134 NFTF NFTF 268074    
;

proc sort data=ACQTYPE;
by account_number;
run;

data test (
  keep=
    FTF_IND ACCOUNT_NUMBER ACCOUNT_KEY APP_ID
    TABLET_ID IMAGE_ID ACQUISITIONTYPE
);
merge
  ACCOUNT_DIM (
    in=a
    keep=
      ACCOUNT_NUMBER ACCOUNT_KEY CHANGE_CURRENT_IND
      ACCOUNT_REGISTRATION_TYPE_DESC
    where=(CHANGE_CURRENT_IND='Y')
  )
  ACQTYPE (in=b)
;
by account_number;
if a and b;
run;
rajdeep
Pyrite | Level 9

Hi KurtBremser,

 
Thanks a lot for your quick response. I also found the same idea that, it's giving the expected result with Merge, but can you please explain why it's random with multiple SET statement and if suppose we want to make it happen with SET statement then can we achieve the expected output.
 
Thanks in advance. 
Tom
Super User Tom
Super User

@rajdeep wrote:

Hi KurtBremser,

 
Thanks a lot for your quick response. I also found the same idea that, it's giving the expected result with Merge, but can you please explain why it's random with multiple SET statement and if suppose we want to make it happen with SET statement then can we achieve the expected output.
 
Thanks in advance. 

There is nothing random about it (unless one of the actual datasets is coming from a database that does not provide the observations in the same order each time).  When you SET a dataset it reads the next observation from that dataset. If no more observations are available the data step stops.   So in each iteration your data step will read one observation from A and then one observation from B.  If there are variables in B that were also in A then the values from B will overwrite the values read from A.  The number of observations created will be the minimum of the number in A and the number in B.

 

If you want to match the observations on ACCOUNT_NUMBER then list them both on a MERGE statement.

 

If you want to interleave the observations by ACCOUNT_NUMBER then list them both on one SET statement. Then you will get sum of the number of observations. Any variables in A that are not in B will have missing values on the observations that are read from B (and the reverse).  If the same ACCOUNT_NUMBER appears in both A and B then both will stay as separate observations with the observations from A appearing before the observations from B.

ballardw
Super User

@rajdeep wrote:

Hi KurtBremser,

 
Thanks a lot for your quick response. I also found the same idea that, it's giving the expected result with Merge, but can you please explain why it's random with multiple SET statement and if suppose we want to make it happen with SET statement then can we achieve the expected output.
 
Thanks in advance. 

Not random just complicated. As I said, multiple set statements without experience are difficult to work with. If you do not control the order of data then it can get more unpredictable.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 1627 views
  • 1 like
  • 7 in conversation