BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
r4321
Pyrite | Level 9

Hello everyone, first time poster, and I would really appreciate your help. Hopefully in laymens terms, since Im not super advanced with SAS. Though I use it quite a bit, I have a hard time if the code is not narrated. 

 

 

So I have two variables.... and I want to compare the number combinations (separated by a comma) in column 1 with the number combinations (separated by a comma) in column 2. Every time my 'code new' variable has a number combination that is not in 'code old,' i want to have a count created as a separate variable. Im hoping that the count variable tabulates the number of codes (if any) in the 'code new' variable that are not in the 'code old' variable. Furthermore, each observation may have differing amounts of number cominations to be compared. Let me illustrate with an example 

 

 

Code Old  (column 1)                              Code New (column 2)                                   Count

6222,6233,7645,6555                             6233,6555,6222,7645                                     0

7222, 7212, 7233, 7244,7555                 7244,7233,7222                                              0

5444, 5441                                              7244,5444                                                       1

 

 

So while I dont care if a code is in the 'code old' set, but not found in the 'code new' set. I DO want it to count if a number is in the 

code new set that is not in the code old set... 

 

I hope this sounds clear. While I feel like I could easily do it in excel, Im really struggling with it in SAS. And the dataset Im coding is

too big for excel. 

 

Thanks for your help in advance! Advice is appreciated. 

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Your problem is a simple one, made harder by the choice in structure.  Now you can achieve the result, as below, however if you do further processing on those codes it is far from ideal to code with that strcuture.  SAS is built so that each variable contains one data item, not a list.  So separating out the codes would be far better.  A second suggestion I would make is to normalise the structure, as given in example 2:

data have (drop=i);
  code_old="6222,6233,7645,6555"; code_new="6233,6555,6222,7645"; 
  do i=1 to countw(code_new,",");
    if findw(code_old,scan(code_new,i,","),",") > 0 then count=sum(count,1);
  end;
run; 

Example 2 (a bit more work than for your given example, but for futher processing it would save you time):

data have;
  input code_old $ code_new $;
datalines;
6222 6233
6233 6555
7645 6222
6555 7645
;
run;
proc sql;
  create table WANT as
  select  count(CODE_NEW) as COUNT
  from    HAVE
  where   CODE_NEW in (select CODE_OLD from HAVE);
quit;

View solution in original post

5 REPLIES 5
LinusH
Tourmaline | Level 20

Is this some kind of training/homework? Cause I think I've a quite similar question the other week. Perhaps you could search around the communities for that thread.

 

In short, i see two ways:

  1. Use do-loops, scan the strings for individual codes, and build your comparison logic from there.
  2. A more robust data management way is make sure that your observations/rows has a unique id. Then separate new and old into separate data sets, and use SQL to figure out there's any new codes for your id.
Data never sleeps
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Your problem is a simple one, made harder by the choice in structure.  Now you can achieve the result, as below, however if you do further processing on those codes it is far from ideal to code with that strcuture.  SAS is built so that each variable contains one data item, not a list.  So separating out the codes would be far better.  A second suggestion I would make is to normalise the structure, as given in example 2:

data have (drop=i);
  code_old="6222,6233,7645,6555"; code_new="6233,6555,6222,7645"; 
  do i=1 to countw(code_new,",");
    if findw(code_old,scan(code_new,i,","),",") > 0 then count=sum(count,1);
  end;
run; 

Example 2 (a bit more work than for your given example, but for futher processing it would save you time):

data have;
  input code_old $ code_new $;
datalines;
6222 6233
6233 6555
7645 6222
6555 7645
;
run;
proc sql;
  create table WANT as
  select  count(CODE_NEW) as COUNT
  from    HAVE
  where   CODE_NEW in (select CODE_OLD from HAVE);
quit;
r4321
Pyrite | Level 9

Hello everyone, thank you for your quick replies. It is for a research project Im working on and I am still struggling with this, however.

 

So I do have the data codes in individual columns, like so                

 

oldcode1     oldcode2    oldcode3  oldcode4  newcode1 newcode2 etc

 

 

However, I thought it may have been easier to concatenate them into a single column, this may be wrong because in certain instances I end up cells that have values like:    5666,  ,6574,6979,  ,7878                       ........ etc. 

 

Anyhow each observation (or row) in my dataset DOES have a unique identifier and I need to create a count variable based on comparing the new codes with the old codes, and how many of the new codes are different from the old codes. 

 

I tried the proc sql command that one of you was nice enough to provide and it simply returned a small table with a count and no identifiers, etc. That is not a format I can use, unfortunately. I need the count to be assigned to the unique observation. 

 

Again, Im still a SAS beginner so any narration you can provide is very much appreciated. 

Thanks again to everyone who has replied. 

Reeza
Super User

@r4321 wrote:

 

Again, Im still a SAS beginner so any narration you can provide is very much appreciated. 

 


 

First thing to learn - how to narrate your problem.

 

Please take the time to provide some detailed information. 

This includes what your data looks like, what you're trying to achieve as an end goal and what you've tried so we don't suggest things that you've already tried and don't work for you. Posting explicit sample input and output are a great starting point. I guarantee you'll get answers much faster as well. 

 

I like the tips here, on how to ask a good question. 

http://stackoverflow.com/help/how-to-ask

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1619 views
  • 1 like
  • 4 in conversation