Hello everyone, first time poster, and I would really appreciate your help. Hopefully in laymens terms, since Im not super advanced with SAS. Though I use it quite a bit, I have a hard time if the code is not narrated.
So I have two variables.... and I want to compare the number combinations (separated by a comma) in column 1 with the number combinations (separated by a comma) in column 2. Every time my 'code new' variable has a number combination that is not in 'code old,' i want to have a count created as a separate variable. Im hoping that the count variable tabulates the number of codes (if any) in the 'code new' variable that are not in the 'code old' variable. Furthermore, each observation may have differing amounts of number cominations to be compared. Let me illustrate with an example
Code Old (column 1) Code New (column 2) Count
6222,6233,7645,6555 6233,6555,6222,7645 0
7222, 7212, 7233, 7244,7555 7244,7233,7222 0
5444, 5441 7244,5444 1
So while I dont care if a code is in the 'code old' set, but not found in the 'code new' set. I DO want it to count if a number is in the
code new set that is not in the code old set...
I hope this sounds clear. While I feel like I could easily do it in excel, Im really struggling with it in SAS. And the dataset Im coding is
too big for excel.
Thanks for your help in advance! Advice is appreciated.
Your problem is a simple one, made harder by the choice in structure. Now you can achieve the result, as below, however if you do further processing on those codes it is far from ideal to code with that strcuture. SAS is built so that each variable contains one data item, not a list. So separating out the codes would be far better. A second suggestion I would make is to normalise the structure, as given in example 2:
data have (drop=i); code_old="6222,6233,7645,6555"; code_new="6233,6555,6222,7645"; do i=1 to countw(code_new,","); if findw(code_old,scan(code_new,i,","),",") > 0 then count=sum(count,1); end; run;
Example 2 (a bit more work than for your given example, but for futher processing it would save you time):
data have; input code_old $ code_new $; datalines; 6222 6233 6233 6555 7645 6222 6555 7645 ; run; proc sql; create table WANT as select count(CODE_NEW) as COUNT from HAVE where CODE_NEW in (select CODE_OLD from HAVE); quit;
Is this some kind of training/homework? Cause I think I've a quite similar question the other week. Perhaps you could search around the communities for that thread.
In short, i see two ways:
Your problem is a simple one, made harder by the choice in structure. Now you can achieve the result, as below, however if you do further processing on those codes it is far from ideal to code with that strcuture. SAS is built so that each variable contains one data item, not a list. So separating out the codes would be far better. A second suggestion I would make is to normalise the structure, as given in example 2:
data have (drop=i); code_old="6222,6233,7645,6555"; code_new="6233,6555,6222,7645"; do i=1 to countw(code_new,","); if findw(code_old,scan(code_new,i,","),",") > 0 then count=sum(count,1); end; run;
Example 2 (a bit more work than for your given example, but for futher processing it would save you time):
data have; input code_old $ code_new $; datalines; 6222 6233 6233 6555 7645 6222 6555 7645 ; run; proc sql; create table WANT as select count(CODE_NEW) as COUNT from HAVE where CODE_NEW in (select CODE_OLD from HAVE); quit;
See the questions from @sebster24 recently, they align quite closely with yours.
Hello everyone, thank you for your quick replies. It is for a research project Im working on and I am still struggling with this, however.
So I do have the data codes in individual columns, like so
oldcode1 oldcode2 oldcode3 oldcode4 newcode1 newcode2 etc
However, I thought it may have been easier to concatenate them into a single column, this may be wrong because in certain instances I end up cells that have values like: 5666, ,6574,6979, ,7878 ........ etc.
Anyhow each observation (or row) in my dataset DOES have a unique identifier and I need to create a count variable based on comparing the new codes with the old codes, and how many of the new codes are different from the old codes.
I tried the proc sql command that one of you was nice enough to provide and it simply returned a small table with a count and no identifiers, etc. That is not a format I can use, unfortunately. I need the count to be assigned to the unique observation.
Again, Im still a SAS beginner so any narration you can provide is very much appreciated.
Thanks again to everyone who has replied.
@r4321 wrote:
Again, Im still a SAS beginner so any narration you can provide is very much appreciated.
First thing to learn - how to narrate your problem.
Please take the time to provide some detailed information.
This includes what your data looks like, what you're trying to achieve as an end goal and what you've tried so we don't suggest things that you've already tried and don't work for you. Posting explicit sample input and output are a great starting point. I guarantee you'll get answers much faster as well.
I like the tips here, on how to ask a good question.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.