Desktop productivity for business analysts and programmers

compare arrays and list missing values between arrays

Reply
Contributor
Posts: 44

compare arrays and list missing values between arrays

Hello All,

I am trying to create a list where a certain string is deleted from a string list.

for example: the last row of the following table, "want" column, notes the next observation as 2009, as variable year begins with 2008. In other words, the first variable mentioned in year acts as the starting point and every consequent value missing from year when compared to year_list is noted in want. The end point is the last year mentioned in year_list.

The final result should like this:

year year_list want
2008 2005,2006,2007,2008,2009,2010,2011 2010,2011
2005,2007 2005,2006,2007,2008,2009,2010,2011 2006,2008,2009,2010,2012
2008,2011 2005,2006,2007,2008,2009,2010,2011 2009,2010

I managed to create an array splitting these comma separated values into different variables. however, I am not able to do the comparison, especailly with the starting point.

Please find below my code:

proc sql;
create table have (year VARCHAR(100), year_list VARCHAR(100));
insert into have values('2005','2005,2006,2007,2008,2009,2010,2011'​);
insert into have values('2005,2007','2005,2006,2007,2008,2009,2010,​2011');
insert into have values('2007,2011','2005,2006,2007,2008,2009,2010,​2011');
run;

data tmp1;
set have;
csv=countw(year,",");
codgen_csv=countw(year_list,",");
run;


proc sql ;
select distinct cats("mart",max(csv)), max(csv) into :arraylist,:arraycnt from tmp1;
select distinct cats("dlv",max(codgen_csv)), max(codgen_csv) into :dlv_array, :dlvcnt from tmp1;
run;

data want (drop=list);
set tmp1;
length mart1-&arraylist. $100.;
length dlv1-&dlv_array. $100.;
array mart(&arraycnt.) $;
array dlv(&dlvcnt.) $;
do _i=1 to dim(mart);
mart[_i]=scan(year,_i,',');
end;
do _j=1 to dim(dlv);
dlv[_j]=scan(year_list,_j,',');
end;
run;

Please do let me know.

 

Many thanks.

Super User
Posts: 11,134

Re: compare arrays and list missing values between arrays

Are we supposed to be able to get to your displayed "want" from the provided "have"? The values for the Year variable don't seem to match. Your first example want has Year of 2008 but the example data has 2005, similar for the third row, Want Year is 2008,2001 but "have" is 2005, 2011.

 

And why would 2009 be excluded from your want variable for the first row?

 

Then you create a list of variables from a data set you don't provide. Is this you you actually build data similar to the example Have?

It may be that you want to clarify how you are using this as the bit of building an awkward list to process in a decidely odd manner could be accomplished without those "list" variables.

 

Also, it may be a good idea to post code using the "Run" icon. the other seems to have inserted some non-printable characters that make your create "have" data to fail when posted into a SAS editor.

Contributor
Posts: 44

Re: compare arrays and list missing values between arrays

Hello Ballardw,

Apologies for the confusion earlier.

I have reviewed the "WANT" table and reinserted the script using the run button.

THe table that i want should be as follows:

year

year_list

want

2005M

2005M,2006,2007M,2008,2009,2010,2011M

2006,2007M,2008,2009,2010,2011M

2005M,2007M

2005M,2006,2007M,2008,2009,2010,2011M

2006,2008,2009,2010,2011M

2007M,2011M

2005M,2006,2007M,2008,2009,2010,2011M

2008,2009,2010

The reason for adopting the array approach as in the code is as follows:

1) The year and year_list is in character format and comma separated. the year and year_list values are comma separated and sorted.

2) Variable "want" lists all variables from "year_list2 that are not present in variable "year", however ensuring that the starting point is determined by the first value from variable "year" in "year_list" - as seen in the last row. 

proc sql;
create table have (year VARCHAR(100), year_list VARCHAR(100));
insert into have values('2005M','2005M,2006,2007M,2008,2009,2010,2011M');
insert into have values('2005M,2007M','2005M,2006,2007M,2008,2009,2010,2011M');
insert into have values('2007M,2011M','2005M,2006,2007M,2008,2009,2010,2011M');
run;

data tmp1;
set have;
csv=countw(year,",");
codgen_csv=countw(year_list,",");
run;


proc sql ;
select distinct cats("mart",max(csv)), max(csv) into :arraylist,:arraycnt from tmp1;
select distinct cats("dlv",max(codgen_csv)), max(codgen_csv) into :dlv_array, :dlvcnt from tmp1;
run;

data want (drop=list);
set tmp1;
length mart1-&arraylist. $100.;
length dlv1-&dlv_array. $100.;
array mart(&arraycnt.) $;
array dlv(&dlvcnt.) $;
do _i=1 to dim(mart);
mart[_i]=scan(year,_i,',');
end;
do _j=1 to dim(dlv);
dlv[_j]=scan(year_list,_j,',');
end;
run;

 

Possibly this approach is not the best. If you have a better approach/solution in mind, please do let me know.

Thanking you.

Ask a Question
Discussion stats
  • 2 replies
  • 244 views
  • 0 likes
  • 2 in conversation