BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sebster24
Quartz | Level 8

Hello All,

I am trying to create a list where a certain string is deleted from a string list.

for example: considering the third row in the table below where year=2007M,2011M, the value 2007M,2011M should be deleted from year_list and with end result displayed in variable "want" being "2008,2009,2010". for this same example, the start should be 2007M, as that is the first observation in variable "year".

THe FINAL table that i want should be as follows:

year

year_list

want

2005M

2005M,2006,2007M,2008,2009,2010,2011M

2006,2007M,2008,2009,2010,2011M

2005M,2007M

2005M,2006,2007M,2008,2009,2010,2011M

2006,2008,2009,2010,2011M

2007M,2011M

2005M,2006,2007M,2008,2009,2010,2011M

2008,2009,2010

I tried arrays to solve this problem, but not able to get anywhere close to the first step.

I list below my code:

proc sql;
create table have (year VARCHAR(100), year_list VARCHAR(100));
insert into have values('2005M','2005M,2006,2007M,2008,2009,2010,2011M');
insert into have values('2005M,2007M','2005M,2006,2007M,2008,2009,2010,2011M');
insert into have values('2007M,2011M','2005M,2006,2007M,2008,2009,2010,2011M');
run;

data trial (keep=x);
set have(obs=1);
cnt=countw(year_list,",");
i=0;
do while (i<cnt);
i+1;
x=cats("dlv",scan(strip(year_list),i,","));
output;
end;
run;

proc sql ;
select distinct cats("mart",max(countw(year,","))), max(countw(year,",")) into :arraylist,:arraycnt from have;
select distinct cats("dlv",max(countw(year_list,","))), max(countw(year_list,",")) into :dlv_array, :dlvcnt from have;
select distinct x into :array_list separated by ' ' from trial;
run;

data want ;
set have;
length &array_list. $100.;
array dlv(&dlvcnt.) $ &array_list.;
do _j=1 to dim(dlv);
if index(upcase(vname(dlv(_j))),upcase(cats("DLV",dlv[_j]))) then do;
dlv[_j]=scan(year,_j,',');
end;
end;
run;

 

The array code in the last data step does not display the result i want for the first step of arriving to the final table. The table should look more like:

year year_list dlv2005M dlv2006 dlv2007M dlv2008 dlv2009 dlv2010 dlv2011M
2005M 2005M,2006,2007M,2008,2009,2010,2011M 2005M            
2005M,2007M 2005M,2006,2007M,2008,2009,2010,2011M 2005M   2007M        
2007M,2011M 2005M,2006,2007M,2008,2009,2010,2011M     2007M       2011M

After getting this table, I look at finding all missing values in the table and listing these out.

Still left with the dilemma of how to determine the first observation 😞

Any pointers, help, tip would be appreciated.

thanking you

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

 

I am sure you have posted this question in the last few weeks, no?  Anyways, as always my first question is going to be why?  SAS is built around the construct of datasets containing variables which hold One! item per variable.  Your trying to fit many data items into one variable, this is just going to cause you headaches through programming.  I would suggest the following, normalise your data, have one row per id value.  You can then easily apply any programming to that dataset and then if for some reason you need the final, transpose it up again when you have finished - remember the data you program with does not need to look like any out format.

However, that being said, you can process it as is:

data have;
  length year year_list $200;
  input year $ year_list $;
datalines;
2005M 2005M,2006,2007M,2008,2009,2010,2011M
2005M,2007M 2005M,2006,2007M,2008,2009,2010,2011M
;
run;
data want;
  length want $200;
  set have;
  do i=1 to countw(year_list,",");
    if index(year,scan(year_list,i,","))=0 then want=catx(",",want,scan(year_list,i,","));
  end;
run;

View solution in original post

4 REPLIES 4
sebster24
Quartz | Level 8

Hello All,

I am trying to create a list where a certain string is deleted from a string list.

I adopted the array approach as in the code for the following reasons:

1) The year and year_list is in character format and comma separated. the year and year_list values are comma separated and sorted.

2) Variable "want" lists all variables from "year_list2 that are not present in variable "year", however ensuring that the starting point is determined by the first value from variable "year" in "year_list" - as seen in the last row.

THe table that i want should be as follows:

year

year_list

want

2005M

2005M,2006,2007M,2008,2009,2010,2011M

2006,2007M,2008,2009,2010,2011M

2005M,2007M

2005M,2006,2007M,2008,2009,2010,2011M

2006,2008,2009,2010,2011M

2007M,2011M

2005M,2006,2007M,2008,2009,2010,2011M

2008,2009,2010

 

proc sql;
create table have (year VARCHAR(100), year_list VARCHAR(100));
insert into have values('2005M','2005M,2006,2007M,2008,2009,2010,20​11M');
insert into have values('2005M,2007M','2005M,2006,2007M,2008,2009,2​010,2011M');
insert into have values('2007M,2011M','2005M,2006,2007M,2008,2009,2​010,2011M');
run;

data tmp1;
set have;
csv=countw(year,",");
codgen_csv=countw(year_list,",");
run;


proc sql ;
select distinct cats("mart",max(csv)), max(csv) into :arraylist,:arraycnt from tmp1;
select distinct cats("dlv",max(codgen_csv)), max(codgen_csv) into :dlv_array, :dlvcnt from tmp1;
run;

data want (drop=list);
set tmp1;
length mart1-&arraylist. $100.;
length dlv1-&dlv_array. $100.;
array mart(&arraycnt.) $;
array dlv(&dlvcnt.) $;
do _i=1 to dim(mart);
mart[_i]=scan(year,_i,',');
end;
do _j=1 to dim(dlv);
dlv[_j]=scan(year_list,_j,',');
end;
run;

I am really struggling with this. And i don't know how to approach this problem.

 

Any pointers/ help would be appreciated.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

 

I am sure you have posted this question in the last few weeks, no?  Anyways, as always my first question is going to be why?  SAS is built around the construct of datasets containing variables which hold One! item per variable.  Your trying to fit many data items into one variable, this is just going to cause you headaches through programming.  I would suggest the following, normalise your data, have one row per id value.  You can then easily apply any programming to that dataset and then if for some reason you need the final, transpose it up again when you have finished - remember the data you program with does not need to look like any out format.

However, that being said, you can process it as is:

data have;
  length year year_list $200;
  input year $ year_list $;
datalines;
2005M 2005M,2006,2007M,2008,2009,2010,2011M
2005M,2007M 2005M,2006,2007M,2008,2009,2010,2011M
;
run;
data want;
  length want $200;
  set have;
  do i=1 to countw(year_list,",");
    if index(year,scan(year_list,i,","))=0 then want=catx(",",want,scan(year_list,i,","));
  end;
run;
sebster24
Quartz | Level 8

Sorry for the pestering and double emails. I am not pleased with the format structure that i presented in the post.THis is data that i receive from third party and a request to deliver in a better format is something they cannot do 😞

Sir RW9, you are a genius.

Smiley Very Happy 

 

For anyone having a similar query, slightly modified solution below: to produce desired outcome


data want (drop=i);
	length want $200;
	set have;
	do i=1 to countw(year_list,",");
		if index(year,scan(substr(year_list,max(index(year_list,scan(year,1,",")),1)),i,","))=0 then
			want=catx(",",want,scan(substr(year_list,max(index(year_list,scan(year,1,",")),1)),i,","));
	end;
run;

thank you

 

Ksharp
Super User
I am not sure I understand what is your mean completely .

data have;
infile cards dlm=' ';
input year :$20. year_list : $100. ;
cards;
2005M 2005M,2006,2007M,2008,2009,2010,2011M
2005M,2007M 2005M,2006,2007M,2008,2009,2010,2011M
2007M,2011M 2005M,2006,2007M,2008,2009,2010,2011M
;
run;
data want;
 set have;
 length want $ 200;
 first=scan(year,1,',');
 s=findw(year_list,strip(first));
 temp=substr(year_list,s);
 do i=1 to countw(temp,',');
  t=scan(temp,i,',');
  if not findw(year,strip(t)) then want=catx(',',want,t);
 end;
 drop t i s temp first ;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1164 views
  • 0 likes
  • 3 in conversation