New SAS User

kothasaikrishna · Posted 07-16-2019 06:18 PM

Hi, I have a dataset as follows:

obs  col1
1    ar/t; br/t
2    k
3    m-p, i

I need to create another dataset as follows:

obs  col1
1    a
2    b
3    k
4    m
5    n
6    o
7    p
8    i

Can you help me with that? Thanks!

Reeza · Posted 07-16-2019 06:31 PM

Are those all the rules? How do you know which to expand and which to truncate?

kothasaikrishna · Posted 07-16-2019 06:37 PM

I have a series of letters from m to p, written as m-p. I need to expand those and put each letter in its own row.

Reeza · Posted 07-16-2019 06:40 PM

So we can ignore the first two observations and everything from them? Only worry about that last record then with the hyphen? It's always a hyphen and no other fields have hyphens? If we don't understand the rules, we can't code a solution.

kothasaikrishna · Posted 07-16-2019 06:55 PM

My instructions are to remove r/t in the first observation and put the two letters a and b in two separate observations. For the third observation, I need to expand m-p into the letters m through p, each in its own observation, and the letter i comes as the last observation. Thanks!

heffo · Posted 07-16-2019 07:35 PM

This is a start; I don't know your exact rules.

data want;
	length colout1 $1 _coltemp $ 11;
	set have;
	do _i = 1 to countw(col1,";,"); *loop over all the items, which are separated by ; or , in the sample data;
		_coltemp = strip(scan(col1,_i,";,"));
		*Three different cases, a list, a span or a single value.;
		if index(_coltemp,"-")>0 then do;
			*Span, Might be problematic if the casing is different from start to end. ;
			_start = scan(_coltemp,1,"-");
			_end = scan(_coltemp,2,"-");
			*Rank gets the ASCII value for us to loop from start to end.;
			do _j = rank(_start) to rank(_end);
				*Use byte function to go from ASCII to string. ;
				colout1 = byte(_j);
				output;
			end;
		end;
		else if index(_coltemp,":")>0 then do;
			*A list of values separated (in this case with : ). ;
			do _j =1 to countw(_coltemp,":");
				*Loop all of the values. ;
				colout1 = strip(scan(_coltemp,_j,":"));
				output;
			end;
		end;
		else do;
			*Single value version;
			colout1 = _coltemp;
			output;
		end;
	end;
	drop _:;
run;

Reeza · Posted 07-16-2019 09:38 PM

Not sure why this isn't working for the last i, but I need to be done for now. Perhaps someone else can fix it; it's pretty close.

data have;
	original="ar/t; br/t";
	output;
	original="k";
	output;
	original="m-p, i";
	output;
run;

data want;
	set have;
	*number of terms;
	n_terms=countc(original, ";,")+1;
    p1=compress(original, 'r/t');

	do i=1 to n_terms;
		term=scan(p1, i, ';, ');

		if find(term, '-') then
			do;
				start_letter=rank(scan(term, 1, '-'));
				end_letter=rank(scan(term, 2, '-'));

				do i=start_letter to end_letter;
					term=byte(i);
					output;
				end;
			end;
	     else output;

	end;
	keep term original;
run;

heffo · Posted 07-16-2019 10:13 PM

You are reusing the counter variable (i) in your inner loop. When the inner loop finishes, i holds a value far greater than n_terms, so the outer loop exits before it reaches the second term (the last i). 🙂
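The effect can be seen in isolation with a small sketch. The step below is illustrative only and is not part of either poster's code: it hard-codes n_terms=2 and the m-to-p range from the sample data, and writes the value of i to the log.

```sas
data _null_;
	n_terms = 2;                       /* assumed value, as for "m-p, i" */
	do i = 1 to n_terms;
		put 'outer pass starts with ' i=;
		/* the inner loop reuses i, as in the step above */
		do i = rank('m') to rank('p');
		end;
		/* i is now one past rank('p'), far above n_terms */
		put 'after inner loop ' i=;
	end;
run;
```

Only one "outer pass" line should appear in the log, because the outer loop's stop test fails as soon as the inner loop returns. Giving the inner loop its own counter, such as j, avoids this.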

Also, I would use

p1=tranwrd(original, 'r/t','');

instead of the compress. COMPRESS removes every occurrence of each listed character, not just that exact string. So if you have "t-u", it will remove the "t" as well.
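A minimal sketch of the difference, using the "t-u" value mentioned above (the variable names s, c, and w are made up for this illustration):

```sas
data _null_;
	length s $10;
	s = "t-u";
	/* COMPRESS deletes every r, /, and t wherever they occur */
	c = compress(s, 'r/t');
	/* TRANWRD replaces only the exact substring r/t; a zero-length
	   replacement is treated as a single blank */
	w = tranwrd(s, 'r/t', '');
	put c= w=;
run;
```

Here c should come out as -u, which is no longer a usable range, while w stays t-u. Because TRANWRD leaves a blank where r/t was, it helps to have the blank in the SCAN delimiter list or to STRIP the result before using it.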

Reeza · Posted 07-16-2019 10:32 PM

Thanks, it was dinner time 🙂

data have;
	original="ar/t; br/t";
	output;
	original="k";
	output;
	original="m-p, i";
	output;
run;

data want;
	set have;

	*number of terms;
	n_terms=countc(original, ";,")+1;

	*remove r/t;
	p1=tranwrd(original, 'r/t', '');

	do i=1 to n_terms;
		*separate into items;
		term=scan(p1, i, ';, ');

		*check if field contains hyphen;
		if find(term, '-') then
			do;
				*find start and end of loop by converting to ascii and back;
				start_letter=rank(scan(term, 1, '-'));
				end_letter=rank(scan(term, 2, '-'));

				*loop through and output for each letter;
				do j=start_letter to end_letter;
					term=byte(j);
					output;
				end;
			end;
		*if no hyphen output;
		else output;
	end;

	keep term original n_terms;
run;

kothasaikrishna · Posted 07-17-2019 01:08 PM

Thank you so much! That works.
