BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kevinmc87
Obsidian | Level 7
STATE YEAR RACE F_0 -- F_2000
CA    2010   W
CA    2010   B
CA    2010   H      

Hi,

 

I'm trying to tranpose this data where I essentially move these 2nd and 3rd rows as columns, where I have attached B and H as a part of the variable names from F_0 -- F_2000. I have tried messing around with PROC TRANSPOSE, but this doesn't seem to work for me. I'm not sure if it does, since I am not actually truly transposing the data, as I want to keep F_0--F_2000 as columns. I want something that looks like this:

STATE YEAR RACE F_0 -- F_2000 B_F_0-- B_F_2000 H_F_0--H_F_2000
CA 2010 - 

I'm able to move my data around using this code below, but it requires me having all of my variables coded as either character or numeric, and I'm wondering if there is a simpler, more elegant solution. 

DATA WANT;
	SET HAVE NOBS=NOBS;
	ARRAY ALL [3,2003] $ A1-A6009;
	ARRAY VARS [*] $ STATE--F_2000;
	RETAIN A1-A6009;
	I+1;
	DO J=1 TO 2003;
	ALL(I,J)=VARS(J);
	END;
	IF _N_=NOBS THEN OUTPUT;
RUN;
1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Nice idea @Ksharp. But why not read dataset have directly? That way the order of race does not matter

 

data work.have;
	length STATE $ 2 YEAR 8 RACE $ 1 F_0-F_10 8;	
	input State Year Race F_0-F_10;
	datalines;
CA 2010 W 10 11 12 13 14 15 16 17 18 19 20
CA 2010 B 30 31 32 33 34 35 36 37 38 39 40
CA 2010 H 50 51 52 53 54 55 56 57 58 59 60
CA 2011 W 44 44 44 44 44 44 44 44 44 44 44
CA 2011 B 22 22 22 22 22 22 22 22 22 22 22
CA 2011 H 88 88 88 88 88 88 88 88 88 88 88
;

proc sql;
select distinct catt('have(where=(race="',race,'")
 rename=(F_0-F_10=',race,'_F_0-',race,'_F_10))') 
  into : list separated by ' '
   from have;
quit;

data want;
 merge &list;
 by State Year;
 drop race;
run;
PG

View solution in original post

11 REPLIES 11
Reeza
Super User

I think your code is about as efficient as you're going to get. 

 

But, are you actually trying to create a dataset that's a single line?

error_prone
Barite | Level 11

Why do you need all those columns in one row? 

 

Some parts of the data-step could be generated by reading the required information from HAVE and sashelp.vcolumns, but the code will most likely harder to read.

 

data work.have;
	length STATE $ 2 YEAR 8 RACE $ 1 F_0-F_10 8;
	
	input State Year Race F_0-F_10;
	datalines;
CA 2010 W 10 11 12 13 14 15 16 17 18 19 20
CA 2010 B 30 31 32 33 34 35 36 37 38 39 40
CA 2010 H 50 51 52 53 54 55 56 57 58 59 60
CA 2011 W 44 44 44 44 44 44 44 44 44 44 44
CA 2011 B 22 22 22 22 22 22 22 22 22 22 22
CA 2011 H 88 88 88 88 88 88 88 88 88 88 88
;
run;

proc print;run;

%macro transposer;
	proc sql noprint;
		select distinct Race 
			into :RaceList separated by " "
			from work.have;
			
		select distinct cats(Race, "_:")
			into :RaceRetain separated by " "
			from work.have
		;
			
		select cats("#_", Name)
			into :Values separated by " "
			from sashelp.vcolumn
				where libname = "WORK" and memname = "HAVE" 
					and Name like "F^_%" escape "^"
		;
	quit;

	data work.want;
		set work.have;
		by State Year;
		
		/* Caution: the sysfunc-line must not end with ; */
		array target
		%do i = 1 %to %sysfunc(countw(&RaceList));
			%sysfunc(translate(&Values, %scan(&RaceList, &i), %str(#)))
		%end;
		; /* for array target */
		
		array ori F_:;
		
		retain raceCount &raceRetain;
		
		/* Reseting retained variables */
		if first.Year then do;
			do i = 1 to dim(target);
				target[i] = .;
			end;
			raceCount = 0;
		end;
		
		raceCount = raceCount + 1;
		
		/* copy the values */
		do i = 1 to dim(ori);
			target[i+((raceCount-1)*dim(ori))] = ori[i];
		end;
	
		/* output after last Race of State/Year has been processed. */
		if last.Year then output;
		
		drop Race F_: i raceCount;
	run;
	

%mend;

%transposer;

proc print;run;
	

 

kevinmc87
Obsidian | Level 7

I'm moving everything to one row because of dataset I am working with that has the data constructed in a similar fashion, and I need to merge the datasets. I guess I change the data structure from the other dataset to match this one, and that would probably make things easier. 

Reeza
Super User

I would consider what I need to do after and decide what structure will best facilitate that analysis and then move towards that. If you post some of the questions you're trying to answer people can make suggestions. And if it turns out a different method works, the beauty of code is you can change it fairly easily 🙂

Ksharp
Super User
Check the merge skill proposed by Me,Matt, Arthur.T :

http://support.sas.com/resources/papers/proceedings15/2785-2015.pdf




data work.have;
	length STATE $ 2 YEAR 8 RACE $ 1 F_0-F_10 8;	
	input State Year Race F_0-F_10;
	datalines;
CA 2010 W 10 11 12 13 14 15 16 17 18 19 20
CA 2010 B 30 31 32 33 34 35 36 37 38 39 40
CA 2010 H 50 51 52 53 54 55 56 57 58 59 60
CA 2011 W 44 44 44 44 44 44 44 44 44 44 44
CA 2011 B 22 22 22 22 22 22 22 22 22 22 22
CA 2011 H 88 88 88 88 88 88 88 88 88 88 88
;
run;

data temp(index=(x=(State Year)));
 set have;
 by State Year;
 if first.year then n=0;
 n+1;
run;
proc sql;
select distinct catt('temp(where=(n=',n,')
 rename=(F_0-F_10=',race,'_F_0-',race,'_F_10))') 
  into : list separated by ' '
   from temp;
quit;
data want;
 merge &list;
 by State Year;
 drop n race;
run;

PGStats
Opal | Level 21

Nice idea @Ksharp. But why not read dataset have directly? That way the order of race does not matter

 

data work.have;
	length STATE $ 2 YEAR 8 RACE $ 1 F_0-F_10 8;	
	input State Year Race F_0-F_10;
	datalines;
CA 2010 W 10 11 12 13 14 15 16 17 18 19 20
CA 2010 B 30 31 32 33 34 35 36 37 38 39 40
CA 2010 H 50 51 52 53 54 55 56 57 58 59 60
CA 2011 W 44 44 44 44 44 44 44 44 44 44 44
CA 2011 B 22 22 22 22 22 22 22 22 22 22 22
CA 2011 H 88 88 88 88 88 88 88 88 88 88 88
;

proc sql;
select distinct catt('have(where=(race="',race,'")
 rename=(F_0-F_10=',race,'_F_0-',race,'_F_10))') 
  into : list separated by ' '
   from have;
quit;

data want;
 merge &list;
 by State Year;
 drop race;
run;
PG
Ksharp
Super User
@PG,
yeah, But if there are many group variables , that would make that macro variable very very longer and over the limit of 
the max length of macro variable. and create many sub-tables that I would like to avoid to .


Ksharp
Super User
@PG,
You are right. My code wouldn't work if there are some missing levels in a group.
Your code is right . 
Thanks.


kevinmc87
Obsidian | Level 7

Thanks everyone! 

PGStats
Opal | Level 21

@rogerjdeangelis The listserv requires a login. Please copy the SAS solution here.

PG

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 4823 views
  • 4 likes
  • 6 in conversation