BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ronein
Meteorite | Level 14

Hello

What are the ways to restructure the data from wide to long structure.

Is it possible to use function VNAME here?

IF we use proc transose then is it essential to perform 2 proc tranpose or we can do it in one step?

thanks

Data have;
Input PatientID Wage_2101 Wage 2104 Wage 2107 Staus_2101 Status_2104 Status_2107;
Cards;
1 100 150 80 200 2 2 2 1
2 200 180 300 250 1 1 1 1 
;
Run;

Data want;
Input PatientID YYMM Var_name value;
Cards;
1 Wage 100
1 Wage 150
1 Wage 80
1 Wage 200
1 Staus 2
1 Staus 2
1 Staus 2
1 Staus 1
2 Wage 200
2 Wage 180
2 Wage 300
2 Wage 250
2 Wage 1
2 Staus 1
2 Staus 1
2 Staus 1
;
Run;
1 ACCEPTED SOLUTION

Accepted Solutions
tarheel13
Rhodochrosite | Level 12

@Kurt_Bremser's approach is best. Here is code for how to do it.

proc transpose data=have out=have_t;
	by patientid;
run;

data want(rename=(col1=value));
	set have_t;
	yymm=scan(_name_,2,'_');
	var_name=scan(_name_,1,'_');
	
	drop _name_;
run;

View solution in original post

5 REPLIES 5
Patrick
Opal | Level 21

Below should work.

Thanks for the data steps creating sample data ...but please next time try to post something that doesn't require fixing.

data have;
  input PatientID Wage_2101 Wage_2104 Wage_2107 Staus_2101 Status_2104 Status_2107;
  datalines;
1 100 150 80 200 2 2 2 1
2 200 180 300 250 1 1 1 1 
;

proc sql noprint;
  select name into :varlist separated by ' '
  from dictionary.columns
  where 
    libname='WORK'
    and memname='HAVE'
    and upcase(name) ne 'PATIENTID'
  ;
quit;

data want;
  set have;
  array vars {*} &varlist;
  drop &varlist _:;
  length varname $32;

  do _i=1 to dim(vars);
    varname =prxchange('s/^(.*)_\d{4}$/$1/oi',1,strip(vname(vars[_i])));
    date    =input(scan(vname(vars[_i]),-1,'_'),yymmn4.);
    value   =vars[_i];
    format date date9.;
    output;
  end;
run;
Ronein
Meteorite | Level 14

Thanks,
3 questions please:
1- Whay Varname=vname(vars[_i]) is not enought and you write the long code
varname =prxchange('s/^(.*)_\d{4}$/$1/oi',1,strip(vname(vars[_i])))
2- Why do you write scan with -1 and not +1 ? As I understand you take the date YYMM that is right to _
3-Is the location of DROP can be also below do loop (After End )?
Which variables you delete with _: (I understand that you delete all vars with start of _ but can you write its name to understand )
thanks

Tom
Super User Tom
Super User

@Ronein wrote:

Thanks,
3 questions please:
1- Whay Varname=vname(vars[_i]) is not enought and you write the long code
varname =prxchange('s/^(.*)_\d{4}$/$1/oi',1,strip(vname(vars[_i])))
2- Why do you write scan with -1 and not +1 ? As I understand you take the date YYMM that is right to _
3-Is the location of DROP can be also below do loop (After End )?
Which variables you delete with _: (I understand that you delete all vars with start of _ but can you write its name to understand )
thanks


I think the regular expression is trying to take the name up any suffix of a underscore followed by four 4 digits.   If the variable name does not have the suffix then VARNAME will be blank.

 

The negative index means to take the words counting from the end of the string instead of the beginning.  SCAN(x,-1,'_') means take the last word while scan(x,2,'_') means take the second word. If you have any names with two or more underscores the result will be different.

 

DROP statement just instructs the compiler about what variables will be output to the dataset.  So they are not executable hence they can be anywhere in the data step. 

 

Another way to get the suffix and remainder it so get the suffix first and then use SUBSTRN() to take the other part. That method will work even when there are more than one delimiter in the name.

Example:

name='A_B_C';
suffix=scan(name,-1,'_');
base=substrn(name,1,length(name)-length(suffix)-1);
tarheel13
Rhodochrosite | Level 12

@Kurt_Bremser's approach is best. Here is code for how to do it.

proc transpose data=have out=have_t;
	by patientid;
run;

data want(rename=(col1=value));
	set have_t;
	yymm=scan(_name_,2,'_');
	var_name=scan(_name_,1,'_');
	
	drop _name_;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 666 views
  • 6 likes
  • 5 in conversation