Solved: Re: Create variable names from variable values

reini · Posted 10-19-2018 05:17 AM

Hello everybody,

I'm struggeling with some awfully structured data set that I would like to reorder in a reasonable way.

The data I got looks somewhat like this

data unstructured;
   input subject $3. dep_code_1 $3. quantity_1 dep_code_2 $3. quantity_2 dep_code_3 $3. quantity_3;
   datalines;
   s1 a1 12 a3 63 a2 17
   s2 a3 2 a1 11    .
   s3 a1 14 a3 7    .
   ;
run;

Now here I have some departments which are internally coded as a1 a2 and a3 and some subjects s1-s3. Now for each quantity there is a line where first you see the department code followed by the quantity associated. Now in each line there are only those departments listed that are concerned by the given subject.

(In addition there are further information about each subject per line, which I left out here).

As you can see this structure is kind of cumbersome and I's like to reformat it so somehwat like this

data structured;
   input subject $3. quantity_a1 quantity_a2 quantity_a3;
   datalines;
   s1 12 17 63
   s2 11 2 .
   s3 14 . 7
   ;
run;

Now I have the problem that e.g. I would need to read the value of dep_code_1 and of quantity_1 to do soemthing like

cats("quantity_a",dep_code_1) = quantity_1

now of course this doesn't work, since cats("quantity_a",dep_code_1) does not define a proper variable name.

I also tried to use a do loop and symput like this:

data structured;

set unstructured;

do i = 1 to 3;

call symput("category_name", "quantity_a"||dep_code_&i);

&category_name = quantity_&i;

end;

run;

but this again didn't work since the macro variable category_name seems not to be accesible while inside this data step.

Does anybody have an idea about how this can be done?

Thanks a lot in advance

RW9 · Posted 10-19-2018 05:44 AM

Step 1 for me would be, as always, to get away from the Excel thinking. Get the data normalised:

data unstructured;                                                                         
  input subject $3. dep_code_1 $3. quantity_1 dep_code_2 $3. quantity_2 dep_code_3 $3. quantity_3;                          
datalines;                                                                          
s1 a1 12 a3 63 a2 17
s2 a3 2 a1 11    .
s3 a1 14 a3 7    .
;
run;

data want (keep=subject department value);
  set unstructured;
  length department $10;
  array d{*} dep_:;
  array q{*} quantity_:;
  do i=1 to dim(d);
    department=d{i};
    value=q{i};
    if department ne "" then output;
  end;
run;

From there you can manipulate the data as its no longer part of the structure, but in the data part where it is supposed to be.

View solution in original post

RW9 · Posted 10-19-2018 05:44 AM

Step 1 for me would be, as always, to get away from the Excel thinking. Get the data normalised:

data unstructured;                                                                         
  input subject $3. dep_code_1 $3. quantity_1 dep_code_2 $3. quantity_2 dep_code_3 $3. quantity_3;                          
datalines;                                                                          
s1 a1 12 a3 63 a2 17
s2 a3 2 a1 11    .
s3 a1 14 a3 7    .
;
run;

data want (keep=subject department value);
  set unstructured;
  length department $10;
  array d{*} dep_:;
  array q{*} quantity_:;
  do i=1 to dim(d);
    department=d{i};
    value=q{i};
    if department ne "" then output;
  end;
run;

From there you can manipulate the data as its no longer part of the structure, but in the data part where it is supposed to be.

reini · Posted 10-19-2018 07:05 AM

Well, what you said about Excel is quite a good advice I think, yet it's hard to change your own thought processes. Especially when they're established for a long time.

Anyway your code seems to work excellent and though it's not what I had in mind when I posed my question, I actually should be able to do any subsequent assignments with the data structured that way. So thanks a lot 🙂

Kurt_Bremser · Posted 10-19-2018 06:13 AM

Or use two transposes:

proc transpose data=unstructured out=int1 (drop=_name_ rename=(col1=department));
by subject;
var dep_code:;
run;

proc transpose data=unstructured out=int2 (drop=_name_ subject rename=(col1=quantity));
by subject;
var quantity:;
run;

data want;
merge
  int1
  int2
;
if department ne '';
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

reini · Posted 10-19-2018 09:04 AM

Thanks a lot for this alternative approach 🙂

Create variable names from variable values

Re: Create variable names from variable values

Re: Create variable names from variable values

Re: Create variable names from variable values

Re: Create variable names from variable values

Re: Create variable names from variable values

Ready to join fellow brilliant minds for the SAS Hackathon?

Click image to register for webinar

Classroom Training Available!