Import row 1 as name of the variable and row 2 as label

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 6
Accepted Solution

Import row 1 as name of the variable and row 2 as label

Is there a way to import the first row as the name of the variable, the second row as the label for the variable, and the third rows and beyond as the data?


Accepted Solutions
Solution
‎03-22-2017 11:24 AM
Grand Advisor
Posts: 9,568

Re: Import row 1 as name of the variable and row 2 as label

How about this one.



proc import datafile='/folders/myfolders/test.xlsx' out=have dbms=xlsx replace;
 getnames=no;
run;

data x data;
 set have;
 if _n_ in (1 2) then output x;
  else output data;
run;
proc transpose data=x out=temp;
 var _all_;
run;

data _null_;
 set temp end=last;
 if _n_=1 then call execute('proc datasets library=work nolist nodetails;modify data;');
 call execute(cat('label ',_name_,'="',col2,'";'));
 call execute(cat('rename ',_name_,'=',col1,';'));
 if last then call execute('quit;');
run;
proc print data=data label;run;


View solution in original post


All Replies
Grand Advisor
Posts: 10,196

Re: Import row 1 as name of the variable and row 2 as label

Not easily if that's what you are hoping for.

Are you going to have to do this frequently to different files of different structures?

Also what type of source file are you dealling with?

Occasional Contributor
Posts: 6

Re: Import row 1 as name of the variable and row 2 as label

The source file is excel.  I will not need to do this daily, but I will do it perioidically for survey data that I download from Qualtrics.

Grand Advisor
Posts: 10,196

Re: Import row 1 as name of the variable and row 2 as label

Here's how I would approach this.

1) Save the Excel file to CSV.

2) Make a two-row copy of the csv file to read just the varibles and labels.

3) Import the two row file.

4) Use the import wizard to import the main data file, make sure the get the names from the first row is checked and set the read data from row 3.

5) If you are going to be importing multiple files with the same layout I strongly suggest that you capture the datastep code from the log and save it for reuse. Check the lengths of character variables set by informat (and other informats assigned) you may want to change the lengths. Save the code. You can read the next file and just change the infile and data set names (assumes you'll save further sets as CSV. Rational: If you use Proc Import everytime as a minimum you will get different lengths for character variables. In some case you may get variables changing type from character to numeric or vice versa.

6) Use the data set from step 3 to 1) write a label statement to add to your program (hint: file print, and array processing); 2) write proc datasets code to set labels for the data set or 3) data step with call execute to actually run the  proc datasets code. If the number of variables is small I'd likely type the label statement into the datastep code.

 

This shows one way to write the label statement. The text will appear in the results or output, copy and paste into the editor. HAVE here is the data set from reading the first two rows so should have the same variable names as the main data.

data _null_;
   set have (obs=1);
   array c {*} _character_;
   length _name_ $ 32 string $ 200;;
   file print;
   put "Label";

   do i= 1 to dim(c);
      _name_ = vname(c[i]);
      string = quote(strip(c[i]));
      put  _name_ "=" string;
   end;
   put ";";
run;

 

 

Valued Guide
Posts: 505

Re: Import row 1 as name of the variable and row 2 as label

Creating SAS labels using the second row of an excel sheet


HAVE AN EXCEL SHEET (where the second row is the label)
=======================================================

   +-------+-----------------+----------------+
   |   A   |        B        |        C       |
   +-------+-----------------+----------------+
1  | SEX   |        NAME     |       AGEC     |
2  |Gender |    Student Name |    Student Age |
   +-------+-----------------+----------------+
3  | M     |      Alfred     |       14       |
4  | F     |      Alice      |       13       |
5  | F     |      Barbara    |       13       |
6  | F     |      Carol      |       14       |
7  | M     |      Henry      |       14       |
8  | M     |      James      |       12       |
9  | F     |      Jane       |       12       |
10 | F     |      Janet      |       15       |
11 | M     |      Jeffrey    |       13       |
12 | M     |      John       |       12       |
13 | F     |      Joyce      |       11       |
14 | F     |      Judy       |       14       |
15 | F     |      Louise     |       12       |
16 | F     |      Mary       |       15       |
17 | M     |      Philip     |       16       |
18 | M     |      Robert     |       12       |
19 | M     |      Ronald     |       15       |
20 | M     |      Thomas     |       11       |
21 | M     |      William    |       15       |
   +-------+-----------------+----------------+

WANT
====
         Variables in Creation Order

#    Variable    Type    Len    Label

1    SEX         Char      6    Gender
2    NAME        Char     12    Student Name
3    AGEC        Char     11    Student Age


Up to 40 obs from want total obs=19

Obs   NAME       SEX  AGEC

  1   Alfred      M    14
  2   Alice       F    13
  3   Barbara     F    13
  4   Carol       F    14
  5   Henry       M    14
  6   James       M    12
  7   Jane        F    12
  8   Janet       F    15
  9   Jeffrey     M    13
 10   John        M    12
 11   Joyce       F    11
 12   Judy        F    14
 13   Louise      F    12
 14   Mary        F    15
 15   Philip      M    16
 16   Robert      M    12
 17   Ronald      M    15
 18   Thomas      M    11
 19   William     M    15


WORKING CODE
============

   dosubl(%nrbquote(
     lbl=catx(" ",lbl,cats(vname(names[i]),"=",'"',names[i],'"'));
     call symputx("lbl",lbl);

   label &lbl;

FULL SOLUTION
=============

*                _                  _
 _ __ ___   __ _| | _____     __  _| |___
| '_ ` _ \ / _` | |/ / _ \____\ \/ / / __|
| | | | | | (_| |   <  __/_____>  <| \__ \
|_| |_| |_|\__,_|_|\_\___|    /_/\_\_|___/

;

%utlfkil(d:/xls/rowlbl.xlsx);
libname xel "d:/xls/rowlbl.xlsx";

data xel.class(drop=age);
  length sex $8 name agec $18;
  set sashelp.class(keep=name sex age);
  agec=put(age,3.);
  savname=name;
  savage =agec;
  savsex =sex;
  if _n_=1 then do;
       name="Student Name";
       sex="Gender";      ;
       agec="Student Age"  ;
       output;
       name = savname;
       agec = savage ;
       sex  = savsex ;
       output;
   end;
   else output;
   drop
      savname
      savage
      savsex;
run;quit;

libname xel clear;

* __       _ _                 _       _   _
 / _|_   _| | |      ___  ___ | |_   _| |_(_) ___  _ __
| |_| | | | | |_____/ __|/ _ \| | | | | __| |/ _ \| '_ \
|  _| |_| | | |_____\__ \ (_) | | |_| | |_| | (_) | | | |
|_|  \__,_|_|_|     |___/\___/|_|\__,_|\__|_|\___/|_| |_|

;

/*
There are several solutions. You can read rectangles which
make it easier to have AGEC as a numeric.
You can get lengths using passthru
*/

libname xel "d:/xls/rowlbl.xlsx" scan_text=no header=no;

proc datasets lib=work kill;
run;quit;

%symdel nam lbl;

libname xel "d:/xls/rowlbl.xlsx";

data want;

 if _n_=0 then do;
    %let rc=%sysfunc(dosubl(%nrbquote(
        data _null_;
          set xel.class(firstobs=1 obs=1); * use "[sheet1$]" if you don't have a named range;
          array names _character_;
          length lbl $500;
          retain lbl;
          if _n_=1 then do;
              do i=1 to dim(names);
                lbl=catx(" ",lbl,cats(vname(names[i]),"=",'"',names[i],'"'));
              end;
              call symputx("lbl",lbl);
          end;
        run;quit;

    )));
  end;

  set xel.class(firstobs=2);

    length f: $8;
    format _all_;
    informat _all_;
    label &lbl;

run;quit;

libname xel clear;

Solution
‎03-22-2017 11:24 AM
Grand Advisor
Posts: 9,568

Re: Import row 1 as name of the variable and row 2 as label

How about this one.



proc import datafile='/folders/myfolders/test.xlsx' out=have dbms=xlsx replace;
 getnames=no;
run;

data x data;
 set have;
 if _n_ in (1 2) then output x;
  else output data;
run;
proc transpose data=x out=temp;
 var _all_;
run;

data _null_;
 set temp end=last;
 if _n_=1 then call execute('proc datasets library=work nolist nodetails;modify data;');
 call execute(cat('label ',_name_,'="',col2,'";'));
 call execute(cat('rename ',_name_,'=',col1,';'));
 if last then call execute('quit;');
run;
proc print data=data label;run;


Occasional Contributor
Posts: 6

Re: Import row 1 as name of the variable and row 2 as label

I did not try all solutions.  I tried this one and it worked great!  Thank you!

Grand Advisor
Posts: 9,568

Re: Import row 1 as name of the variable and row 2 as label

Maybe sometimes you don't want transform character into num  by hand, try this one.

 

 


proc import datafile='/folders/myfolders/test.xlsx' out=x dbms=xlsx replace;
 getnames=no;
run;


proc import datafile='/folders/myfolders/test.xlsx' out=data dbms=xlsx replace;
 getnames=no;
 datarow=3;
run;

data x ;
 set x(obs=2);
run;
proc transpose data=x out=temp;
 var _all_;
run;

data _null_;
 set temp end=last;
 if _n_=1 then call execute('proc datasets library=work nolist nodetails;modify data;');
 call execute(cat('label ',_name_,'="',col2,'";'));
 call execute(cat('rename ',_name_,'=',col1,';'));
 if last then call execute('quit;');
run;
proc print data=data label;run;
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 182 views
  • 2 likes
  • 4 in conversation