BookmarkSubscribeRSS Feed
redbridge
Calcite | Level 5

Hi all,


I need to read an excel file with the following layout

pagenum          name          format          length          start          output

Page 1             

                           A               numeric         8                  1               543

                           B                alpha            13                9               abcdef

                           C                alpha            13                22             fedgs

                           D               alphanum      13                35             C012

                           E               alpha             13                48             rslghdlrfhdsr

                           F                numeric         8                 61              323

                           V1              numeric          6               67               110011

Page 2

                           G               numeric         8                  1               8543

                           H               numeric          8                9               500032

                           C                alpha            13                17             fedgsrre

                           J               alphanum      13                30             C014

                           K               alpha             13                43             ghdlrfhdsrd

                           F                numeric         8                 56              6323

                           I                 alpha             13               64             fkfifffffd

                           V2              numeric          7               77              0110011

Page3

                          etc etc etc

i.e. For Page 1, there are 7 variables (A, B, C, D, E, F, V1) and the corresponding values for these variables are shown in "output"

      For Page 2, there are 8 variables (G, H, C, J, K, F, I, V2) and the corresponding values for these variables are shown in "output"

      In each page, there is a variable called "V" etc that corresponds to the flag (1 for yes, 0 for no) for each variable within that page. If the page has 6 variables excluding the "V", the V1 will be 6 characters long and each position contains the answer (1 or 0) to the corresponding variables (A=1, B=1, C=0, D=0, E=1, F=1)

The variable "C" for page 2 is different from the variable "C" for page 1. It also includes the format (alpha, alphanumeric, numeric) for all these variables.

The Column "start" corresponds to the starting column of that variable as if it is using the "put" statement (I do not think it is important).

How can I read it into SAS ? I can manipulate the excel so that Pagenum contains "Page 1" etc for each row so I can create a new column later to have a unique variable names (e.g. page1_C, page2_C) . I also need to assign the corresponding V_values for each variable within the page (proc transpose on the "V" variables within the page and create a column of 1 and 0?

I will have 2 files (with the same layout) and I want to compare if the "results" are the same for each variable and do some statistics between them.

The desired output would be like

pagenum                    name          uniqueId          format          length                   output                VFlag

Page 1                           A             Page1_A       numeric         8                             543                    1

Page 1                           B              Page1_B        alpha            13                             abcdef               1   

Page 1                           C             Page1_C         alpha            13                           fedgs                 0

Page 1                           D             Page1_D        alphanum      13                           C012                  0

Page 1                           E             Page1_E        alpha             13                            rslghdlrfhdsr      1

Page 1                           F               Page1_F       numeric         8                              323                    1

Page 2                           G             Page2_G        numeric         8                               8543                  0

Page 2                           H             Page2_H       numeric          8                              500032               1

Page 2                           C              Page2_C        alpha            13                           fedgsrre            1

Page 2                           J              Page2_J       alphanum      13                           C014                 0

Page 2                           K              Page2_K       alpha             13                           ghdlrfhdsrd       0

Page 2                           F               Page2_F       numeric         8                             6323                 1

Page 2                           I               Page2_I        alpha             13                          fkfifffffd              1

etc...

2 REPLIES 2
Ksharp
Super User

Couldn't you use PROC IMPORT to import this file . and you don't post what output do you want .

Ksharp
Super User

Suppose you have imported the xls file into sas successfully.

data have;
infile datalines truncover;
input pagenum   & $       name    $      format   $       length          start          output : $20.;
datalines;
Page 1             
.                           A               numeric         8                  1               543
.                           B                alpha            13                9               abcdef
.                           C                alpha            13                22             fedgs
.                           D               alphanum      13                35             C012
.                           E               alpha             13                48             rslghdlrfhdsr
.                           F                numeric         8                 61              323
.                           V1              numeric          6               67               110011
Page 2
.                           G               numeric         8                  1               8543
.                           H               numeric          8                9               500032
.                           C                alpha            13                17             fedgsrre
.                           J               alphanum      13                30             C014
.                           K               alpha             13                43             ghdlrfhdsrd
.                           F                numeric         8                 56              6323
.                           I                 alpha             13               64             fkfifffffd
.                           V2              numeric          7               77              0110011
;
run;
data want(drop=_pagenum);
 set have;
 length _pagenum uniqueId    $ 20;
 retain _pagenum;
 if not missing(pagenum) then _pagenum=pagenum;
  else do;
        pagenum=_pagenum;
        uniqueId=catx('_',pagenum,name);
        output;
       end;
run;
data want(drop=i _v);
length  _v $ 20;
 do until(last.pagenum);
  set want;
  by pagenum;
 end;
 _v=output; i=0;
 do until(last.pagenum);
  set want;
  by pagenum;
  i+1;
  VFlag=char(_v,i);
  if not last.pagenum then output;
 end;
run;

Ksharp

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1273 views
  • 0 likes
  • 2 in conversation