BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kisumsam
Quartz | Level 8

I'm trying to read in this set of data and I need some help:

 

A123456789,A,3
1,Y,15Feb1960,M,M,3,FT,55000
2,N,3Jun1965,F,M,3,UE,0
A135790234,B,1
1,Y,19Oct1944,F,D,0,NA,3000
B234523456,A,2
1,Y,30Jun1978,F,M,1,PT,4000
2,N,21May1975,M,M,2,FT,30000

 

I need the output to look like this:

 

image 12.png

I have come up with this solution which works fine:

 

Data Yourdata (drop=temp);
Infile datalines dsd;
retain var1-var3;
input temp : $10. @;
if length(temp) ^= 1 then do;
Input @1 var1 $10.
      @12 var2 $
      var3 $
      var4
      var5 $
      var6 : date9.
      var7 $
      var8 $
      var9
      var10 $
      var11 ;
end; else do;
Input @1 var4
      var5 $
      var6 : date9.
      var7 $
      var8 $
      var9
      var10 $
      var11 ;
end;
format var6 date9.;
datalines;
A123456789,A,3
1,Y,15Feb1960,M,M,3,FT,55000
2,N,3Jun1965,F,M,3,UE,0
A135790234,B,1
1,Y,19Oct1944,F,D,0,NA,3000
B234523456,A,2
1,Y,30Jun1978,F,M,1,PT,4000
2,N,21May1975,M,M,2,FT,30000
;
run;

However, I'm partially using formatted input instead of list input and I'm ignoring the delimiter for some of the columns. I am wondering if there is a better solution for this.

 

Can anyone help? Thanks 🙂

 

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Here is one possible alternative:

Data Yourdata;
  infile datalines dsd delimiter=',';
  informat var1 $10.;
  informat var2 var3 $1.;
  informat var4 8.;
  informat var5 $1.;
  informat var6 date9.;
  informat var7 var8 $1.;
  informat var9 8.;
  informat var10 $2.;
  informat var11 8.;
  retain var1-var3;
  input @;
  if length(scan(_infile_,1,',')) ^= 1 then do;
    Input var1
          var2
          var3;
  end;
  else do;
    input var4
      var5
      var6
      var7
      var8
      var9
      var10
      var11 ;
    output;
  end;
  format var6 date9.;
  datalines;
A123456789,A,3
1,Y,15Feb1960,M,M,3,FT,55000
2,N,3Jun1965,F,M,3,UE,0
A135790234,B,1
1,Y,19Oct1944,F,D,0,NA,3000
B234523456,A,2
1,Y,30Jun1978,F,M,1,PT,4000
2,N,21May1975,M,M,2,FT,30000
;
run;

Art, CEO, AnalystFinder.com

View solution in original post

8 REPLIES 8
Shmuel
Garnet | Level 18

Assuming your input data is in a consistent format, that is - always 3 lines and all variables exist (even if some of them have missing value) then you can read the 3 lines with one INPUT staement, in a format of:

  input
    <first line variables> /
    <second line variables> /
    <third line variables>
 ;

The slash (/) tells sas to continue reading from next line.

kisumsam
Quartz | Level 8

No unfortunately I need the first three columns to "retain" to the next line for some of the rows:

 

image 13.png

 

I don't think the slashes would work. Also, the DSD option already changes the delimiter to a comma. I don't think the DLM="," option is needed...

Shmuel
Garnet | Level 18

OK, I see - you have a prefix line to concatenate with next one or more lines,

 where each of them - except the prefix line - creates an output observation.

 

Then your code is perfct. I wouldn't change it.

kisumsam
Quartz | Level 8

No it goes like this:

 

Raw data:

A123456789,A,3
1,Y,15Feb1960,M,M,3,FT,55000
2,N,3Jun1965,F,M,3,UE,0

 

Desired output:
A123456789 A 3 1 Y 15Feb1960 M M 3 FT 55000
A123456789 A 3 2 N 3Jun1965 F M 3 UE 0

 

The first three columns are "header" columns and they should retain to the next line for some of the rows.

kisumsam
Quartz | Level 8

That's correct 🙂

 

I don't like the fact that some columns are read in using formatted input when the data are separated by a comma:

 

Input @1 var1 $10.
      @12 var2 $
      var3 $
      var4

 

The VAR1 column had to be re-read" in since I had the TEMP column there. Re-reading data cannot be done using list input and the only solution I can think of is to use the formatted input.

 

Just curious if there is a better way around...

Shmuel
Garnet | Level 18
Allso add option dlm=',' to the infile staement:
Infile datalines dsd dlm=',';
art297
Opal | Level 21

Here is one possible alternative:

Data Yourdata;
  infile datalines dsd delimiter=',';
  informat var1 $10.;
  informat var2 var3 $1.;
  informat var4 8.;
  informat var5 $1.;
  informat var6 date9.;
  informat var7 var8 $1.;
  informat var9 8.;
  informat var10 $2.;
  informat var11 8.;
  retain var1-var3;
  input @;
  if length(scan(_infile_,1,',')) ^= 1 then do;
    Input var1
          var2
          var3;
  end;
  else do;
    input var4
      var5
      var6
      var7
      var8
      var9
      var10
      var11 ;
    output;
  end;
  format var6 date9.;
  datalines;
A123456789,A,3
1,Y,15Feb1960,M,M,3,FT,55000
2,N,3Jun1965,F,M,3,UE,0
A135790234,B,1
1,Y,19Oct1944,F,D,0,NA,3000
B234523456,A,2
1,Y,30Jun1978,F,M,1,PT,4000
2,N,21May1975,M,M,2,FT,30000
;
run;

Art, CEO, AnalystFinder.com

kisumsam
Quartz | Level 8
Awesome! That's exactly what I was looking for. Thanks!

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2081 views
  • 1 like
  • 3 in conversation