DATA Step, Macro, Functions and more

request for a solution

Reply
Occasional Contributor
Posts: 6

request for a solution

Hi All,

I have a following text(magnolia) file on my free sas university edition.

 

M. grandiflora      Southern Magnolia    80  15  E  White
M. Campbellii                                        80  20  D  Rose
M. Liliflora           Lily Magnolia             12    4   D  Purple
M. soulangiana   Saucer Magnolia       25    3   D  Pink
M. Stellata Star   Magnolia                   10    3   D  White

 

I used the following code to read that into sas but  would not work.The problem is that four rows under M. Campbelli (80,20,D Rose) would be missing in the output.I tried using whatever  options i could recall but failed.The result looked like this:

Obs

name

sciname

sale_qty

remnant

code

color

1

M. grandiflora

                Southern Magnolia              

80

15

E

White

2

M. Campbellii               

 

.

.

 

 

3

M. Liliflora

                Lily Magnolia              

12

4

D

Purple

4

M. soulangiana

                Saucer Magnolia              

25

3

D

Pink

5

M. Stellata

                Star Magnolia              

10

3

D

White

data flowers;
infile'/folders/myfolders/magnolia.txt' missover;
input name $ 1-14 sci_name $ 15-35 sale_qty 36-38  remnant 39-40 code$ 41-42 color $;
run;
proc print data=flowers;
run;

 

 

Any help would be greatly appreciated.

 

Thanks,

Contributor
Posts: 33

Re: request for a solution

I would try the truncover option either in place or with the missover option. You may need an lrecl option also if that doesn't work.

Occasional Contributor
Posts: 6

Re: request for a solution

Hi, truncover option yielded the same result as missover.

thank you though.

Contributor
Posts: 33

Re: request for a solution

Sorry it didn't work! I just double checked and the last infile I had issues with apparently I used all three options.

 

infile Main LRECL=800 MISSOVER TRUNCOVER ; 
Valued Guide
Posts: 505

Re: request for a solution

Looks like you have overlapping ranges.
Also note that 'list;' statement can help with ranges


data tst;
*input name $ 1-14 sci_name $ 15-35 sale_qty 36-38  remnant 39-40 code$ 41-42 color $;
input name $ 1-14 sci_name $ 15-40 sale_qty 41-45  remnant 46-50 code$ 51-53 color $;
list;
cards4;
M. grandiflora      Southern Magnolia    80   15   E  White
M. Campbellii                            80   20   D  Rose
M. Liliflora        Lily Magnolia        12    4   D  Purple
M. soulangiana      Saucer Magnolia      25    3   D  Pink
M. Stellata Star    Magnolia             10    3   D  White
;;;;
run;quit;

----+----1----+----2----+----3----+----4----+----5----+----6--- M. grandiflora Southern Magnolia 80 15 E White M. Campbellii 80 20 D Rose M. Liliflora Lily Magnolia 12 4 D Purple M. soulangiana Saucer Magnolia 25 3 D Pink M. Stellata Star Magnolia 10 3 D White
Valued Guide
Posts: 505

Re: request for a solution

Note if you use the more powerful old text editor you can type 'cols' in the prefix area to get a ruler or use fslist with nums on and hex on. You can even use ths command file to find position of tabs or any other non-printable characters, ie f '09'x. I find these and the many other old text editor type functions that are not available in any of the newer somewhat crippled editors.
Occasional Contributor
Posts: 6

Re: request for a solution

Hi there,

Thank you very much.

It worked when I did as per your  suggestion  using  the cards/datalines option.But how do I get the same result using the infile option?

thanks in advance.

Super User
Posts: 17,775

Re: request for a solution

Can you attach the txt file? Hard to say otherwise. 

Occasional Contributor
Posts: 6

Re: request for a solution

Hi Reeza,

 

Please find attached the text file.

 

thanks

Trusted Advisor
Posts: 1,368

Re: request for a solution

Did you mean:

    input @1    name            $14.  

              @15 sci_name     $20.

             @36 sale_qty           3.

             @39 remnant           2.

             @41color               $

      ;

 

 

 

Occasional Contributor
Posts: 6

Re: request for a solution

Did not work that way too.Thank you though.

Contributor
Posts: 33

Re: request for a solution

[ Edited ]

I poked around in the text file that was attached earlier in the thread and noticed a mixture of spaces and tabs which required some cleaning up. There are a hundred ways you could choose to clean it but I used a regular expression in notepad++ to throw pipes in there and then used the following code. The pipe delimited file is attached as well.

 

data flowers;
infile 'magnolia.txt' dsd dlm="|" missover truncover lrecl=999;
input
name  :$16.0
sci_name :$20.0
sale_qty :3.0
remnant :3.0
code :$2.0
color :$15.0;
run;
proc print data=flowers;
run;

 

Not sure if that's what you were after, or if you were looking for SAS code to work around the dirty file (I'm not sure if that's possible).

Trusted Advisor
Posts: 1,368

Re: request for a solution

Your code with slight cahange works fine:

 

data flowers;
infile'/folders/myfolders/magnolia.txt'  dlm='|' truncover;
input name $ 1-14 sci_name $ 15-35 sale_qty 36-38  remnant 39-40 code$ 41-42 color $;
run;
proc print data=flowers;
run;

 

Community Manager
Posts: 2,757

Re: request for a solution

Hi @inquistive

 

Your data file isn't well formed -- it doesn't have predictable spacing between data values nor a reliable delimiter.  Here's what it looks like with all of the characters visible.  Notice line 2 has 4 tabs instead of a "sci_name" value, and the "sale_qty" value for that line falls at column 29 instead of 36 (per your program code).

 

inq.png

 

@rogerjdeangelis actually had a good suggestion to use the LIST statement to see the column positions.  When you do that, it's easy to see that line 2 is not placed the way you expect.

 

32         data flowers;
33         infile 'c:\temp\magnolia.txt' missover;
34         input line $ 256;
35         list;
36         run;

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0                      

1   CHAR  M. grandiflora .Southern Magnolia .80 15 E White 48
    ZONE  422676666666762056776676246666666203323324256676
    NUMR  DE0721E496CF21093F54852E0D17EFC91098001505078945

2   CHAR  M. Campbellii... .80 20 D Rose 30
    ZONE  422466766666600020332332425676
    NUMR  DE031D025CC9999909800200402F35

3   CHAR  M. Liliflora   .Lily Magnolia     .12  4 D Purple 49
    ZONE  4224666666762220466724666666622222033223242577766
    NUMR  DE0C9C96CF210009C9C90D17EFC91000009120040400520C5

4   CHAR  M. soulangiana .Saucer Magnolia   .25  3 D Pink 47
    ZONE  42276766666666205676672466666662220332232425666
    NUMR  DE03F5C1E791E1093153520D17EFC9100092500304009EB

5   CHAR  M. Stellata    .Star Magnolia     .10  3 D White 48
    ZONE  422576666762222057672466666662222203322324256676
    NUMR  DE0345CC1410000934120D17EFC910000091000304078945

You can still use SAS to read this file as is, but should you?  I see that you are using SAS University Edition and maybe you are working through a learning exercise.  Is the purpose of this to learn how to write code to parse irregular data?  Or was there a transcription error in the process of creating the input file?  This community can help you either way, but we need more information about your objective.

Occasional Contributor
Posts: 6

Re: request for a solution

[ Edited ]

Hello Chris,

 

Thank you for your suggestion.

Truly, I am working through a learning exercise on UE.But the codes provided by all the community members are not yielding the ultimate result expected.

Of course,  line 2 has a missing (denoted with blank ) variable character value ( 'sci_name') and the 'sale_qty' (80) is also rightly under and over other corresponding values in the preceding and following values in the text file as in my orginal post. There is an attached  file shared with Reeza too if you would like to see.

Anyway, I am overwhelmed by the community support received and look forward to keeping in touch.

 

 

 

Ask a Question
Discussion stats
  • 16 replies
  • 669 views
  • 1 like
  • 7 in conversation