RebeccaFaye
Calcite | Level 5

Hi there! 

I need help using the INPUT statement with line pointers. When I run my code, it seems to leave out the last subject in my dataset. We are using a txt file that has 70 subjects, each with baseline and post-intervention systolic blood pressure and weight values. I've created 69 rows that have sbp1, sbp2, wt1, and wt2, but the 70th subject seems to be missing. Any help would be appreciated.

 

CODE: 

 

data HW2.hw2_saved;
infile 'C:\Users\rebec\Documents\BS 805\HW 2\HW2data.txt';
input #1 id 1-2 drug 3 sbp1 4-11 wt1 12-19 @@
#2 sbp2 4-11 wt2 12-19 @@
#3;
run;

Tom
Super User

Your math on the number of lines seems off.  If you have 69 subjects and are reading 3 lines each time because of the #3 in the INPUT statement, then your file needs to have 69*3 = 207 lines, not 70 lines.

Also why do you have the @@ in the input statement?  What do you think that is trying to do?  

RebeccaFaye
Calcite | Level 5
Thanks for the reply!
So I should have 70 subjects and 70 lines. Originally it had 140 lines (two for each subject) that had before and after sbp and weight. I'm trying to create a single row for each subject that has their baseline and post-intervention values for both sbp and weight. My code above seems to do this, but it just doesn't include my final subject (subject 70). I have 69 lines currently.
My understanding is that when "@@" is included, the pointer is not moved to the next row until there is no more data on the current row.
Tom
Super User

The @@ is for when you want to keep reading from the same input line across iterations of the data step.  It does not make any sense to combine that with the #n line pointer control, which reads multiple lines at a time.
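
To illustrate the intended use of @@, here is a minimal sketch with made-up inline data (the data set name pairs and the values are hypothetical, not from your file). Each physical line holds several id/sbp pairs, and the trailing @@ keeps the line available for the next iteration of the data step:

```sas
/* Hypothetical example: several (id, sbp) pairs per physical line. */
data pairs;
  /* The trailing @@ holds the current line across data step          */
  /* iterations, so each iteration reads the next pair from that line. */
  input id sbp @@;
  datalines;
1 159.6 2 165.4 3 171.5
4 165.5 5 159.4
;
run;
```

This should create one observation per pair (five here), which is the opposite situation from a file where one subject is spread over several lines.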

 

It is hard to tell from your description what you are doing, but it does sound like you are creating both the source text file and the program to read it. In that case you can design the text file so that it is easy to read.  It is much easier to read it with a list-mode INPUT statement (no column numbers, line pointers, or cursor movement needed).  Just separate the values with spaces, and if there are any missing values, type a period.

 

If you have repeated measures it might be easier to first read them into multiple observations.  Include a variable to indicate which timepoint (baseline, etc) this particular line of data represents.  You can then later use code to collapse the multiple observations for a subject into a single observation (if you need to).

 

Example:

id visit date sbp dbp
1 baseline 2020/01/01 130 80
1 visit1 2020/01/05 135 85
2 baseline 2020/01/02 120 70
2 visit1 2020/01/04 . .

Then to read that just do something like:

data want;
  infile 'myfile.txt' truncover firstobs=2;
  length id 8 visit $20 date sbp dbp 8;
  informat date yymmdd.;
  format date yymmdd10.;
  input id visit date sbp dbp;
run;
RebeccaFaye
Calcite | Level 5

Yeah, I think there is some miscommunication. I'm using this text file, which has 70 subjects. id is two columns wide, drug is 1 column wide, systolic is 8.4 and weight is 8.4. As you can see below, there are two values of sbp and wt for each subject. I'm trying to code it so that each subject has wt1, wt2, sbp1, and sbp2. The code I have in my question does this except for the last person (subject 70).

 

1 0159.5681326.4100
1 0151.9329293.6876
2 0165.4389300.5860
2 0150.1996283.7959
3 0171.4973280.8458
3 0151.1499259.6988
4 0165.4576286.6310
4 0141.5292266.9363
5 0159.3755334.0693
5 0142.7102321.2893
6 0166.5624293.6374
6 0132.2037283.6696
7 0169.0907343.9146
7 0141.9615323.3449
8 0163.5120304.5511
8 0152.7732290.4707
9 0166.2658261.5245
9 0149.9705230.7379
100169.7736304.6510
100156.5528276.4445
110169.1036304.9042
110152.4192286.9123
120170.2859330.0572
120157.6702303.2747
130161.8547300.8148
130142.0642271.2437
140162.5493296.7611
140157.0179261.5387
150172.0278326.7975
150152.6488291.0720
160163.7540293.8948
160141.5212273.8226
170162.3134320.7320
170152.7504290.6913
180159.2775271.4223
180132.4960240.5594
190167.2594303.5583
190157.9902286.2519
200170.7307296.1399
200169.6361266.6828
210162.1713333.3012
210159.1371306.1875
220162.3564296.5399
220156.5967263.7341
230152.7704274.2653
230150.7561240.2737
240173.6744259.8890
240152.2780243.6764
250166.0481334.5533
250157.7285311.3080
260160.4775301.0612
260142.2116274.6485
270166.6046343.6577
270136.9973311.4744
280167.1002264.2434
280156.1143246.0969
290169.7849303.6547
290142.9119286.8014
300166.4623319.8161
300142.2134283.0690
310166.7051283.6408
310153.9403260.3265
320162.9531299.2106
320152.0810271.1111
330166.0902343.9262
330156.6240311.2010
340166.8419296.1088
340147.9662281.9905
350162.2641294.7835
350156.2530274.8917
361162.8022311.7296
361128.1358296.0558
371164.2243330.0920
371135.2787311.0607
381162.7693281.4027
381128.2672260.5932
391166.3379330.8296
391125.1558303.4728
401166.3239306.0454
401134.7172279.9112
411161.6041323.5907
411131.2541300.7474
421161.1067263.3350
421117.9163234.8007
431168.4125263.4290
431127.6125246.3988
441172.6001316.2644
441151.8035284.2606
451162.5576299.5690
451134.8550274.7804
461161.2977346.9898
461151.0471321.8582
471166.0088254.4366
471141.5393246.7747
481168.6832326.1522
481117.7683290.8532
491168.2362260.2254
491145.7905221.4781
501168.3405286.7257
501138.5886251.2554
511161.8059320.6078
511135.8792306.8499
521162.6127290.8030
521117.5527259.0737
531162.2732330.1135
531137.8331293.8114
541165.2403320.4246
541151.1756300.1615
551169.9302310.9467
551145.9754280.5010
561161.4438309.0915
561144.2470286.4452
571161.9408279.3986
571137.3766263.4707
581168.0201279.5537
581127.7842251.8098
591168.7627279.2311
591117.7954253.5130
601163.1336319.3774
601146.4423294.7547
611161.4851289.2949
611137.9098260.5804
621160.7298263.7955
621127.0981241.2411
631168.2941326.0085
631127.4673296.0329
641168.7856334.6582
641131.4100313.1587
651168.0534323.0171
651127.9741293.9351
661161.5909313.5617
661136.0589293.9657
671162.8097306.7994
671135.2356280.4640
681165.3995313.8665
681117.7197294.2240
691161.5544336.9836
691138.0893324.4739
701165.2871291.7631
701144.2423264.4141

Tom
Super User

Quick answer: your program was reading 3 lines at a time instead of the 2 lines per subject that your data uses. So when it tried to read the last observation, the data step stopped at the INPUT statement because it read past the end of the input file, and the last observation was never written to the output dataset.
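
As a minimal sketch of that fix, assuming the two-lines-per-subject layout you describe: keep the column ranges from your original INPUT statement but drop the #3 pointer and the @@ holds, so each iteration reads exactly two lines. The data set name hw2_fixed is just a placeholder.

```sas
/* Sketch only: same columns as the original post, two lines per subject. */
data hw2_fixed;
  infile 'C:\Users\rebec\Documents\BS 805\HW 2\HW2data.txt';
  input #1 id 1-2 drug 3 sbp1 4-11 wt1 12-19   /* line 1: baseline          */
        #2 sbp2 4-11 wt2 12-19;                /* line 2: post-intervention */
run;
```

With 140 lines in the file this should give 70 observations, because the highest line pointer (#2) determines how many lines are read per iteration.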

 

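Personally I would just read the file line by line.  You can add a new variable to indicate which line it is for a particular id.  

data want;
  infile txt ;  /* txt is a fileref that points to the text file */
  input id $2. drug $1. sbp 8. dbp 8. ;
  /* rep counts the lines within each id: 1 = first line, 2 = second line */
  if id ne lag(id) then rep=1;
  else rep+1;
run;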

Results:

NOTE: 140 records were read from the infile TXT.
      The minimum record length was 19.
      The maximum record length was 19.
NOTE: The data set WORK.WANT has 140 observations and 5 variables.

First few observations:

Obs    id    drug      sbp        dbp      rep

  1    1      0      159.568    326.410     1
  2    1      0      151.933    293.688     2
  3    2      0      165.439    300.586     1
  4    2      0      150.200    283.796     2
  5    3      0      171.497    280.846     1
  6    3      0      151.150    259.699     2

There is perhaps something wrong here, as those pressure readings look wrong.  You said that the values were 8 positions wide.  But perhaps the values are something other than human blood pressure readings?
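
If you read the file line by line as above, the two rows per id can later be collapsed into one. The following is only a sketch: it assumes the WANT data set from the line-by-line step, that the two lines for each id are adjacent and in baseline/post order, and that the column read as dbp is really the weight value described earlier in the thread. The names wide, sbp1, wt1, sbp2, and wt2 are chosen here for illustration.

```sas
/* Sketch: collapse the rep=1 / rep=2 rows of WANT into one row per id. */
data wide;
  set want;
  retain sbp1 wt1;              /* carry the first-line values forward */
  if rep = 1 then do;
    sbp1 = sbp;
    wt1  = dbp;                 /* second 8-wide field, assumed to be weight */
  end;
  else if rep = 2 then do;
    sbp2 = sbp;
    wt2  = dbp;
    output;                     /* write one row once both lines are seen */
  end;
  keep id drug sbp1 wt1 sbp2 wt2;
run;
```

Because this relies on row order rather than a BY statement, it does not need the data to be sorted by id.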

 

If you did want to read two lines at once you could just use the / cursor movement to go to the next line.

data want;
  infile txt ;
  input id $2. drug $1. sbp 8. dbp 8. 
      / id2 $2. drug2 $1. spb2 8. dbp2 8.
  ;
  if id ne id2 or drug ne drug2 then do;
    put 'ERROR: There is a mismatch between the two lines. ' _n_= id= id2= drug= drug2= ;
    delete;
  end;
run;

Results:

NOTE: 140 records were read from the infile TXT.
      The minimum record length was 19.
      The maximum record length was 19.
NOTE: The data set WORK.WANT has 70 observations and 8 variables.

Values:

Obs    id    drug      sbp        dbp      id2    drug2      spb2       dbp2

  1    1      0      159.568    326.410     1       0      151.933    293.688
  2    2      0      165.439    300.586     2       0      150.200    283.796
  3    3      0      171.497    280.846     3       0      151.150    259.699
  4    4      0      165.458    286.631     4       0      141.529    266.936
  5    5      0      159.376    334.069     5       0      142.710    321.289
  6    6      0      166.562    293.637     6       0      132.204    283.670

 

ballardw
Super User

Or: I use data lines but the input statement should work for your file as well. The / says "read from next line" in effect.

data example;
input  id 1-2 drug 3 sbp1 4-11 wt1 12-19 
     / sbp2 4-11 wt2 12-19
;
datalines;
1 0159.5681326.4100
1 0151.9329293.6876
2 0165.4389300.5860
2 0150.1996283.7959
3 0171.4973280.8458
3 0151.1499259.6988
4 0165.4576286.6310
4 0141.5292266.9363
5 0159.3755334.0693
5 0142.7102321.2893
6 0166.5624293.6374
6 0132.2037283.6696
7 0169.0907343.9146
7 0141.9615323.3449
8 0163.5120304.5511
8 0152.7732290.4707
9 0166.2658261.5245
9 0149.9705230.7379
;

As an aside, it is a good idea to post examples of text files into a code box opened using the forum's {I} icon, as the message window will often remove white space, which could change the column positions if there are multiple adjacent spaces.

And if there are tab characters you need to tell us as they are not visible and it may not be possible to tell from the appearance of the text.
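
One way to check for tabs yourself is sketched below. It assumes the file is called 'myfile.txt' (a placeholder path) and simply writes a message to the log for each line that contains a tab character:

```sas
/* Sketch: report lines of the raw file that contain a tab character. */
data _null_;
  infile 'myfile.txt';          /* placeholder path */
  input;                        /* load the line into _INFILE_ without reading variables */
  if index(_infile_, '09'x) > 0 then
    put 'Tab character found on line ' _n_;
run;
```

If tabs do turn up, the EXPANDTABS option on the INFILE statement may help when reading by column position.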
