Can someone please help me read this kind of space-delimited text file into SAS 9.2? It's PLINK output (PLINK is a whole-genome data analysis toolset).
I've been trying to use PROC IMPORT without luck.
proc import datafile="C:\Users\Adam Bress\Downloads\plink.assoc.linear.txt" out=gwas dbms=dlm replace;
datarow=2;
getnames=yes;
run;
I've attached part of the data.
The data looks like this. It's a space-delimited text file with about 4 million observations.
CHR SNP BP A1 TEST NMISS BETA STAT P
1 rs28659788 713170 G ADD 156 -3.549 -0.8538 0.3946
1 rs28659788 713170 G Age 156 -0.006926 -0.1199 0.9048
1 rs28659788 713170 G UV 156 -1.725 -1.369 0.1729
1 rs28659788 713170 G c1 156 10.05 2.434 0.01611
1 rs3094315 742429 T ADD 160 -0.3026 -0.3673 0.7139
1 rs3094315 742429 T Age 160 -0.02911 -0.5245 0.6007
1 rs3094315 742429 T UV 160 -1.579 -1.244 0.2153
1 rs3094315 742429 T c1 160 10.2 2.459 0.01502
1 rs3131972 742584 C ADD 160 -0.5456 -0.6454 0.5196
1 rs3131972 742584 C Age 160 -0.02799 -0.5048 0.6144
1 rs3131972 742584 C UV 160 -1.514 -1.189 0.2361
1 rs3131972 742584 C c1 160 10.18 2.46 0.01501
1 rs3131969 744045 C ADD 156 -0.6604 -0.7913 0.43
1 rs3131969 744045 C Age 156 -0.02938 -0.5326 0.5951
1 rs3131969 744045 C UV 156 -1.203 -0.9396 0.3489
1 rs3131969 744045 C c1 156 9.547 2.328 0.02123
1 rs12562034 758311 A ADD 157 -1.785 -1.208 0.2289
1 rs12562034 758311 A Age 157 -0.02333 -0.4124 0.6807
1 rs12562034 758311 A UV 157 -1.495 -1.17 0.244
1 rs12562034 758311 A c1 157 10.14 2.438 0.01593
1 rs12124819 766409 G ADD 160 -1.021 -0.5403 0.5898
1 rs12124819 766409 G Age 160 -0.02723 -0.4906 0.6244
1 rs12124819 766409 G UV 160 -1.678 -1.331 0.1852
If you have errors in the log, post them.
What do you mean by "without luck"? Was no data set created, or are the contents wrong, missing, or unexpected?
If the problem is unexpected data types in the output, try adding this statement to the PROC IMPORT step:
guessingrows=32767;
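For reference, a minimal sketch of where that statement would go, reusing the path and options from the original post. GUESSINGROWS only changes how many rows PROC IMPORT scans to guess variable types and lengths, so it should not be expected to change how the delimiter is interpreted.

```sas
/* Sketch only: same step as in the question, with GUESSINGROWS added. */
proc import datafile="C:\Users\Adam Bress\Downloads\plink.assoc.linear.txt"
            out=gwas dbms=dlm replace;
  getnames=yes;
  datarow=2;
  guessingrows=32767;  /* scan more rows before deciding types and lengths */
run;
```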
Thanks for the reply.
The error I get is below (it's very long).
Thanks in advance for your help.
634 proc import datafile="C:\Users\Adam Bress\Downloads\plink.assoc.linear.txt" out=gwas dbms=dlm replace;
635 datarow=2;
636 getnames=yes;
637 run;
Number of names found is greater than number of variables found.
Name is not a valid SAS name. (This message appears 45 times in the log.)
Problems were detected with provided names. See LOG.
638 /**********************************************************************
639 * PRODUCT: SAS
640 * VERSION: 9.2
641 * CREATOR: External File Interface
642 * DATE: 07MAY13
643 * DESC: Generated SAS Datastep Code
644 * TEMPLATE SOURCE: (None Specified.)
645 ***********************************************************************/
646 data WORK.GWAS ;
647 %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
648 infile 'C:\Users\Adam Bress\Downloads\plink.assoc.linear.txt' delimiter = ' ' MISSOVER DSD lrecl=32767 firstobs=2 ;
649 informat VAR1 $1. ;
650 informat CHR $1. ;
651 informat VAR3 $1. ;
652 informat VAR4 best32. ;
653 informat VAR5 $1. ;
654 informat VAR6 $1. ;
655 informat VAR7 $10. ;
656 informat VAR8 $12. ;
657 informat VAR9 $1. ;
658 informat VAR10 $1. ;
659 informat VAR11 $1. ;
660 informat SNP best32. ;
661 informat VAR13 best32. ;
662 informat VAR14 $1. ;
663 informat VAR15 $1. ;
664 informat VAR16 $1. ;
665 informat VAR17 $1. ;
666 informat VAR18 $1. ;
667 informat VAR19 $1. ;
668 informat VAR20 $1. ;
669 informat BP $1. ;
670 informat VAR22 $1. ;
671 informat VAR23 $1. ;
672 informat A1 $3. ;
673 informat VAR25 $4. ;
674 informat VAR26 $3. ;
675 informat VAR27 $1. ;
676 informat VAR28 $1. ;
677 informat VAR29 $1. ;
678 informat VAR30 best32. ;
679 informat TEST best32. ;
680 informat VAR32 best32. ;
681 informat VAR33 best32. ;
682 informat VAR34 best32. ;
683 informat NMISS best32. ;
684 informat VAR36 best32. ;
685 informat VAR37 best32. ;
686 informat VAR38 best32. ;
687 informat VAR39 best32. ;
688 informat VAR40 best32. ;
689 informat VAR41 best32. ;
690 informat BETA best32. ;
691 informat VAR43 best32. ;
692 informat VAR44 best32. ;
693 informat VAR45 best32. ;
694 informat VAR46 best32. ;
695 informat VAR47 best32. ;
696 informat VAR48 best32. ;
697 informat VAR49 best32. ;
698 informat VAR50 best32. ;
699 informat STAT best32. ;
700 informat VAR52 best32. ;
701 informat VAR53 best32. ;
702 format VAR1 $1. ;
703 format CHR $1. ;
704 format VAR3 $1. ;
705 format VAR4 best12. ;
706 format VAR5 $1. ;
707 format VAR6 $1. ;
708 format VAR7 $10. ;
709 format VAR8 $12. ;
710 format VAR9 $1. ;
711 format VAR10 $1. ;
712 format VAR11 $1. ;
713 format SNP best12. ;
714 format VAR13 best12. ;
715 format VAR14 $1. ;
716 format VAR15 $1. ;
717 format VAR16 $1. ;
718 format VAR17 $1. ;
719 format VAR18 $1. ;
720 format VAR19 $1. ;
721 format VAR20 $1. ;
722 format BP $1. ;
723 format VAR22 $1. ;
724 format VAR23 $1. ;
725 format A1 $3. ;
726 format VAR25 $4. ;
727 format VAR26 $3. ;
728 format VAR27 $1. ;
729 format VAR28 $1. ;
730 format VAR29 $1. ;
731 format VAR30 best12. ;
732 format TEST best12. ;
733 format VAR32 best12. ;
734 format VAR33 best12. ;
735 format VAR34 best12. ;
736 format NMISS best12. ;
737 format VAR36 best12. ;
738 format VAR37 best12. ;
739 format VAR38 best12. ;
740 format VAR39 best12. ;
741 format VAR40 best12. ;
742 format VAR41 best12. ;
743 format BETA best12. ;
744 format VAR43 best12. ;
745 format VAR44 best12. ;
746 format VAR45 best12. ;
747 format VAR46 best12. ;
748 format VAR47 best12. ;
749 format VAR48 best12. ;
750 format VAR49 best12. ;
751 format VAR50 best12. ;
752 format STAT best12. ;
753 format VAR52 best12. ;
754 format VAR53 best12. ;
755 input
756 VAR1 $
757 CHR $
758 VAR3 $
759 VAR4
760 VAR5 $
761 VAR6 $
762 VAR7 $
763 VAR8 $
764 VAR9 $
765 VAR10 $
766 VAR11 $
767 SNP
768 VAR13
769 VAR14 $
770 VAR15 $
771 VAR16 $
772 VAR17 $
773 VAR18 $
774 VAR19 $
775 VAR20 $
776 BP $
777 VAR22 $
778 VAR23 $
779 A1 $
780 VAR25 $
781 VAR26 $
782 VAR27 $
783 VAR28 $
784 VAR29 $
785 VAR30
786 TEST
787 VAR32
788 VAR33
789 VAR34
790 NMISS
791 VAR36
792 VAR37
793 VAR38
794 VAR39
795 VAR40
796 VAR41
797 BETA
798 VAR43
799 VAR44
800 VAR45
801 VAR46
802 VAR47
803 VAR48
804 VAR49
805 VAR50
806 STAT
807 VAR52
808 VAR53
809 ;
810 if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
811 run;
NOTE: The infile 'C:\Users\Adam Bress\Downloads\plink.assoc.linear.txt' is:
Filename=C:\Users\Adam Bress\Downloads\plink.assoc.linear.txt,
RECFM=V,LRECL=32767,
File Size (bytes)=381808598,
Last Modified=07May2013:15:53:16,
Create Time=07May2013:15:53:10
NOTE: Invalid data for VAR38 in line 6266 63-64.
NOTE: Invalid data for VAR49 in line 6266 76-77.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
6266 1 rs10489133 4573797 0 ADD 160 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7=rs10489133 VAR8= VAR9= VAR10= VAR11=4 SNP=. VAR13=. VAR14= VAR15=0 VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22= VAR23=A A1= VAR25= VAR26= VAR27= VAR28= VAR29=1 VAR30=. TEST=. VAR32=. VAR33=.
VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=. VAR49=.
VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=6265
NOTE: Invalid data for VAR38 in line 6267 63-64.
NOTE: Invalid data for VAR49 in line 6267 76-77.
6267 1 rs10489133 4573797 0 Age 160 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7=rs10489133 VAR8= VAR9= VAR10= VAR11=4 SNP=. VAR13=. VAR14= VAR15=0 VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22= VAR23=A A1= VAR25= VAR26= VAR27= VAR28= VAR29=1 VAR30=. TEST=. VAR32=. VAR33=.
VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=. VAR49=.
VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=6266
NOTE: Invalid data for VAR39 in line 6268 63-64.
NOTE: Invalid data for VAR50 in line 6268 76-77.
6268 1 rs10489133 4573797 0 UV 160 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7=rs10489133 VAR8= VAR9= VAR10= VAR11=4 SNP=. VAR13=. VAR14= VAR15=0 VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22= VAR23= A1=UV VAR25= VAR26= VAR27= VAR28= VAR29= VAR30=160 TEST=. VAR32=.
VAR33=. VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=.
VAR49=. VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=6267
NOTE: Invalid data for VAR39 in line 6269 63-64.
NOTE: Invalid data for VAR50 in line 6269 76-77.
6269 1 rs10489133 4573797 0 c1 160 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7=rs10489133 VAR8= VAR9= VAR10= VAR11=4 SNP=. VAR13=. VAR14= VAR15=0 VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22= VAR23= A1=c1 VAR25= VAR26= VAR27= VAR28= VAR29= VAR30=160 TEST=. VAR32=.
VAR33=. VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=.
VAR49=. VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=6268
NOTE: Invalid data for VAR38 in line 10186 63-64.
NOTE: Invalid data for VAR49 in line 10186 76-77.
10186 1 rs12065517 6588123 0 ADD 156 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7=rs12065517 VAR8= VAR9= VAR10= VAR11=6 SNP=. VAR13=. VAR14= VAR15=0 VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22= VAR23=A A1= VAR25= VAR26= VAR27= VAR28= VAR29=1 VAR30=. TEST=. VAR32=. VAR33=.
VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=. VAR49=.
VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=10185
NOTE: Invalid data for VAR38 in line 10187 63-64.
NOTE: Invalid data for VAR49 in line 10187 76-77.
10187 1 rs12065517 6588123 0 Age 156 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7=rs12065517 VAR8= VAR9= VAR10= VAR11=6 SNP=. VAR13=. VAR14= VAR15=0 VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22= VAR23=A A1= VAR25= VAR26= VAR27= VAR28= VAR29=1 VAR30=. TEST=. VAR32=. VAR33=.
VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=. VAR49=.
VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=10186
NOTE: Invalid data for VAR39 in line 10188 63-64.
NOTE: Invalid data for VAR50 in line 10188 76-77.
10188 1 rs12065517 6588123 0 UV 156 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7=rs12065517 VAR8= VAR9= VAR10= VAR11=6 SNP=. VAR13=. VAR14= VAR15=0 VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22= VAR23= A1=UV VAR25= VAR26= VAR27= VAR28= VAR29= VAR30=156 TEST=. VAR32=.
VAR33=. VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=.
VAR49=. VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=10187
NOTE: Invalid data for VAR39 in line 10189 63-64.
NOTE: Invalid data for VAR50 in line 10189 76-77.
10189 1 rs12065517 6588123 0 c1 156 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7=rs12065517 VAR8= VAR9= VAR10= VAR11=6 SNP=. VAR13=. VAR14= VAR15=0 VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22= VAR23= A1=c1 VAR25= VAR26= VAR27= VAR28= VAR29= VAR30=156 TEST=. VAR32=.
VAR33=. VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=.
VAR49=. VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=10188
NOTE: Invalid data for VAR37 in line 20342 63-64.
NOTE: Invalid data for VAR48 in line 20342 76-77.
20342 1 rs17039265 12835088 0 ADD 160 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7=rs17039265 VAR8= VAR9= VAR10=1 VAR11= SNP=. VAR13=. VAR14=0 VAR15= VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22=A VAR23= A1= VAR25= VAR26= VAR27= VAR28=1 VAR29= VAR30=. TEST=. VAR32=. VAR33=.
VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=. VAR49=.
VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=20341
NOTE: Invalid data for VAR37 in line 20343 63-64.
NOTE: Invalid data for VAR48 in line 20343 76-77.
20343 1 rs17039265 12835088 0 Age 160 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7=rs17039265 VAR8= VAR9= VAR10=1 VAR11= SNP=. VAR13=. VAR14=0 VAR15= VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22=A VAR23= A1= VAR25= VAR26= VAR27= VAR28=1 VAR29= VAR30=. TEST=. VAR32=. VAR33=.
VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=. VAR49=.
VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=20342
NOTE: Invalid data for VAR38 in line 20344 63-64.
NOTE: Invalid data for VAR49 in line 20344 76-77.
20344 1 rs17039265 12835088 0 UV 160 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7=rs17039265 VAR8= VAR9= VAR10=1 VAR11= SNP=. VAR13=. VAR14=0 VAR15= VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22= VAR23=U A1= VAR25= VAR26= VAR27= VAR28= VAR29=1 VAR30=. TEST=. VAR32=. VAR33=.
VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=. VAR49=.
VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=20343
NOTE: Invalid data for VAR38 in line 20345 63-64.
NOTE: Invalid data for VAR49 in line 20345 76-77.
20345 1 rs17039265 12835088 0 c1 160 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7=rs17039265 VAR8= VAR9= VAR10=1 VAR11= SNP=. VAR13=. VAR14=0 VAR15= VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22= VAR23=c A1= VAR25= VAR26= VAR27= VAR28= VAR29=1 VAR30=. TEST=. VAR32=. VAR33=.
VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=. VAR49=.
VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=20344
NOTE: Invalid data for SNP in line 23674 13-17.
23674 1 rs549 15419412 A ADD 159 -0.05494 -0.0698 0.9444 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7= VAR8= VAR9= VAR10= VAR11= SNP=. VAR13=. VAR14= VAR15=1 VAR16= VAR17=
VAR18= VAR19=A VAR20= BP= VAR22= VAR23= A1= VAR25= VAR26= VAR27=A VAR28= VAR29= VAR30=. TEST=. VAR32=. VAR33=159
VAR34=. NMISS=. VAR36=-0.05494 VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=-0.0698 VAR43=. VAR44=. VAR45=. VAR46=. VAR47=.
VAR48=. VAR49=0.9444 VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=23673
NOTE: Invalid data for SNP in line 23675 13-17.
23675 1 rs549 15419412 A Age 159 -0.03094 -0.5468 0.5853 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7= VAR8= VAR9= VAR10= VAR11= SNP=. VAR13=. VAR14= VAR15=1 VAR16= VAR17=
VAR18= VAR19=A VAR20= BP= VAR22= VAR23= A1= VAR25= VAR26= VAR27=A VAR28= VAR29= VAR30=. TEST=. VAR32=. VAR33=159
VAR34=. NMISS=. VAR36=-0.03094 VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=-0.5468 VAR43=. VAR44=. VAR45=. VAR46=. VAR47=.
VAR48=. VAR49=0.5853 VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=23674
NOTE: Invalid data for SNP in line 23676 13-17.
23676 1 rs549 15419412 A UV 159 -1.662 -1.304 0.1943 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7= VAR8= VAR9= VAR10= VAR11= SNP=. VAR13=. VAR14= VAR15=1 VAR16= VAR17=
VAR18= VAR19=A VAR20= BP= VAR22= VAR23= A1= VAR25= VAR26= VAR27= VAR28=U VAR29= VAR30=. TEST=. VAR32=. VAR33=.
VAR34=159 NMISS=. VAR36=. VAR37=. VAR38=. VAR39=-1.662 VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=-1.304 VAR47=.
VAR48=. VAR49=. VAR50=. STAT=. VAR52=. VAR53=0.1943 _ERROR_=1 _N_=23675
NOTE: Invalid data for SNP in line 23677 13-17.
23677 1 rs549 15419412 A c1 159 10.13 2.434 0.01609 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7= VAR8= VAR9= VAR10= VAR11= SNP=. VAR13=. VAR14= VAR15=1 VAR16= VAR17=
VAR18= VAR19=A VAR20= BP= VAR22= VAR23= A1= VAR25= VAR26= VAR27= VAR28=c VAR29= VAR30=. TEST=. VAR32=. VAR33=.
VAR34=159 NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=10.13 VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=2.434
VAR49=. VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=23676
NOTE: Invalid data for VAR38 in line 31730 63-64.
NOTE: Invalid data for VAR49 in line 31730 76-77.
31730 1 rs2236772 20177372 0 ADD 160 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7= VAR8=rs2236772 VAR9= VAR10= VAR11=2 SNP=. VAR13=. VAR14= VAR15=0 VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22= VAR23=A A1= VAR25= VAR26= VAR27= VAR28= VAR29=1 VAR30=. TEST=. VAR32=. VAR33=.
VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=. VAR49=.
VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=31729
NOTE: Invalid data for VAR38 in line 31731 63-64.
NOTE: Invalid data for VAR49 in line 31731 76-77.
31731 1 rs2236772 20177372 0 Age 160 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7= VAR8=rs2236772 VAR9= VAR10= VAR11=2 SNP=. VAR13=. VAR14= VAR15=0 VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22= VAR23=A A1= VAR25= VAR26= VAR27= VAR28= VAR29=1 VAR30=. TEST=. VAR32=. VAR33=.
VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=. VAR49=.
VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=31730
NOTE: Invalid data for VAR39 in line 31732 63-64.
NOTE: Invalid data for VAR50 in line 31732 76-77.
31732 1 rs2236772 20177372 0 UV 160 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7= VAR8=rs2236772 VAR9= VAR10= VAR11=2 SNP=. VAR13=. VAR14= VAR15=0 VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22= VAR23= A1=UV VAR25= VAR26= VAR27= VAR28= VAR29= VAR30=160 TEST=. VAR32=.
VAR33=. VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=.
VAR49=. VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=31731
NOTE: Invalid data for VAR39 in line 31733 63-64.
NOTE: Invalid data for VAR50 in line 31733 76-77.
WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.
31733 1 rs2236772 20177372 0 c1 160 NA NA NA 90
VAR1= CHR= VAR3= VAR4=1 VAR5= VAR6= VAR7= VAR8=rs2236772 VAR9= VAR10= VAR11=2 SNP=. VAR13=. VAR14= VAR15=0 VAR16=
VAR17= VAR18= VAR19= VAR20= BP= VAR22= VAR23= A1=c1 VAR25= VAR26= VAR27= VAR28= VAR29= VAR30=160 TEST=. VAR32=.
VAR33=. VAR34=. NMISS=. VAR36=. VAR37=. VAR38=. VAR39=. VAR40=. VAR41=. BETA=. VAR43=. VAR44=. VAR45=. VAR46=. VAR47=. VAR48=.
VAR49=. VAR50=. STAT=. VAR52=. VAR53=. _ERROR_=1 _N_=31732
NOTE: 4150092 records were read from the infile 'C:\Users\Adam Bress\Downloads\plink.assoc.linear.txt'.
The minimum record length was 90.
The maximum record length was 91.
NOTE: The data set WORK.GWAS has 4150092 observations and 53 variables.
NOTE: DATA statement used (Total process time):
real time 14.57 seconds
cpu time 12.74 seconds
Here's a clue: your variable name SNP appears in the informat list and INPUT statement as the 12th variable, and BP comes 9 variables after that. I think the column heading line contains something other than the text displayed in your example, possibly a bunch of tabs or null characters.
When I looked at your first example file, it was not space delimited.
I would be tempted to copy the first line of the data file into a plain text editor and see what it looks like. Another option would be to use the SAS FSLIST tool to look at the file. It shows where things appear by column, so you can see the column positions of your headings in relation to the data better than in some other software, though null characters may not be visible.
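Another way to see exactly which characters separate the headings is to dump the first few records to the log with the LIST statement. This is a sketch that assumes the path from the original post. If a record contains unprintable characters such as tabs, LIST adds ZONE and NUMR rows holding the hex digits of each byte, so a tab shows up as 0 over 9.

```sas
/* Sketch only: print the first 3 raw records to the log for inspection. */
data _null_;
  infile "C:\Users\Adam Bress\Downloads\plink.assoc.linear.txt" obs=3;
  input;   /* read the record into the input buffer without creating variables */
  list;    /* write the buffer to the log under a column ruler */
run;
```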
Thanks, Ballard. I have attached the data set to my first post. I'm trying to learn how to deal with the strange space delimiter in the data file.
Do you have a recommendation?
Try importing it as TAB delimited. I opened the file in Word and it shows TABS separating the values.
Thanks, Ballard. I tried importing it as tab delimited and got this error. Any suggestions to get this to work?
915 proc import datafile="C:\Users\Adam Bress\Downloads\plink.assoc.linear" out=gwas dbms=tab replace;
916 datarow=2;
917 getnames=yes;
918 run;
Name CHR SNP BP A1 TEST NMISS BETA STAT P truncated to
_CHR__________SNP_________BP___A.
Problems were detected with provided names. See LOG.
919 /**********************************************************************
920 * PRODUCT: SAS
921 * VERSION: 9.2
922 * CREATOR: External File Interface
923 * DATE: 07MAY13
924 * DESC: Generated SAS Datastep Code
925 * TEMPLATE SOURCE: (None Specified.)
926 ***********************************************************************/
927 data WORK.GWAS ;
928 %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
929 infile 'C:\Users\Adam Bress\Downloads\plink.assoc.linear' delimiter='09'x MISSOVER DSD lrecl=32767 firstobs=2 ;
930 informat _CHR__________SNP_________BP___A $87. ;
931 format _CHR__________SNP_________BP___A $87. ;
932 input
933 _CHR__________SNP_________BP___A $
934 ;
935 if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
936 run;
NOTE: The infile 'C:\Users\Adam Bress\Downloads\plink.assoc.linear' is:
Filename=C:\Users\Adam Bress\Downloads\plink.assoc.linear,
RECFM=V,LRECL=32767,
File Size (bytes)=381808598,
Last Modified=08May2013:10:34:12,
Create Time=07May2013:11:41:15
NOTE: 4150092 records were read from the infile 'C:\Users\Adam Bress\Downloads\plink.assoc.linear'.
The minimum record length was 90.
The maximum record length was 91.
NOTE: The data set WORK.GWAS has 4150092 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 2.92 seconds
cpu time 2.55 seconds
4150092 rows created in WORK.GWAS from C:\Users\Adam Bress\Downloads\plink.assoc.linear.
I think I might have to do something like this, but I'm specifying the variable locations wrong.
Data manhattan;
INFILE "C:\Users\Adam Bress\Downloads\plink.assoc.linear";
INPUT chr 4 snp 8-17 bp 23-28 a1 33 test 42-44 nmiss 51-53 beta 56-66 stat 71-77 p 84-91;
RUN;
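One way to avoid column positions altogether is list input. This is only a sketch, and it rests on an assumption: the tab import log above put the whole heading line into a single variable, which suggests the full file has no tabs and its columns are padded with runs of blanks. The character widths ($20., $4., $12.) are guesses, and CHR is assumed to hold numeric chromosome codes.

```sas
/* Sketch only: blank-delimited list input for the full PLINK file. */
data gwas;
  /* No DLM= or DSD: by default, list input treats a run of blanks as one delimiter */
  infile "C:\Users\Adam Bress\Downloads\plink.assoc.linear" firstobs=2 truncover;
  /* ?? reads invalid values such as NA as missing without log notes or _ERROR_=1 */
  input chr snp :$20. bp a1 :$4. test :$12. nmiss beta ?? stat ?? p ??;
run;
```

Without DSD, consecutive blanks are not read as empty fields, which is what produced the 53 variables in the first PROC IMPORT attempt.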
I didn't have any problem reading the example file with:
proc import datafile="D:\plink.assoc.linear" out=gwas dbms=tab replace;
datarow=2;
getnames=yes;
run;
But I get 1047617 rows from the example file, where your log shows 4150092.
If you try to write your own input you'll need to add the delimiter='09'x at least. Your data isn't fixed columns. Test, Beta, Stat and P all have different lengths/numbers of significant digits.
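A minimal sketch of such a hand-written step for the tab-delimited example file, using list input rather than column positions. The path is the one from the PROC IMPORT step above; the character widths are guesses, and the final DELETE is an optional assumption in case the file contains lines that hold only tabs.

```sas
/* Sketch only: tab-delimited list input for the example file. */
data gwas;
  /* DLM='09'x splits on tabs; DSD reads adjacent tabs as empty (missing) fields */
  infile "D:\plink.assoc.linear" dlm='09'x dsd firstobs=2 truncover;
  input chr snp :$20. bp a1 :$4. test :$12. nmiss beta ?? stat ?? p ??;
  if missing(snp) then delete;  /* optional: drop rows with no SNP identifier */
run;
```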
Which OS are you running? I'm running SAS 9.2.3 under Win7 and could read the file with the above syntax. We may be getting some sort of file conversion through the forum, though.
The file you posted in the second ZIP file (is there a difference between the two?) is a normal TAB delimited DOS file (CR and LF as the line delimiters) with 9 columns.
The first line has the variable names.
The next 7 lines have data.
The remaining 1047610 lines are totally empty except for the tabs.
I can read it easily with PROC IMPORT.
172 options generic;
173 filename xx '~/plink.assoc.linear' termstr=CRLF ;
174 data _null_;
175 infile xx obs=9 ;
176 input;
177 list;
178 run;
NOTE: The infile XX is:
(system-specific pathname),
(system-specific file attributes)
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
1 CHAR CHR.SNP.BP.A1.TEST.NMISS.BETA.STAT.P 36
ZONE 445054504504305455044455044540554505
NUMR 38293E0920911945349ED933925419341490
2 CHAR 1.rs28659788.713170.G.ADD.156.-3.549.-0.8538.0.3946 51
ZONE 307733333333033333304044403330232333023233330323333
NUMR 192328659788971317097914491569D3E5499D0E853890E3946
3 CHAR 1.rs28659788.713170.G.Age.156.-0.006926.-0.1199.0.9048 54
ZONE 307733333333033333304046603330232333333023233330323333
NUMR 192328659788971317097917591569D0E0069269D0E119990E9048
4 CHAR 1.rs28659788.713170.G.UV.156.-1.725.-1.369.0.1729 49
ZONE 3077333333330333333040550333023233302323330323333
NUMR 19232865978897131709795691569D1E7259D1E36990E1729
5 CHAR 1.rs28659788.713170.G.c1.156.10.05.2.434.0.01611 48
ZONE 307733333333033333304063033303323303233303233333
NUMR 1923286597889713170979319156910E0592E43490E01611
6 CHAR 1.rs3094315.742429.T.ADD.160.-0.3026.-0.3673.0.7139 51
ZONE 307733333330333333050444033302323333023233330323333
NUMR 19233094315974242994914491609D0E30269D0E367390E7139
7 CHAR 1.rs3094315.742429.T.Age.160.-0.02911.-0.5245.0.6007 52
ZONE 3077333333303333330504660333023233333023233330323333
NUMR 19233094315974242994917591609D0E029119D0E524590E6007
8 CHAR 1.rs3094315.742429.T.UV.160.-1.579.-1.244.0.2153 48
ZONE 307733333330333333050550333023233302323330323333
NUMR 1923309431597424299495691609D1E5799D1E24490E2153
9 CHAR ........ 8
ZONE 00000000
NUMR 99999999
NOTE: 9 records were read from the infile (system-specific pathname).
The minimum record length was 8.
The maximum record length was 54.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds
179 options nogeneric;