About asiddiqui

asiddiqui · 04-03-2014

You can also do it in SAS if it makes it easier for you.

proc sort data=a; by ID; run;
proc sort data=b; by ID; run;

data abc;
  merge a(in=a) b;
  by ID;
  if a=1;
run;
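To make the effect of the merge above easier to see, here is a minimal sketch with made-up data. The variables x and y, their values, and the flag names in_a and in_b are assumptions for illustration only; the flag is renamed from a to in_a just to keep it visually distinct from the dataset name.

```sas
/* Made-up example data: a and b share only some ID values. */
data a;
    input ID x;
    datalines;
1 10
2 20
3 30
;
run;

data b;
    input ID y;
    datalines;
2 200
3 300
4 400
;
run;

/* MERGE with BY needs both inputs sorted by the BY variable. */
proc sort data=a; by ID; run;
proc sort data=b; by ID; run;

data abc;
    merge a(in=in_a) b;   /* in_a is 1 when the current ID is present in a */
    by ID;
    if in_a;              /* keep every row of a, matched to b where possible */
run;

proc print data=abc; run;
```

With this made-up data, abc should hold IDs 1, 2 and 3, with y missing for ID 1, because ID 4 exists only in b. If the goal is rows that are in a and not in b, adding (in=in_b) to b and changing the filter to "if in_a and not in_b;" would keep only ID 1.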

asiddiqui · 08-21-2012

Thank you PGStats and MikeZdeb, your code works perfectly as intended with Mike's and my dummy input files, but when I run it on my actual file (image below) it's not reading all the sequences when I use proc print. I was not able to figure out why. Then it struck me that maybe it's something with my "proc print" output settings, so I used ODS to put the output in a PDF file. This time it read all my sequences, but with spaces between the different sequences... hmm. I then used ODS to HTML and boom, all looks good (but I can't explain why). THANK YOU PGStats and MikeZdeb for your help and valuable time. Love this forum.

[Image: Input file]
[Image: Incorrect output with spaces with ods pdf]
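One way to tell whether the full sequences are stored, independent of how any output destination displays them, is to check the stored lengths directly. This is a minimal sketch; the dataset name seqs and the variable names Desc and Sequence are assumptions, not taken from the actual program.

```sas
/* Hypothetical dataset and variable names: seqs, Desc, Sequence. */
data _null_;
    set seqs;
    seq_len = length(Sequence);   /* length excluding trailing blanks */
    put Desc= seq_len=;           /* written to the SAS log */
run;

proc contents data=seqs; run;     /* shows the declared length of each variable */
```

If seq_len matches the expected number of characters, the data is intact and the difference is only in how the output is rendered.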

asiddiqui · 08-20-2012

Thanks y'all, both responses work, but the sequence is being read only up to 107 characters and not beyond. My input file has a sequence of 1302 characters.

asiddiqui · 08-20-2012

I have an input file with a sequence in FASTA format. It begins with a single-line description, followed by lines of sequence data. The description line (defline) is distinguished from the sequence data by a greater-than (">") symbol at the beginning and is shorter than 80 characters in length. The data is divided into sets of 50 characters each, on multiple lines extending up to 1400 characters.

>gi|5524211 gb AAD44166.1 cytochrome b
LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFW
GATVITNLFSAIPYIGTNLVEWIWGGFSVDKATLNRFFAFHFILPFTMVA
LAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLGLLILILLLLL
LALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGV
LALFLSIVILGLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQ
PVEYPYTIIGQMASILYFSIILAFLPIAGXIENY

My question: When I read the input file into a dataset, I created two columns, "Desc" and "Sequence". I need my dataset to have one Desc row and one Sequence row, but the sequence is getting divided up into multiple rows, as follows. I am looking for help either cleaning the LFCR as I create the dataset or concatenating the rows after the dataset is created. PLEASE HELP

Obs  Desc                                   Sequence
----------------------------------------------------------------------------------------------
  1  gi|5524211 gb AAD44166.1 cytochrome b  LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFW
  2                                         GATVITNLFSAIPYIGTNLVEWIWGGFSVDKATLNRFFAFHFILPFTMVA
  3                                         LAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLGLLILILLLLL
  4                                         LALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGV
  5                                         LALFLSIVILGLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQ
  6                                         PVEYPYTIIGQMASILYFSIILAFLPIAGXIENY
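One possible way to get one row per FASTA record is to concatenate the sequence lines while reading, and write a row only when the next defline (or the end of the file) is reached. The sketch below is not the code from the replies in the thread. The file name sequences.fasta, the dataset name seqs, the work variable line, and the lengths $80, $200 and $2000 are all assumptions chosen to fit the description above (deflines under 80 characters, sequences up to about 1400 characters).

```sas
/* Hypothetical path; replace with the real FASTA file. */
filename fasta "sequences.fasta";

data seqs;
    infile fasta truncover end=last;
    /* Declare lengths up front so Sequence is wide enough for the whole record. */
    length Desc $80 Sequence $2000 line $200;
    retain Desc Sequence;          /* carry values across input lines */
    keep Desc Sequence;

    input line $char200.;

    if substr(line, 1, 1) = '>' then do;
        /* A new defline: write out the previous record, if there is one. */
        if not missing(Desc) then output;
        Desc = substr(line, 2);    /* drop the leading ">" */
        Sequence = '';
    end;
    else Sequence = cats(Sequence, line);   /* append, stripping blanks */

    /* Write the final record once the last line has been processed. */
    if last and not missing(Desc) then output;
run;
```

The explicit length statement matters here: without it, a character variable takes its length from its first use, which can silently truncate a long sequence.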

Date Last Visited	‎09-01-2015 07:11 AM

Re: 'If A and not B' equivalent in proc sql

Re: reading fasta file into dataset

Re: reading fasta file into dataset

reading fasta file into dataset
