Hello,
I am trying to convert a SAS dataset containing nucleotide sequences to FASTA format. See a sample FASTA-formatted file here: https://github.com/veg/hivtrace/blob/master/test/rsrc/TEST.FASTA
What I have is a table with two columns:
ID Seq
What I need is to write each ID on its own line, followed by the corresponding Seq value on the next line:
>ID1
Seq1
>ID2
Seq2
Any advice is much appreciated.
You don't list the individual values, you list the variable names.
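As a starting point, run this after changing the file path to one that is valid on your system, and see what is generated.
data _null_;
   set sashelp.class;   /* read each row of the input dataset */
   file '/folders/myfolders/demo.txt' linesize=80 flowover;
   put Name;            /* write the first variable on its own line */
   put Age;             /* write the second variable on the next line */
run;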
@HabAM wrote:
Thank you Reeza. The only issue is I have thousands of row values to individually list as suggested.
sample set
ID | Seq
>12121 | aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag
>343434 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>4545e | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>1111 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>2321 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
expected text file:
>12121
aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag
>343434
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>4545e
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>1111
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>2321
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
Thank you
Thank you Reeza. The only issue is I have thousands of row values to individually list as suggested.
sample set
ID | Seq |
>12121 | aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag |
>343434 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
>4545e | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
>1111 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
>2321 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
expected text file:
>12121
aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag
>343434
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>4545e
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>1111
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>2321
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
Thank you
Replace the SET statement in @Reeza's code with one that reads your own SAS dataset, and PUT your own variable names.
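For illustration, the adaptation might look like the sketch below. The dataset name work.seqs and the output path are hypothetical placeholders; ID and Seq are the variable names from the question. It assumes ID already begins with ">" as in the sample data, and it sets LRECL= generously on the assumption that some sequences may be longer than 80 characters.

```sas
/* Minimal sketch: write one FASTA record (two lines) per dataset row.   */
/* work.seqs and the output path are placeholders; adjust both.          */
data _null_;
   set work.seqs;                                /* your dataset with ID and Seq */
   file '/folders/myfolders/seqs.fasta' lrecl=32767;
   put ID;                                       /* header line, e.g. >12121 */
   put Seq;                                      /* sequence on the next line */
   /* If ID does not already start with ">", use:  put '>' ID;  instead. */
run;
```

Because the DATA step loops over every row automatically, the same two PUT statements cover thousands of rows without listing any values by hand.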