Hello,
I am trying to convert a SAS dataset containing nucleotide sequences to FASTA format. See a sample FASTA-formatted file here: https://github.com/veg/hivtrace/blob/master/test/rsrc/TEST.FASTA
What I have is a table with two columns:
ID Seq
What I need is to write each ID on its own line, followed by the corresponding Seq value on the next line, like this:
>ID1
Seq1
>ID2
Seq2
Any advice is much appreciated.
You don't list the individual values, you list the variable names.
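As a starting point, run this after changing the file path to one that is valid on your system, and see what is generated.
data _null_;
    set sashelp.class;   /* read every row of the input dataset */
    file '/folders/myfolders/demo.txt' linesize=80 flowover;
    put Name;            /* first output line for each row */
    put Age;             /* second output line for each row */
run;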
@HabAM wrote:
Thank you Reeza. The only issue is I have thousands of row values to individually list as suggested.
Sample set:
| ID | Seq |
| >12121 | aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag |
| >343434 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
| >4545e | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
| >1111 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
| >2321 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
Expected text file:
>12121
aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag
>343434
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>4545e
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>1111
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>2321
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
Thank you
Thank you Reeza. The only issue is I have thousands of row values to individually list as suggested.
Sample set:
| ID | Seq |
| >12121 | aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag |
| >343434 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
| >4545e | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
| >1111 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
| >2321 | cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg |
Expected text file:
>12121
aaatatgttgactcagattggttgtactttaaattttccaattagtcctattgaaactgtaccagtaaaattgaagccag
>343434
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>4545e
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>1111
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
>2321
cctcaaatcactctttggcaacgacccttagttacagcaaaaataggggaacagctaatagaagccctattagacacagg
Thank you
Add the necessary SET statement to @Reeza's code so that it reads your own SAS dataset.
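To make that concrete, here is a minimal sketch of the same DATA step pointed at the sequence data. The dataset name work.sequences and the output path are hypothetical, so substitute your own. It assumes the variables are named ID and Seq, as in the sample table, and that the ID values already begin with ">". LRECL= is used here instead of LINESIZE=80 on the assumption that real sequences may be longer than 80 characters.

```sas
data _null_;
    /* work.sequences is a hypothetical name; use your own dataset */
    set work.sequences;
    /* hypothetical output path; lrecl= raises the maximum output line length */
    file '/folders/myfolders/sequences.fasta' lrecl=32767;
    put ID;    /* header line, e.g. >12121 (sample IDs already start with ">") */
    put Seq;   /* sequence on the line that follows */
run;
```

Because the step runs once per row, it should scale to thousands of rows without listing any values by hand. If your ID values did not already start with ">", something like put '>' ID; ought to add it, since I believe list-style PUT does not insert a blank after a quoted string, but check the output to be sure.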