How to use Genbank to compare mushroom DNA and build phylogenetic trees

Discussion in 'ADVANCED MYCOLOGY' started by Alan Rockefeller, Apr 30, 2012.

  1. Alan Rockefeller

    Alan Rockefeller Moderator Moderator Expert Identifier

    Joined:
    Apr 30, 2012
    Messages:
    471
    If you follow these instructions you will be analyzing mushroom DNA in minutes. I get the sequence to compare by going to http://www.ncbi.nlm.nih.gov/sites/gquery and searching for the species, then clicking nucleotide, then clicking on a record. This is what is listed for Psilocybe subaeruginascens:

    http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&id=23574923

    It says:

    1 aaggatcatt attgaataac tttggcgtgg ttgtagctgg ccctctcggg ggcatgtgct
    61 cgcctgtcat ctttatatct ccacctgtgc accttttgta gacgtctttg ttggaagctg
    121 aataggagag aatgggtgct agtcactctt tctcgagttg aaggctttct caaggtcgct
    181 ctatgttttc atatacccca agtatgtaac agaatgtatc tatatggcct tgtgcctata
    241 aaactatata caactttcag caacggatct cttggctctc


    I take out the numbers so I have:


    aaggatcatt attgaataac tttggcgtgg ttgtagctgg ccctctcggg ggcatgtgct
    cgcctgtcat ctttatatct ccacctgtgc accttttgta gacgtctttg ttggaagctg
    aataggagag aatgggtgct agtcactctt tctcgagttg aaggctttct caaggtcgct
    ctatgttttc atatacccca agtatgtaac agaatgtatc tatatggcct tgtgcctata
    aaactatata caactttcag caacggatct cttggctctc
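
    The number-stripping step above can also be done with a few lines of Python. This is only a rough sketch, not part of the original workflow; the function name clean_genbank_sequence is made up for illustration, and it assumes the pasted block contains nothing but position numbers, spaces and bases.

```python
import re


def clean_genbank_sequence(text: str) -> str:
    """Strip position numbers and whitespace from a GenBank-style sequence block.

    Hypothetical helper: assumes the input holds only digits, whitespace
    and nucleotide letters, as in the sequence display on a GenBank record.
    """
    # Drop every character that is not a letter, then normalise to lowercase.
    return re.sub(r"[^A-Za-z]", "", text).lower()


if __name__ == "__main__":
    # First two lines of the Psilocybe subaeruginascens record quoted above.
    raw = """
    1 aaggatcatt attgaataac tttggcgtgg ttgtagctgg ccctctcggg ggcatgtgct
    61 cgcctgtcat ctttatatct ccacctgtgc accttttgta gacgtctttg ttggaagctg
    """
    print(clean_genbank_sequence(raw))
```

    The result is one unbroken string of bases that can be pasted straight into the BLAST query box.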

    Then paste that into the box labeled "Enter accession number, gi, or FASTA sequence" after going to the main blast.cgi page and clicking nucleotide blast.

    Then go to the Database drop-down and select "Nucleotide collection (nr/nt)" and click BLAST.

    It gives you a cool graphic that shows the distribution of the blast hits with mouseovers to see the species, and then lists your matches, then shows the actual alignments below.

    I like to sort the results by total score, and there is a lot of really interesting information on the results page. It says the only 100% match is Psilocybe subaeruginascens, but that Psilocybe cubensis and Panaeolus sphinctrinus are 87% matches!

    They have DNA from a ton of interesting species from interesting articles that I have never heard of. Psilocybe quebecensis is in there due to an article called "Forensic analysis of hallucinogenic fungi: a DNA-based approach". Psilocybe subaeruginascens is in there from "Phylogenetic relationship of psychoactive fungi based on the rRNA gene for a large subunit and their identification using the TaqMan assay" and from "[Discrimination of psychoactive fungi (commonly called 'magic mushrooms') based on the DNA sequence of the internal transcribed spacer region]". Some of these look like they would be really interesting to try to get ahold of via interlibrary loan.

    It is unclear how much of a difference is required to define a new species. In my opinion, a consistent 5 base pair difference is enough.

    Rough instructions for making phylogenetic trees with mafft using only a web browser:


    1) Get the DNA from Genbank, http://www.ncbi.nlm.nih.gov/nuccore?term=Panaeolus%20large%20subunit
    At the bottom right click Send to:, choose File with FASTA format
    2) Go to online mafft, start from the FASTA file on your hard drive, open all plots, make sure the lines are red.
    If any graphs have blue lines, remove those sequences from the file, reverse-complement them with rev_comp, then put them back in. http://www.bioinformatics.org/sms/rev_comp.html
    3) If sequences do not align well, put the best/longest one at the top
    4) Click the phylogenetic tree button

    Better trees could be made by trimming off unaligned parts and manually verifying that the alignments are sane.
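
    The reverse-complement fix in step 2 is what the rev_comp web page does. A minimal Python sketch of the same operation follows; it is not the code behind that page, the function name is my own, and it only handles A, C, G, T, N plus the ambiguity codes R and Y.

```python
# Translation table mapping each base to its complement.
# R (A or G) and Y (C or T) are IUPAC ambiguity codes and complement each other.
_COMPLEMENT = str.maketrans("ACGTRYNacgtryn", "TGCAYRNtgcayrn")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (hypothetical helper)."""
    # Complement every base, then reverse the whole string.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("AAGGATCATT"))
```

    Applying the function twice gives back the original sequence, which is a quick sanity check that nothing was lost.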

    Here is a tree I made for Rhodocollybia

    http://plantobserver.org/rhodocollybia.pdf

    I made this rough draft phylogenetic tree of Panaeolus using mega5 and information from Genbank.

    http://plantobserver.org/Panaeolus.phylogenetic.tree.pdf

    FASTA file:

    http://plantobserver.org/Panaeolus.fasta

    From this I may have learned:

    * Panaeolus papilionaceus and P. sphinctrinus are likely the same species
    * Panaeolus acuminatus is close to P. olivaceus
    * Panaeolus cinctulus is close to Panaeolina foenisecii and the split from Panaeolus to Panaeolina is not supported by genetic evidence as Panaeolina does not seem to form a separate clade outside of Panaeolus.
    * Panaeolus semiovatus is only distantly related to the rest of Panaeolus
    * Copelandia forms a distinct clade within Panaeolus, and maybe should be a subgenus or section.
     
  2. eLShaMukO

    eLShaMukO Moderator Moderator Mushroom Doctor

    Joined:
    Jun 8, 2011
    Messages:
    4,444
    Gender:
    Male
    Location:
    Mexico
    great thread

    :press: it does look like fun


    yea we have this one on our PDF thread definitely interesting material .

    http://myco-tek.org/showthread.php?101-Ultimate-Mush-Cult-PDF-Library

    :hmmm:
     
  3. Alan Rockefeller

    It is fun. Give it a try.

    Usually people think of DNA analysis as a black art that they will never be able to use, since it would cost around $100,000 to get a full setup at home. But you don't need to sequence your collections in order to use DNA tools to answer questions. For example, the other day people were saying that Marasmius oreades doesn't look much like any other Marasmius and perhaps it should be in a new genus. I checked Genbank for a Marasmius oreades sequence and within seconds had a list of everything that it is closely related to. M. oreades does belong in Marasmius.

    Thanks. I read that article and I actually didn't like it. They are doing DNA sequencing to enforce psilocybin laws, which makes little sense to me because if they are trying to see if someone has psilocybin mushrooms a more reliable way would be to test the mushrooms chemically. I also see big problems with their phylogenetic tree - It has Psilocybe and Panaeolus species on the same branches, and they clearly shouldn't be.
     
  4. eLShaMukO

    they claim specimens seized are hard to identify or analyze and want to automate IDs. it is a weird and expensive approach but interesting

    what would make them do that and not notice ? its not that old its from 2002 i think.


    im trying with P. barrerae but can't seem to make it work lol
     
  5. Alan Rockefeller

    My guess is that the tree doesn't include enough species and the criteria they used to build it didn't find enough difference to put the two genera on separate branches. I would have adjusted the algorithm until it made a tree that made more sense.

    There are no sequences for Psilocybe barrerae in Genbank. It is very likely that the ladies in Guadalajara have sequenced it but have not yet added their sequences to Genbank. It is a very rare species, recently described from Veracruz. I have never seen it but I will check out the type location in 6 weeks or so and see if I can get some pics of it.

    Try a more common species, or just put in Psilocybe and see what they have. You want the nucleotide collection.
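
    If you would rather search from a script than from the web page, NCBI also has a programmatic interface called E-utilities. The thread only uses the browser, so treat this as a side sketch: it just builds the search and download URLs for the nucleotide database (nuccore) and does not contact NCBI. The function names are made up, and the [Organism] search field is my choice of query.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL listing nucleotide records that match `term`."""
    params = {"db": "nuccore", "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_fasta_url(ids: list[str]) -> str:
    """Build an EFetch URL that returns the given records as FASTA text."""
    params = {"db": "nuccore", "id": ",".join(ids), "rettype": "fasta", "retmode": "text"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Search everything filed under the genus Psilocybe.
    print(esearch_url("Psilocybe[Organism]"))
    # HM035074 is the accession mentioned later in this thread.
    print(efetch_fasta_url(["HM035074"]))
```

    Opening the printed URLs in a browser shows the raw results; the first returns a list of record IDs, the second returns FASTA text.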
     
  6. eLShaMukO

    i know very little about it but i know very well its habits , im close to getting some fresh specimens

    in the PDF thread theres a document called characterization and cultivation of psilocybe barrerae
    im using that as reference
     
    Last edited: May 1, 2012
  7. nomendubium

    nomendubium scraping by, since '97 Expert Identifier

    Joined:
    Jan 18, 2012
    Messages:
    2,949
    Gender:
    Male
    Location:
    Ohio
    awesome! Thanks Alan, I'm trying it now. I'm having a hard time interpreting the data. I searched for P. ovoideocystidiata, but there were no results, so I searched on the first thing that came up on the list (appears to be a species from China?) and here's what I got: http://blast.ncbi.nlm.nih.gov/Blast.cgi#113171087 Does that mean the mushroom I blasted is 99% DNA the same as P. stunzii? I'm confused on how to interpret the data :shrug2:
     
  8. Professor PinHead

    Professor PinHead Lost in the Tek.... Administrator Mushroom Doctor Cannabis Doctor Supporter

    Joined:
    May 27, 2011
    Messages:
    9,167
    Location:
    A Rhizomorphic Space
    It's going to take me a minute to get the hang of this but great post Alan! Thanks for sharing it with us,....

    Now off to the site to try and sort it out....

    I had no idea that resource was available. Now I can be even more of a nerd! :woot1:
     
  9. RiverDweller

    RiverDweller Moderator Moderator Expert Identifier

    Joined:
    Mar 7, 2012
    Messages:
    1,832
    Gender:
    Female
    Location:
    Oregon
    word on the nerd man, that site is fun to play with!
     
  10. OoBYCoO

    OoBYCoO Super Moderator Moderator Mushroom Doctor

    Joined:
    May 30, 2011
    Messages:
    5,002
    *mindblown* umm So I had to rate this thread, pin it, and archive it!

    EDIT: Forgot to add :clap2:

     
    Last edited: May 4, 2012
  11. Alan Rockefeller

    Here is a Psilocybe ovoideocystidiata sequence from Ohio, Mr. Mushrooms' collection.


    GTAAAAGTCGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGGATCATTATTGAATAACTTTGGCGTGGTTGTAGCTGGCCCTCTCGGGGGCATGTGCTCGCCYGTCATCTTTATATCTCCACCTGTGCACCTTTTGTAGACGTCTTTGTTGGAAGCTGRATAGGAGAGAATGGGTGCTAGTCACTCTTTCTCGAGTTGAAGGCTTTCTCAAGGTCGCTCTATGTTTTCATATACCCCAAGTATGTAACAGAATGTATCTATATGGCCTTGTGCCTATAAAACTATATACAACTTTCAGCAACGGATCTCTTGGCTCTCGCATCGATGAAGAACGCAGCGAAATGCGATAAGTAATGTGAATTGCAGAATTCAGTGAATCATCGAATCTTTGAACGCACCTTGCGCTCCTTGGTATTCCGAGGAGCATGCCTGTTTGAGTGTCATTAAATTCTCAACCTTACCAGCTTTTGTTAGCTTGTGTAATGGCTTGGACTTGGGGGTTTTTTGCCGGCTTCTAACAAAGTCAGCTCCCCTTAAATGCATTAGCCGGCTGCCCGCTGTGGACCGTCTATTGGTGTGATAATTATCTACGCCGTGGATGTCTGCTATCAATGGGTTTTTAAAGCTGCTTCTAACCGTCTGTTCATTCGGACAATACAATGACAATTTGACCTCAAATCAGGTAGGACTACCCGCTGAACTTAAGCATATCAATAAGCGGAGGAAAAGAAACTAACAAGGATTCCCCTAGTAACTGCGAGTGAAGCGGGAAAAGCTCAAATTTAAAATCTGGCGGTCTCGGCCGTCCGAG


    The first BLAST hit is HM035074, which is labeled as a Psilocybe cubensis 28S sequence. 28S is a different locus than ITS, so this is probably mislabeled. It is likely to be a cubensis ITS sequence. The query coverage is 95%, which means that the alignment covers 95% of the sequence I pasted in. Max ident is 93%, which means that within the aligned part 93% of the base pairs are the same and 7% are different.
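
    To make the two numbers concrete, here is a small Python sketch that computes them for a toy alignment. It is my own illustration of the arithmetic, not BLAST's code: the function name and the example sequences are invented, gaps are written as "-", and ambiguity codes such as Y or R are simply counted as mismatches.

```python
def alignment_stats(aligned_query: str, aligned_subject: str, query_length: int) -> tuple[float, float]:
    """Return (percent identity, percent query coverage) for one pairwise alignment.

    aligned_query / aligned_subject: the two rows of the alignment, same length,
    with '-' marking gaps. query_length: length of the full, unaligned query.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("alignment rows must have the same length")

    q, s = aligned_query.upper(), aligned_subject.upper()
    # Identity: identical columns divided by all columns of the alignment.
    matches = sum(1 for a, b in zip(q, s) if a == b and a != "-")
    identity = 100 * matches / len(q)
    # Coverage: how many query bases take part in the alignment at all.
    covered = sum(1 for a in q if a != "-")
    coverage = 100 * covered / query_length
    return identity, coverage


if __name__ == "__main__":
    # Toy example: 8 of the query's 10 bases align, with one mismatch.
    print(alignment_stats("ACGTACGT", "ACGAACGT", query_length=10))
```

    A hit can have high identity and low coverage at the same time, which is why it is worth looking at both columns before trusting a match.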

    Since the top 3 BLAST hits are P. cubensis, it is fair to say that there is nothing closer in Genbank to P. ovoideocystidiata than P. cubensis. The next 2 are P. cyanescens and P. azurescens, and the next one is a Galerina! It would be interesting to scope that Galerina and see if it is really a Psilocybe.
     
  12. RiverDweller

    I [​IMG] Mr. Mushrooms, miss him very much.
     
  13. Professor PinHead

    I have yet to get a chance to play around with Genbank.
    To be honest the taxonomy browser in and of itself is cool as hell, lol.

    Galerina steglichii would definitely be a very very cool addition to anyone's collection! :ugh:

    A lot of Galerinas produce amatoxins though.... That wouldn't be a very cool mix up :(
     
  14. ManicMongrel

    ManicMongrel Active Member

    Joined:
    Mar 3, 2012
    Messages:
    366
    Location:
    Northern Europe
    13% sounds like a pretty large difference.

    Humans and chimps are 4% different by modern methods, and Neanderthals are about 0.3%. Mammalian and fungal genetics have fundamentally different mechanisms. Though taking that into account 13% still sounds like a lot; I get the impression they must have split apart a very long time ago. I don't have any formal education in genetics so I can't make any claims.
     
  15. Alan Rockefeller

    Keep in mind that we are just looking at the ITS1/ITS2 region, which is about one thousand base pairs out of 40 million. ITS stands for internal transcribed spacer. It is a non-coding part of the DNA: it is transcribed together with the ribosomal RNA genes but cut out before the ribosome is built, so its exact sequence matters very little. It is therefore free to mutate without causing the organism much harm.
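
    For a sense of scale, the share of the genome being compared can be worked out directly. This is just the arithmetic from the figures above (roughly 1,000 base pairs of ITS, roughly 40 million base pairs of genome); the function name is made up.

```python
def region_share_percent(region_bp: int, genome_bp: int) -> float:
    """Percentage of a genome taken up by a region of the given length."""
    return 100 * region_bp / genome_bp


if __name__ == "__main__":
    # Approximate figures quoted in the post, not measured values.
    share = region_share_percent(1_000, 40_000_000)
    print(f"ITS is about {share:.4f}% of the genome")
```

    That works out to a few thousandths of a percent, so a difference seen in ITS says very little about how different the two genomes are overall.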

    The humans and chimps comparison is talking about the whole genome.

    It would be interesting to process genbank data into a video game or some kind of graphic representation of how mushrooms are related to each other.
     
  16. ManicMongrel

    Ah ok, missed that part. I suppose comparing the inactive regions of DNA has limited use.
     
  17. Audio Midi Setup

    Audio Midi Setup New Member

    Joined:
    Sep 6, 2012
    Messages:
    11
    Location:
    WA
    This thread is full of win.

    I have questions.

    You mention that we're just looking at the ITS1 and ITS2 regions. What data do these regions hold? What do the other regions (the other 39.99 million) tell us if they are sequenced? Why do we use just these two regions?
     
    Last edited: Sep 10, 2012
  18. TheDeathryder

    TheDeathryder Active Member

    Joined:
    Jul 9, 2012
    Messages:
    592
    Gender:
    Male
    Location:
    Texas
    Awesome post!

    After making the tree though, how do you figure out what other species are closely related to the one you are studying? Is there a tool I'm overlooking, or do you just start picking species and compare them manually?
     
  19. Terry M

    Terry M Member Mushroom Doctor

    Joined:
    Jun 2, 2011
    Messages:
    282
    Gender:
    Male
    Location:
    The State of Rhode Island and Providence Plantatio
    You know what would be a great resource? A list of labs that do custom genetic sequencing, and their prices. I once looked this up, and found it to be remarkably low cost. It was under $100, if I remember correctly. So if you think you've discovered a new species and it's worth the money to you, you can find out by sequencing and using the public software tools!

    There are separate operations, all of which are relatively cheap: Break up the cells, separate out the DNA/RNA, spin this down into a pellet, make primers for the regions you want, and do the sequencing. For the ITS1 and ITS2 ribosomal RNA regions, the lab's probably already got primers, and doesn't have to create new ones for you.
     
  20. nomendubium

    I have been playing with this quite a bit. Certainly I'm no expert, but I believe
    a. the usefulness of the ITS regions 1 and 2 IS that they are regions that do not affect the phylogeny of the organism. If other regions were affected, the organism would be a "mutant" of sorts, i.e. something would have been mutated, and that may cause the organism to either be superior and surpass the ones without the mutation, or inferior and therefore not survive or not flourish as well as others. Both of these become a problem in that the rate of mutation would either speed up or slow down because of it, i.e. if it were a positive mutation, that organism would spread its genetics faster/more than the unmutated variety, throwing the rate off.
    b. when you hit BLAST, it tells you what species are highly similar, right there on the list, so I'm not sure what you are asking, deathryder.
    Yesterday I wondered, for my own purposes, about the similarity between collections of Gymnopilus liquiritiae. I found 6 collections, 3 from Japan and 3 from Canada. I compared them and found 2 of the 3 from Canada were identical, the third was identical to Gymnopilus bellulus (as I recall, indicating a misidentification) and the three from Japan were identical, but between the 2 groups (the Japanese ones and the 2 from Canada, discounting the misidentified one) they were only 84% identical, so I conclude (as far as I can tell) they are obviously different organisms. :) Of course I'm not the leading expert, or even very good at interpreting the data.