Bob's curated genes

From DictyWiki



Color Code

Red: unreviewed curated gene

Purple: curated gene is under review, but not complete

Orange: curated gene is under review by Pascale or Petra, who would like the other's opinion

No color tag (blue) when complete

Genes skipped during fast track curation

Category 5A genes




Not curated. Cannot figure out the correct gene model.








Curated. Altered 3' boundary of exon 3.


Curated. Removed 3' exon.




Not curated. Intron clearly too short and all matches support read through of the current intron. However, geneid does not predict a model which agrees with matches, and I cannot yet figure out the correct gene model here.


Not curated. May have additional 5' exon, but evidence is not conclusive.


Not curated. Not sure here: RNA transcripts mostly support the model, but several Dd and Dp matches are much longer.


Curated. Modified 3' boundary of exon 2.


Not curated. Not sure here. Both Dp and Pp orthologs support a 3' extension of this gene model; RNA transcripts indicate a possible 3' exon, but can't identify a possible correct model.


Curated. Supported by RNA transcripts; weak similarity to a Polysphondylium pallidum protein.


Not curated. This appears to need to be merged with the downstream lvsD gene. Matches a Dpur ortholog (DPU_G0059858) which contains ESTs which appear to support this merger as well. However, lvsD has been comprehensively annotated, so check with Pascale and Petra first before doing this merge.


Not curated. This matches portions of much larger Dd and Dp proteins; one of which is annotated as a polyketide synthase. The genomic region flanking this gene contains numerous gene models with no apparent support. This appears to be part of larger polyketide synthase gene model which will take some investigation to identify. Another skipped gene, DDB_G0271324 is included in this region.


Not curated. Support for this gene model is weak: the RNA-seq data barely extends into one of the predicted exons, and although the model is similar to a D. purpureum protein, the Dp protein is much longer. Yet geneid repredicts the same gene. Review.


Curated. Supported by RNA transcripts; conserved in D. discoideum.


Curated. Supported by RNA transcript. Added conflicting evidence here as both the Dpur and P. pallidum homologs are longer in the 3' region.


Curated. Only supported by transcript.


Not curated. There is not really support for the 3 small exons on the 3' end of the model, but unsure of the correct gene model.


Not curated. ESTs indicate that this gene should extend 3' through the downstream gene DDB_G0268770 (which should be deleted?); but cannot figure out the correct gene model. One EST supports an intron, the other does not.

Category 6 genes


Not curated. ESTs, geneid reprediction, and comparison to D. pur ortholog each support different splice junctions in the 5' end of this gene model. 25 min.


Curated. No review needed. 40 min.


Curated. No review needed. 15 min.


Curated. This gene either needs to be merged with the downstream gene DDB_G0274991, or is a pseudogene. Cannot easily determine the splice boundaries of a potential merged gene. Also contains an early stop codon, and is similar to another family of genes in Dicty. So this appears to meet the criteria for a pseudogene. 15 min.

  1. Has now been curated and merged with DDB_G0274991. Exon/intron boundaries adjusted to create the new model and an artificial 3 nt gap has been introduced to remove a stop codon.


Curated, merged with DDB_G02749357. No review needed. Approx. 90 min.


Not curated; skipped for now. The gene model looks OK, but overlaps with a flanking retrotransposon gene. 10 min.


Curated, no review needed. 25 min.


Not curated: Cannot figure out 5' end here. EST supports an initiator downstream of current model, but note that only current model contains the XYPPX repeat motif, so this may be correct model. Revisit. 35 min.


Curated, no review needed. 20 min.


Not curated: puzzled over 3' end of this gene. ESTs appear to support the 4th exon currently predicted by the sequencing center model, but they extend into the intron and don't really support the model. Also, geneid reprediction of the local region predicts a slightly longer 3rd exon ending in a termination codon (no 4th exon). The protein from this prediction matches pfam, human ortholog, etc. as well as the protein from the sequencing center model does. So is this a case of a splice variant, a shorter protein, or is it correct as is? 40 min.


Curated, no review needed. 15 min.


Curated, no review needed. An EST does not align well with splice boundaries, but this model is supported by seq. similarity to D. pur and P. pal. 20 min.


Not curated. Current gene model here is incorrect: the intron is too short. Also, ESTs support merging this and the upstream gene DDB_G0272700, as does alignment of both protein sequences from these two genes with EFA78634.1, a hypothetical protein from Polysphondylium. But I cannot figure out the correct gene model for the proposed new gene. Return to this with fresh eyes. 40 min.


Curated, no review needed. 25 min.


Curated, no review needed. 20 min.


Curated, no review needed. 20 min.


Curated, no review needed. 25 min.


Curated, no review needed. Changed gene model based on EST evidence. 40 min.


Curated, no review needed. 40 min.


Not curated: No ESTs or solexa reads. Sequence matches other dicty and Dp proteins, but all matches are much larger, so this may be a fragment of a larger gene that needs to be merged with another gene. 15 min.


Curated, no review needed. 25 min.


Curated, no review needed. 25 min.


Curated, no review needed. No ESTs support this gene (they support the flanking gene and extend a little into the 3' region here), but the gene model is supported by sequence similarity. 20 min.


Now curated. It is hard to ignore the EST extending completely through the intron, but this still looks like a valid gene model: the protein is a good match to the Dpur ortholog; the protein matches the DUF23430 family in an Interpro search; running geneid predicts the same model; translation through the intron introduces stop codons; and the ESTs match other dicty coding sequences at 100%, although the longest match is to this gene. What's best here: curate as is with conflicting evidence? Thanks.

Pascale: Hi Bob: A single EST is not considered conflicting evidence. See (you should know all this fairly well by now; I suggest you read this again.)

The reason it's not a conflict is that the experimental procedure for making ESTs may include a little genomic contamination, so we ignore a single EST that lacks the intron(s).

Thanks, agreed, have curated with support from sequence similarity, from the D. purpureum protein. Bob. 60 min.
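One of the checks mentioned in this exchange, that translating through the intron introduces stop codons, can be sketched in a few lines. This is only an illustration, assuming the standard genetic code stop codons (taa, tag, tga); the function name `in_frame_stops` and the toy sequence are hypothetical and are not taken from Apollo, geneid, or the actual gene.

```python
# Minimal sketch: report in-frame stop codons in a nucleotide sequence.
# Assumption: standard genetic code stop codons (taa, tag, tga).
STOP_CODONS = {"taa", "tag", "tga"}


def in_frame_stops(seq, frame=0):
    """Return 0-based start positions of stop codons in the given frame."""
    seq = seq.lower()
    return [
        i
        for i in range(frame, len(seq) - 2, 3)
        if seq[i:i + 3] in STOP_CODONS
    ]


# Toy read-through sequence (hypothetical): exon end + retained intron.
readthrough = "atgaaataaggg"
print(in_frame_stops(readthrough))           # stops in frame 0
print(in_frame_stops(readthrough, frame=1))  # stops in frame 1
```

An empty list for the frame of the upstream exon would mean the read-through stays open; any hit means the retained intron would truncate the protein.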


Curated, no review needed. Similar to DDB_G0269646 described below. ESTs appear to support a different splice boundary at the 3' end of the second exon, but changing it in Apollo leads to a premature stop codon. 30 min.


Curated, no review needed. 20 min.


Curated, no review needed. ESTs appear to support a different splice boundary at the beginning of the third exon, but changing it in Apollo leads to a premature stop codon. 45 min.


Curated, no review needed. No EST or solexa support, but weak similarity to Dd, Dp and Pp proteins. 25 min.


Curated, no review needed. A 3' UTR made me hesitate for fast track curation, but protein alignments support the current gene model. 25 min.


Curated, no review needed. An EST extends into the intron, but the region is similar to sequence of the flanking exon. 30 min.


Curated, no review needed. One EST doesn't support the model, but all other evidence supports this model. 35 min.


Curated, no review needed. 25 min.


Not curated: Just spent almost an hour, but am still not sure! This may need to be merged with the gene upstream (DDB_G0268682) but I am not completely convinced. Will probably need another opinion here. 50 min.


Curated, no review needed. 30 min.


Curated, no review needed. One EST doesn't support the model, but all other ESTs, sequence similarity and solexa reads do support it. 25 min.


Curated, no review needed. 45 min.


Not curated: looks too complicated to solve easily; new intron is probably needed; ESTs also extend well into current intron so boundaries also need to be changed. 20 min.


Curated, no review needed. 25 min.


Curated, no review needed (EST extends into intron, but is matched by flanking exon sequence). 30 min.


Curated, no review needed. 30 min.


  1. One EST differs from the chosen gene model and extends upstream well into the first exon. Since most ESTs support the predicted model, I have curated this model and chosen conflicting evidence. Could this be a sequencing error? Maybe the base upstream of the EST is a g instead of a t, which would make this a consensus splice site (atattttag instead of atattttat).
  2. Petra reviewed and gave the OK for the gene model; she recommends removing 'conflicting evidence'. OK. Approx. 60 min.
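The sequencing-error question in item 1 can be illustrated with a small check. This is a rough sketch, assuming the usual convention that an intron ends in the dinucleotide "ag" at its 3' (acceptor) end; the function names are hypothetical, and the two sequences are the ones quoted in the note.

```python
# Minimal sketch: does a sequence end in the canonical acceptor "ag",
# and which single-base changes in the last two positions would create it?
# Assumption: the canonical intron 3' end is "ag" (GT-AG rule).


def has_canonical_acceptor(intron_end):
    """True if the sequence ends in the canonical acceptor dinucleotide."""
    return intron_end.lower().endswith("ag")


def acceptor_rescues(intron_end):
    """List (position, old_base, new_base, new_seq) for each single-base
    substitution in the last two positions that yields an 'ag' ending."""
    seq = intron_end.lower()
    hits = []
    for i in (len(seq) - 2, len(seq) - 1):
        for base in "acgt":
            if base == seq[i]:
                continue
            candidate = seq[:i] + base + seq[i + 1:]
            if candidate.endswith("ag"):
                hits.append((i, seq[i], base, candidate))
    return hits


print(has_canonical_acceptor("atattttat"))  # sequence as read
print(has_canonical_acceptor("atattttag"))  # sequence if the last t were a g
print(acceptor_rescues("atattttat"))
```

For the sequence in the note, the only single-base change that restores the "ag" ending is the t to g substitution at the last position, which matches the suggestion above.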


  1. Curated. I think the gene model is correct as the protein matches Dpur ortholog and belongs to a dicty ugt family. 30 min.

End of genes skipped during fast track curation
