Questionnaire Overview

This page lists every question in the DENOFO questionnaire. Each table maps a question to its field name in the data model and to the expected data type.

Questionnaire Structure

The questionnaire is organized into sections covering different aspects of de novo gene annotation:

  1. Input Data (Section 1) - Questions about the source data used for gene detection

  2. Homology Filter (Section 2) - Questions about homology validation methods

  3. Non-coding Homologs (Section 3) - Questions about non-genic homologous sequences

  4. Evolutionary Information (Section 4) - Questions about selection pressure analysis

  5. Translational Evidence (Section 5) - Questions about translation verification

  6. Hyperlinks (Section 6) - Questions about study references

Figure 1: Overview of the DENOFO questionnaire structure showing the six main sections and their color coding used throughout this documentation.

Complete Question Reference

Input Data

Section 1: Input Data - Questions about the source data used for gene detection

Input Data Questions

Question                                                                                                 Field Name               Type
-------------------------------------------------------------------------------------------------------  -----------------------  ---------------------------
Did you detect your candidate de novo genes from a:                                                      inputData                list[InputDataChoices]
Please choose the genome annotation method:                                                              annotGenomeChoice        list[AnnotGenomeChoices]
Did you apply a TPM threshold used as a minimum level of transcript expression? (yes/no)                 inputTranscriptome       bool
Please provide the TPM threshold used as a minimum level of transcript expression:                       expressionLevel          float | None
Please indicate which transcripts were kept based on their overlap with the following genetic contexts:  transContextChoice       list[GeneticContextChoices]
Please provide your custom genetic context for your transcriptome data:                                  customGeneticContext     list[str] | None
Please choose which ORFs in the transcripts were selected:                                               transORFChoice           list[ORFChoices]
Please provide your custom ORF selection for your transcriptome data:                                    customORF                list[str] | None
Do you want to add additional information about the transcriptome (e.g. tissue, cell type, …)? (yes/no)  answerTranscriptomeInfo  bool
Please provide the information about the transcriptome (e.g. tissue, cell type, …):                      transcriptomeInfo        str | None
Please provide your custom input data for de novo gene detection:                                        customInputData          str | None

Homology Filter

Section 2: Homology Filter - Questions about homology validation methods

Homology Filter Questions

Question                                                                                      Field Name             Type
--------------------------------------------------------------------------------------------  ---------------------  -----------------------
Did you validate absence of homology of your de novo genes? (yes/no)                          homologyFilter         bool
Do you know in which taxonomic group your de novo gene candidates emerged? (yes/no)           phylogeneticTaxa       bool
Please choose the specificity for the taxonomic group where they emerged:                     taxSpecificity         TaxSpecificityChoices
Please provide the taxonomic ID (name or number from NCBI Taxonomy DB) where they emerged:     taxID                  str | int
Please choose your sequence type(s) used for homology filtering:                              seqType                list[SeqTypeChoices]
Please provide your custom sequence type(s) used for homology filtering:                      customSeqType          list[str] | None
Did you use structural similarity for homology filtering? (yes/no)                            QStructuralSimilarity  bool
Please provide the structural similarity search software/method used for homology filtering:  structuralSimilarity   str | None
Please choose the metric used for homology filtering:                                         threshold              list[ThresholdChoices]
Please provide your custom metric for homology filtering:                                     customThreshold        list[str] | None
Please provide the threshold value for your homology filtering based on {metric}:             thresholdValue         list[float]
Please choose the database(s) used for homology filtering:                                    dataBase               list[HomologyDBChoices]
Please provide your custom database used for homology filtering:                              customDB               list[str] | None
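The threshold value question contains a {metric} placeholder, and thresholdValue is a list, which suggests that the question is asked once per selected metric. The sketch below illustrates that reading. It assumes one value per metric, stored in the same order as the metrics. The helper names and the numeric values are hypothetical.

```python
PROMPT = "Please provide the threshold value for your homology filtering based on {metric}:"


def threshold_prompts(metrics: list[str]) -> list[str]:
    """Build one prompt per selected metric by filling the {metric} placeholder."""
    return [PROMPT.format(metric=m) for m in metrics]


def pair_thresholds(metrics: list[str], values: list[float]) -> dict[str, float]:
    """Pair each metric with its value, assuming both lists share the same order."""
    if len(metrics) != len(values):
        raise ValueError("expected exactly one threshold value per metric")
    return dict(zip(metrics, values))


# Illustrative answers only.
metrics = ["e-value", "coverage [%]"]
for prompt in threshold_prompts(metrics):
    print(prompt)
print(pair_thresholds(metrics, [1e-3, 50.0]))
```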

Non-coding Homologs

Section 3: Non-coding Homologs - Questions about non-genic homologous sequences

Non-coding Homologs Questions

Question                                                                                        Field Name                   Type
----------------------------------------------------------------------------------------------  ---------------------------  -------------------
Did you detect non-genic homologous sequences in genomes from other taxonomic groups? (yes/no)  nonCodingHomologs            bool
Did you study conservation/mutations between de novo genes and homologous sequences?            enablingMutations            bool
Did you check for synteny between de novo genes and their homologous sequences? (yes/no)        synteny                      bool
What was used to identify the syntenic region?:                                                 anchors                      list[AnchorChoices]
Please provide your custom anchor for synteny search:                                           customAnchor                 list[str] | None
Did you use a specific software for the synteny search? (yes/no)                                answerSoftwareSyntenySearch  bool
Please choose the software used for the synteny search:                                         softwareSyntenySearch        list[str] | None

Evolutionary Information

Section 4: Evolutionary Information - Questions about selection pressure analysis

Evolutionary Information Questions

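Question                                                                  Field Name               Type
------------------------------------------------------------------------  -----------------------  ----------
Did you study selection pressure of the de novo genes? (yes/no)           evolutionaryInformation  bool
Please provide the metric or method used to identify selection pressure:  selection                str | None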

Translational Evidence

Section 5: Translational Evidence - Questions about translation verification

Translational Evidence Questions

Question                                                             Field Name                 Type
-------------------------------------------------------------------  -------------------------  ---------------------------------------
Did you verify the translation of the de novo genes? (yes/no)        translationalEvidence      bool
Please choose the method used as evidence for translation:           translationEvidence        list[TranslationEvidenceChoices] | None
Please provide your custom method used as evidence for translation:  customTranslationEvidence  list[str] | None

Data Types Reference

Choice Enumerations:

  • InputDataChoices: annotated genome, transcriptome, custom choice

  • AnnotGenomeChoices: ab initio approach, homology-based approach, unknown

  • GeneticContextChoices: intergenic, antisense, intronic, overlap gene, custom choice

  • ORFChoices: no ORF, all ORF, highest Kozac, longest ORF, start first ORF, long 5` 3` ORF, custom choice

  • TaxSpecificityChoices: tissue/condition-specific, species-specific, lineage-specific

  • SeqTypeChoices: protein sequences, DNA, 6-frame translation, RNA, ncRNAs, transposable elements, custom choice

  • ThresholdChoices: e-value, coverage [%], custom choice

  • HomologyDBChoices: NCBI NR, RefSeq, UniProtKB/TrEMBL, UniProtKB/Swiss-Prot, ENA (by EMBL-EBI), Ensembl, InterPro, custom choice

  • AnchorChoices: gene anchors, genome alignment, custom choice

  • TranslationEvidenceChoices: mass spectrometry, ribosome profiling, periodicity, custom choice
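As an illustration of how such choice enumerations could be expressed, the sketch below models two of them with Python's enum module. The values are the labels listed above. The member names (EVALUE, GENE_ANCHORS, and so on) and the str mixin are assumptions and may not match the DENOFO source.

```python
from enum import Enum


class ThresholdChoices(str, Enum):
    # Member names are hypothetical; the values are the documented labels.
    EVALUE = "e-value"
    COVERAGE = "coverage [%]"
    CUSTOM = "custom choice"


class AnchorChoices(str, Enum):
    GENE_ANCHORS = "gene anchors"
    GENOME_ALIGNMENT = "genome alignment"
    CUSTOM = "custom choice"


# Look a member up by its label, as a reader of stored answers might do.
choice = ThresholdChoices("coverage [%]")
print(choice.name, "->", choice.value)

# An unknown label raises ValueError, which catches typos early.
try:
    AnchorChoices("gene anchor")
except ValueError as err:
    print("rejected:", err)
```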

Basic Types:

  • str: String/text value

  • int: Integer number

  • float: Decimal number

  • bool: Boolean (True/False)

  • list[Type]: List containing elements of the specified type

  • Type | None: Optional field that can be the specified type or None/empty

Notes

  • Questions are presented in logical order following the questionnaire flow

  • Some questions are conditional and appear only when earlier answers make them relevant

  • Fields marked with “| None” are optional and may be empty

  • Custom choice options allow users to provide their own values when predefined choices don’t apply

  • The questionnaire uses a question stack system that allows users to navigate back to previous questions