THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 6, 2013. The fully federated database provides information on proteins identified in inner ear structures and fluids in health and disease. In addition, protein maps derived from 2D electrophoresis/mass spectrometry are displayed.
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on January 3, 2023. Collection of miRNA target interactions. Manually curated collection of experimentally supported microRNA targets in several animal species of central scientific interest, plants and viruses. Database is functionally linked to several other relevant and useful databases such as Ensembl, Hugo, UCSC and SwissProt.
Preprocessed versions of the ADHD-200 Global Competition data, including preprocessed versions of the structural and functional datasets previously made available by the ADHD-200 consortium as well as initial standard subject-level analyses. The ADHD-200 Sample is pleased to announce the unrestricted public release of 776 resting-state fMRI and anatomical datasets aggregated across 8 independent imaging sites, 491 of which were obtained from typically developing individuals and 285 from children and adolescents with ADHD (ages: 7-21 years old). Accompanying phenotypic information includes: diagnostic status, dimensional ADHD symptom measures, age, sex, intelligence quotient (IQ) and lifetime medication status. Preliminary quality control assessments (usable vs. questionable) based upon visual timeseries inspection are included for all resting-state fMRI scans. In accordance with HIPAA guidelines and 1000 Functional Connectomes Project protocols, all datasets are anonymous, with no protected health information included. The organizers hope this release will open collaborative possibilities and invite contributions from researchers who do not traditionally work with brain data. For those whose specialties lie outside of MRI and fMRI data processing, the competition is now one step easier to join. The preprocessed data is being made freely available through the efforts of The Neuro Bureau as well as the ADHD-200 consortium. They ask that you acknowledge both of these organizations in any publications (conference, journal, etc.) that make use of this data. None of the preprocessing would be possible without the freely available imaging analysis packages, so please also acknowledge the relevant packages and resources as well as any other specific release-related acknowledgements. You must be logged into NITRC to download the ADHD-200 datasets: http://www.nitrc.org/projects/neurobureau
Software that provides quality control and quality assessment tools for flow cytometry data.
Software application that provides cleaning of large FASTQ/A formatted DNA sequence files containing multiple short-read sequences produced by Next Generation Sequencing platforms.
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 9, 2022. A web crawler that can intelligently acquire social media content on the Internet to meet the specific online data source acquisition needs of cancer researchers.
System for automatically extracting, analyzing, visualizing and integrating molecular pathway data from the research literature. System focuses on interactions between molecular substances and actions, providing a graphical consensus view on the collected information. GeneWays is designed as an open platform, allowing researchers to query, review and critique integrated information.
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 2, 2016. Beta software used to align and browse a genome.
Mission: Dynamically evolve sequencing, finishing, annotation and analysis processes, exploit new technologies, and develop expertise to deliver high quality and high throughput sequence-based microbial science by listening to and responding to DOE Users and scientific community needs.
Goals:
1. Expand product catalog and increase sample throughput while maintaining highest quality. The MGP has been expanding its product catalog beyond a finished microbial genome and projects a significant ramp-up in throughput for the majority of its current products, namely Draft Genomes, Single Cell Genomes, Quick Draft Genomes, Resequencing projects and RNAseq projects. This projected increase in microbial genomes goes hand-in-hand with, and has been stimulated by, new high throughput technologies and capabilities (de novo microbial Illumina assemblies, single cell genomics, Genologic sample tracking). The increased throughput will support the user community as well as JGI scientists by enabling DOE-relevant science at a grander scale. As the Program aims to generate hundreds of microbial genomes per year, the goal is to scale production efficiency and maintain trademark quality to best support the science mission.
2. Expand sequence space. One of the ongoing missions of the MGP is to expand the coverage of the phylogenomic sequence space by generating reference genome datasets from highly diverse branches in the bacterial and archaeal tree of life. The value of such effort includes the generation of phylogenetic anchors for metagenomic datasets, the improvement of annotation, an increased insight into the phylogenetic distribution of functions, the discovery of novel genes, protein families and pathways, and a better understanding of evolutionary diversification.
3. Make Single Cell Genomes a robust User product. As the vast majority of microbes are uncultured to date, single cell genomics will be a crucial component of the MGP over the next several years to drive not only JGI science but also single cell research proposed by the User community. Going hand-in-hand are R&D efforts in selective single cell isolations, testing the effects of fixation on single cell sequencing, as well as single cell transcriptomics.
4. Sequence Pangenomes. Combining similar genomes to create pangenomes will allow more compact genome sequence storage and visualization and expedite analysis and annotation. Moreover, the pangenome, as a representation of the whole group of organisms, may be more representative of a given species within the environment. The MGP thus strives to enable the sequencing and analysis of pangenomes. Current technology allows the sequencing of one organism strain at a time. Assuming that in most cases several dozen strains may need to be sequenced in order to generate a more accurate pangenome for every microbial species, it becomes evident that the cost of doing so may be prohibitively high. The goal here is to explore new approaches and technologies for generating these pangenomes at a very low cost, comparable to today's cost for a single strain.
5. Expand and improve microbial annotation using transcriptomic data. To improve annotation of gene structure, establish accurate transcription levels and timing, provide information on gene regulation and generate information for expanding understanding of systems biology, the MGP strives to generate transcriptomics data for larger sets of Bacteria and/or Archaea. This will enable the identification of novel regulatory RNAs, as well as facilitate the understanding of uncharacterized protein families.
6. Maintain and evolve a top quality data management system. To enable state of the art and world class comparative analysis of internal and external scientific projects, the JGI data integration and visualization management system for comparative analysis of microbial genomes, namely IMG, needs to be maintained and continuously evolved. The system needs to be able to support and integrate all data generated by JGI (WGS, reseq, RNAseq and other omics data), as well as by the user community, enabling annotation and manual curation of the annotation, comparative analysis, and gene-centric and pathway-centric analyses. The system should also facilitate the integration of associated metadata and enable data sharing and distribution, as well as automated GenBank data submissions. Lastly, the system needs to have the ability to scale, enabling the annotation of thousands of genomes per year.
7. Drive Flagship projects. To stay at the forefront of microbial genomic research, be recognized as such and enable the development of new methods and tools, the MGP aims to drive DOE mission relevant flagship projects. Novel tools and methods developed will ultimately serve the user community if proven useful and implemented as part of a larger pipeline. MGP flagship projects are the GEBA and GEBA uncultured projects, as well as the GEBA-RNB, the proposed Microbial Earth and the Microbial Dark Matter Projects.
Software program that extracts causative variants in familial and sporadic genetic diseases. The algorithm takes into account predicted variants (SNPs and indels) in affected individuals or tumor samples and utilizes the raw (BAM) data to robustly estimate the conditional probability of segregation in a family, as well as the probability of a variant being de novo or somatic. In familial cases, various modes of inheritance are considered: X-linked, autosomal dominant, and recessive (homozygosity or compound heterozygosity). Moreover, it integrates phenotypes and genotypes, and employs ANNOVAR to produce additional information such as allele frequencies in the general population and damaging scores.
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 9, 2022. System that retrieves relevant UniProt IDs from BioThesaurus entries using a soft string matching algorithm.
A Lady Scientist chronicles the author's journey through grad school and navigating the so-called Two Body Problem. The author, Amanda (at) Lady Scientist, is a recent Ph.D. graduate in biochemistry and molecular biology.
Python software package for detection, alignment and reporting of recombination events in Next-Generation Sequencing data. Detects and reports recombination or fusion events in virus genomes using deep sequencing datasets.
A web interface to the ANNOVAR software, a tool to annotate functional consequences of genetic variation from high-throughput sequencing data. It lets biologists without bioinformatics skills submit a list of mutations (even whole-genome variant calls) to the web server, select the desired annotation categories, and receive functional annotation back by email. Given a list of single nucleotide variants (SNVs) and insertions / deletions in VCF or ANNOVAR input format, wANNOVAR annotates their functional effects on genes (such as amino acid changes for non-synonymous SNPs), calculates their predicted functional importance scores (such as SIFT and PolyPhen scores), retrieves allele frequencies in public databases (such as the 1000 Genomes Project and NHLBI-ESP 6500 exomes), and implements a variant reduction protocol to identify a subset of potentially deleterious variants.
Software that implements rigorous statistical tests to detect bases under selection from multiple alignment data. It takes full advantage of deeply sequenced phylogenies to estimate both unlikely substitution patterns and slowdowns or accelerations in mutation rates. It can be applied as a Hidden Markov Model (HMM), in sliding windows, or to specific regions.
Software that identifies constrained elements in multiple alignments by quantifying substitution deficits. These deficits represent substitutions that would have occurred if the element were neutral DNA, but did not occur because the element has been under functional constraint. We refer to these deficits as Rejected Substitutions. Rejected substitutions are a natural measure of constraint that reflects the strength of past purifying selection on the element. GERP estimates constraint for each alignment column; elements are identified as excess aggregations of constrained columns. A false-positive rate (which is user-settable) is calculated using "shuffled" alignments in which the order of columns is randomized.
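The idea in the GERP description can be illustrated with a minimal Python sketch. It is not GERP's implementation: it assumes the per-column expected (neutral) and observed substitution counts are already available, whereas the real tool has to estimate them from the alignment. The function names, the thresholds `min_len` and `min_total`, and the use of mean shuffled coverage as a crude stand-in for a false-positive rate are all hypothetical choices made for illustration.

```python
import random
from typing import List, Tuple


def rejected_substitutions(expected: List[float], observed: List[float]) -> List[float]:
    """Per-column deficit: substitutions expected under neutrality minus those observed."""
    return [e - o for e, o in zip(expected, observed)]


def find_elements(rs: List[float], min_len: int = 3,
                  min_total: float = 4.0) -> List[Tuple[int, int, float]]:
    """Return (start, end_exclusive, summed_rs) for runs of columns with positive RS.

    A run is kept only if it is at least min_len columns long and its summed
    RS reaches min_total (both thresholds are illustrative assumptions).
    """
    elements = []
    start = None
    for i, score in enumerate(rs + [0.0]):  # trailing sentinel closes an open run
        if score > 0:
            if start is None:
                start = i
        elif start is not None:
            total = sum(rs[start:i])
            if i - start >= min_len and total >= min_total:
                elements.append((start, i, total))
            start = None
    return elements


def shuffled_element_coverage(rs: List[float], n_shuffles: int = 200,
                              seed: int = 0, **kwargs) -> float:
    """Mean fraction of columns covered by elements after shuffling column order.

    Shuffling destroys the clustering of constrained columns, so elements
    found here approximate what the caller reports by chance.
    """
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_shuffles):
        shuffled = list(rs)
        rng.shuffle(shuffled)
        covered += sum(end - begin for begin, end, _ in find_elements(shuffled, **kwargs))
    return covered / (n_shuffles * len(rs))


# Toy alignment of 10 columns: columns 2-5 show no substitutions at all.
expected = [1.0] * 10
observed = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 1.0]
rs = rejected_substitutions(expected, observed)
print(find_elements(rs))  # [(2, 6, 4.0)]
print(shuffled_element_coverage(rs))
```

In this toy input the four adjacent constrained columns form one element, while shuffling scatters them so that fewer runs pass the thresholds. Tightening `min_len` or `min_total` lowers the shuffled coverage, which is the sense in which the false-positive rate is user-settable in this sketch.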
A software toolkit for RNA sequence data analysis. It contains programs that cover several aspects of RNA-Seq data analysis such as read quality assessment, reference sequence generation, sequence mapping, and gene and isoform expression estimation.
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 9, 2022. Database of information about brain region circuitry. It collates data from the literature on tract tracing studies and provides tools for analysis and visualization of connectivity between brain regions.
Software to detect breakpoints of large deletions, medium sized insertions, inversions, tandem duplications and other structural variants at single-base resolution from next-gen sequence data. It uses a pattern growth approach to identify the breakpoints of these variants from paired-end short reads.
Software for the reliable and accurate identification of somatic point mutations in next generation sequencing data of cancer genomes.