Paralogs are genes that arose via gene duplication, and when such paralogs retain overlapping or redundant function, this poses a challenge to functional genetics research. Recent technological advancements have made it possible to systematically probe gene function for redundant genes using dual or multiplex gene perturbation, and there is a need for a simple bioinformatic tool to identify putative paralogs of one or more genes of interest. We have developed Paralog Explorer (https://www.flyrnai.org/tools/paralogs/), an online resource that allows researchers to quickly and accurately identify candidate paralogous genes in the genomes of the model organisms D. melanogaster, C. elegans, D. rerio, M. musculus, and H. sapiens. Paralog Explorer deploys an effective between-species ortholog prediction tool, DIOPT, to analyze within-species paralogs. Paralog Explorer allows users to identify candidate paralogs, and to navigate relevant databases regarding gene co-expression, protein–protein and genetic interaction, as well as gene ontology and phenotype annotations. Altogether, this tool extends the value of current ortholog prediction resources by providing sophisticated features useful for identification and study of paralogous genes. © 2022 The Authors. Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Partial loss-of-function mutations in glycosylation pathways underlie a set of rare diseases called Congenital Disorders of Glycosylation (CDGs). In particular, DPAGT1-CDG is caused by mutations in DPAGT1, the gene encoding the enzyme that catalyzes the first step in N-glycosylation, and this disorder currently lacks effective therapies. To identify potential therapeutic targets for DPAGT1-CDG, we performed CRISPR knockout screens in Drosophila cells for genes associated with better survival and glycoprotein levels under DPAGT1 inhibition. We identified hundreds of candidate genes that may be of therapeutic benefit. Intriguingly, inhibition of the mannosyltransferase Dpm1, or its downstream glycosylation pathways, could rescue two in vivo models of DPAGT1 inhibition and ER stress, even though impairment of these pathways alone usually causes CDGs. While both in vivo models ostensibly cause cellular stress (through DPAGT1 inhibition or a misfolded protein), we found a novel difference in fructose metabolism that may indicate glycolysis as a modulator of DPAGT1-CDG. Our results provide new therapeutic targets for DPAGT1-CDG, include the unique finding of Dpm1-related pathways rescuing DPAGT1 inhibition, and reveal a novel interaction between fructose metabolism and ER stress.
Naturally produced peptides (<100 amino acids) are important regulators of physiology, development, and metabolism. Recent studies have predicted that thousands of peptides may be translated from transcripts containing small open reading frames (smORFs). Here, we describe two peptides in Drosophila encoded by conserved smORFs, Sloth1 and Sloth2. These peptides are translated from the same bicistronic transcript and share sequence similarities, suggesting that they are paralogs. Yet, Sloth1 and Sloth2 are not functionally redundant, and loss of either peptide causes animal lethality, reduced neuronal function, impaired mitochondrial function, and neurodegeneration. We provide evidence that Sloth1/2 are highly expressed in neurons, imported to mitochondria, and regulate mitochondrial complex III assembly. These results suggest that phenotypic analysis of smORF genes in Drosophila can provide a wealth of information on the biological functions of this poorly characterized class of genes.
Entomopathogenic nematodes are widely used as biopesticides [1,2]. Their insecticidal activity depends on symbiotic bacteria such as Photorhabdus luminescens, which produces toxin complex (Tc) toxins as major virulence factors [3-6]. No protein receptors are known for any Tc toxins, which limits our understanding of their specificity and pathogenesis. Here we use genome-wide CRISPR-Cas9-mediated knockout screening in Drosophila melanogaster S2R+ cells and identify Visgun (Vsg) as a receptor for an archetypal P. luminescens Tc toxin (pTc). The toxin recognizes the extracellular O-glycosylated mucin-like domain of Vsg that contains high-density repeats of proline, threonine and serine (HD-PTS). Vsg orthologues in mosquitoes and beetles contain HD-PTS and can function as pTc receptors, whereas orthologues without HD-PTS, such as moth and human versions, are not pTc receptors. Vsg is expressed in immune cells, including haemocytes and fat body cells. Haemocytes from Vsg knockout Drosophila are resistant to pTc and maintain phagocytosis in the presence of pTc, and their sensitivity to pTc is restored through the transgenic expression of mosquito Vsg. Last, Vsg knockout Drosophila show reduced bacterial loads and lethality from P. luminescens infection. Our findings identify a proteinaceous Tc toxin receptor, reveal how Tc toxins contribute to P. luminescens pathogenesis, and establish a genome-wide CRISPR screening approach for investigating insecticidal toxins and pathogens.
Whole-exome sequencing of two patients with idiopathic complex neurodevelopmental disorder (NDD) identified biallelic variants of unknown significance within FIBCD1, encoding an endocytic acetyl group-binding transmembrane receptor with no known function in the central nervous system. We found that FIBCD1 preferentially binds and endocytoses glycosaminoglycan (GAG) chondroitin sulphate-4S (CS-4S) and regulates GAG content of the brain extracellular matrix (ECM). In silico molecular simulation studies and GAG binding analyses of patient variants determined that these variants cause loss of function by disrupting the FIBCD1-CS-4S association. Gene knockdown in flies resulted in morphological disruption of the neuromuscular junction and motor-related behavioural deficits. In humans and mice, FIBCD1 is expressed in discrete brain regions, including the hippocampus. Fibcd1 KO mice exhibited normal hippocampal neuronal morphology but impaired hippocampal-dependent learning. Further, hippocampal synaptic remodelling in acute slices from Fibcd1 KO mice was deficient but restored upon enzymatically modulating the ECM. Together, we identified FIBCD1 as an endocytic receptor for GAGs in the brain ECM and a novel gene associated with an NDD, revealing a critical role in nervous system structure, function and plasticity.
Organ functions are highly specialized and interdependent. Secreted factors regulate organ development and mediate homeostasis through serum trafficking and inter-organ communication. Enzyme-catalysed proximity labelling enables the identification of proteins within a specific cellular compartment. Here, we report a BirA*G3 mouse strain that enables CRE-dependent promiscuous biotinylation of proteins trafficking through the endoplasmic reticulum. When broadly activated throughout the mouse, widespread labelling of proteins was observed within the secretory pathway. Streptavidin affinity purification and peptide mapping by quantitative mass spectrometry (MS) proteomics revealed organ-specific secretory profiles and serum trafficking. As expected, secretory proteomes were highly enriched for signal peptide-containing proteins, highlighting both conventional and non-conventional secretory processes, and ectodomain shedding. Lower-abundance proteins with hormone-like properties were recovered and validated using orthogonal approaches. Hepatocyte-specific activation of BirA*G3 highlighted liver-specific biotinylated secretome profiles. The BirA*G3 mouse model demonstrates enhanced labelling efficiency and tissue specificity over viral transduction approaches and will facilitate a deeper understanding of secretory protein interplay in development, and in healthy and diseased adult states.
Animal cell lines often undergo extreme genome restructuring events, including polyploidy and segmental aneuploidy that can impede de novo whole-genome assembly (WGA). In some species like Drosophila, cell lines also exhibit massive proliferation of transposable elements (TEs). To better understand the role of transposition during animal cell culture, we sequenced the genome of the tetraploid Drosophila S2R+ cell line using long-read and linked-read technologies. WGAs for S2R+ were highly fragmented and generated variable estimates of TE content across sequencing and assembly technologies. We therefore developed a novel WGA-independent bioinformatics method called TELR that identifies, locally assembles, and estimates allele frequency of TEs from long-read sequence data (https://github.com/bergmanlab/telr). Application of TELR to a ∼130x PacBio dataset for S2R+ revealed many haplotype-specific TE insertions that arose by transposition after initial cell line establishment and subsequent tetraploidization. Local assemblies from TELR also allowed phylogenetic analysis of paralogous TEs, which revealed that proliferation of TE families in vitro can be driven by single or multiple source lineages. Our work provides a model for the analysis of TEs in complex heterozygous or polyploid genomes that are recalcitrant to WGA and yields new insights into the mechanisms of genome evolution in animal cell culture.
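The allele-frequency idea in the abstract above can be illustrated with a minimal sketch. It assumes, hypothetically, that each insertion site yields a count of long reads supporting the TE insertion and a count supporting the empty reference allele. The function names, the example read counts, and the snapping of frequencies to tetraploid dosage classes are illustrative assumptions and do not reproduce TELR's actual implementation or interface.

```python
def te_allele_frequency(te_reads: int, ref_reads: int) -> float:
    """Fraction of reads spanning a site that carry the TE insertion.

    Assumption: read support is proportional to the number of haplotypes
    carrying the insertion (no mapping or length bias).
    """
    total = te_reads + ref_reads
    if total == 0:
        raise ValueError("no reads span this site")
    return te_reads / total


def nearest_copy_number(freq: float, ploidy: int = 4) -> int:
    """Snap a frequency to the closest dosage class k/ploidy (k = 0..ploidy)."""
    return min(range(ploidy + 1), key=lambda k: abs(freq - k / ploidy))


if __name__ == "__main__":
    # Hypothetical site: 33 reads carry the TE, 97 reads do not.
    freq = te_allele_frequency(33, 97)
    copies = nearest_copy_number(freq)
    print(f"frequency = {freq:.3f}, estimated copies = {copies} of 4")
```

Under these assumptions, an insertion present on a single haplotype of a tetraploid genome would be expected near a frequency of 0.25, which is the kind of signal consistent with transposition after tetraploidization.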
The pathophysiological effects of a number of metabolic and age-related disorders can be prevented to some extent by exercise and increased physical activity. However, the molecular mechanisms that contribute to the beneficial effects of muscle activity remain poorly explored. Availability of a fast, inexpensive, and genetically tractable model system for muscle activity and exercise will allow the rapid identification and characterization of molecular mechanisms that mediate the beneficial effects of exercise. Here, we report the development and characterization of an optogenetically-inducible muscle contraction (OMC) model in Drosophila larvae that we used to study acute exercise-like physiological responses. To characterize muscle-specific transcriptional responses to acute exercise, we performed bulk mRNA-sequencing, revealing striking similarities between acute exercise-induced genes in flies and those previously identified in humans. Our larval muscle contraction model opens a path for rapid identification and characterization of exercise-induced factors.
Mechanistic target of rapamycin complex 1 (mTORC1) regulates cell growth and metabolism in response to multiple nutrients, including the essential amino acid leucine [1]. Recent work in cultured mammalian cells established the Sestrins as leucine-binding proteins that inhibit mTORC1 signalling during leucine deprivation [2,3], but their role in the organismal response to dietary leucine remains elusive. Here we find that Sestrin-null flies (Sesn-/-) fail to inhibit mTORC1 or activate autophagy after acute leucine starvation and have impaired development and a shortened lifespan on a low-leucine diet. Knock-in flies expressing a leucine-binding-deficient Sestrin mutant (SesnL431E) have reduced, leucine-insensitive mTORC1 activity. Notably, we find that flies can discriminate between food with or without leucine, and preferentially feed and lay progeny on leucine-containing food. This preference depends on Sestrin and its capacity to bind leucine. Leucine regulates mTORC1 activity in glial cells, and knockdown of Sesn in these cells reduces the ability of flies to detect leucine-free food. Thus, nutrient sensing by mTORC1 is necessary for flies not only to adapt to, but also to detect, a diet deficient in an essential nutrient.
Previously, we described a large collection of Drosophila strains that each carry an artificial exon containing a T2AGAL4 cassette inserted in an intron of a target gene based on CRISPR-mediated homologous recombination. These alleles permit numerous applications and have proven to be very useful. Initially, the homologous recombination-based donor constructs had long homology arms (>500 bp) to promote precise integration of large constructs (>5 kb). Recently, we showed that in vivo linearization of the donor constructs enables insertion of large artificial exons in introns using short homology arms (100-200 bp). Shorter homology arms make it feasible to commercially synthesize homology donors and minimize the cloning steps for donor construct generation. Unfortunately, about 58% of Drosophila genes lack a suitable coding intron for integration of artificial exons in all of the annotated isoforms. Here, we report the development of a new set of constructs that allow the replacement of the coding region of genes that lack suitable introns with a KozakGAL4 cassette, generating a knock-out/knock-in allele that expresses GAL4 in a pattern similar to that of the targeted gene. We also developed custom vector backbones to further facilitate and improve transgenesis. Synthesis of homology donor constructs in custom plasmid backbones that contain the target gene sgRNA obviates the need to inject a separate sgRNA plasmid and significantly increases the transgenesis efficiency. These upgrades will enable the targeting of nearly every fly gene, regardless of exon-intron structure, with a 70-80% success rate.
Recent advances in single-cell sequencing provide a unique opportunity to gain novel insights into the diversity, lineage, and functions of cell types constituting a tissue/organ. Here, we performed a single-nucleus study of the adult Drosophila renal system, consisting of Malpighian tubules and nephrocytes, which shares similarities with the mammalian kidney. We identified 11 distinct clusters representing renal stem cells, stellate cells, regionally specific principal cells, garland nephrocyte cells, and pericardial nephrocytes. Characterization of the transcription factors specific to each cluster identified fruitless (fru) as playing a role in stem cell regeneration and Hepatocyte nuclear factor 4 (Hnf4) in regulating glycogen and triglyceride metabolism. In addition, we identified a number of genes, including Rho guanine nucleotide exchange factor at 64C (RhoGEF64c), Frequenin 2 (Frq2), Prip, and CG1093 that are involved in regulating the unusual star shape of stellate cells. Importantly, the single-nucleus dataset allows visualization of the expression at the organ level of genes involved in ion transport and junctional permeability, providing a systems-level view of the organization and physiological roles of the tubules. Finally, a cross-species analysis allowed us to match the fly kidney cell types to mouse kidney cell types and planarian protonephridia, knowledge that will help the generation of kidney disease models. Altogether, our study provides a comprehensive resource for studying the fly kidney.
Insulin signaling promotes anabolic metabolism to regulate cell growth through multi-omic interactions. To obtain a comprehensive view of the cellular responses to insulin, we constructed a trans-omic network of insulin action in Drosophila cells that involves the integration of multi-omic data sets. In this network, 14 transcription factors, including Myc, coordinately upregulate the gene expression of anabolic processes such as nucleotide synthesis, transcription, and translation, consistent with decreases in metabolites such as nucleotide triphosphates and proteinogenic amino acids required for transcription and translation. Next, as cell growth is required for cell proliferation and insulin can stimulate proliferation in a context-dependent manner, we integrated the trans-omic network with results from a CRISPR functional screen for cell proliferation. This analysis validates the role of a Myc-mediated subnetwork that coordinates the activation of genes involved in anabolic processes required for cell growth.
The Alliance of Genome Resources (the Alliance) is a combined effort of 7 knowledgebase projects: Saccharomyces Genome Database, WormBase, FlyBase, Mouse Genome Database, the Zebrafish Information Network, Rat Genome Database, and the Gene Ontology Resource. The Alliance seeks to provide several benefits: better service to the various communities served by these projects; a harmonized view of data for all biomedical researchers, bioinformaticians, clinicians, and students; and a more sustainable infrastructure. The Alliance has harmonized cross-organism data to provide useful comparative views of gene function, gene expression, and human disease relevance. The basis of the comparative views is shared calls of orthology relationships and the use of common ontologies. The key types of data are alleles and variants, gene function based on gene ontology annotations, phenotypes, association to human disease, gene expression, protein-protein and genetic interactions, and participation in pathways. The information is presented on uniform gene pages that allow facile summarization of information about each gene in each of the 7 organisms covered (budding yeast, roundworm Caenorhabditis elegans, fruit fly, house mouse, zebrafish, brown rat, and human). The harmonized knowledge is freely available on the alliancegenome.org portal, as downloadable files, and by APIs. We expect other existing and emerging knowledge bases to join in the effort to provide the union of useful data and features that each knowledge base currently provides.
For more than 100 years, the fruit fly Drosophila melanogaster has been one of the most studied model organisms. Here, we present a single-cell atlas of the adult fly, Tabula Drosophilae, that includes 580,000 nuclei from 15 individually dissected sexed tissues as well as the entire head and body, annotated to >250 distinct cell types. We provide an in-depth analysis of cell type-related gene signatures and transcription factor markers, as well as sexual dimorphism, across the whole animal. Analysis of common cell types between tissues, such as blood and muscle cells, reveals rare cell types and tissue-specific subtypes. This atlas provides a valuable resource for the Drosophila community and serves as a reference to study genetic perturbations and disease models at single-cell resolution.
Insect salivary glands have previously been shown to function in pupal attachment and food lubrication by secreting factors into the lumen in an exocrine manner. Here, we find in Drosophila that a salivary gland-derived secreted factor (Sgsf) peptide regulates systemic growth in an endocrine manner. Sgsf is specifically expressed in salivary glands and secreted into the hemolymph. Sgsf knockout or salivary gland-specific Sgsf knockdown decreases the size of both the body and organs, phenocopying the effects of genetic ablation of salivary glands, while salivary gland-specific Sgsf overexpression increases their size. Sgsf promotes systemic growth by modulating the secretion of the insulin-like peptide Dilp2 from the brain insulin-producing cells (IPCs) and affecting mechanistic target of rapamycin (mTOR) signaling in the fat body. Altogether, our study demonstrates that Sgsf mediates the roles of salivary glands in Drosophila systemic growth, establishing an endocrine function of salivary glands.
Adaptation to nutrient scarcity involves an orchestrated response of metabolic and signaling pathways to maintain homeostasis. We find that in the fat body of fasting Drosophila, lysosomal export of cystine coordinates remobilization of internal nutrient stores with reactivation of the growth regulator target of rapamycin complex 1 (TORC1). Mechanistically, cystine was reduced to cysteine and metabolized to acetyl-coenzyme A (acetyl-CoA) by promoting CoA metabolism. In turn, acetyl-CoA retained carbons from alternative amino acids in the form of tricarboxylic acid cycle intermediates and restricted the availability of building blocks required for growth. This process limited TORC1 reactivation to maintain autophagy and allowed animals to cope with starvation periods. We propose that cysteine metabolism mediates a communication between lysosomes and mitochondria, highlighting how changes in diet divert the fate of an amino acid into a growth suppressive program.
Expansion of the available repertoire of reagents for visualization and manipulation of proteins will help in understanding their function. Short epitope tags linked to proteins of interest and recognized by existing binders such as nanobodies facilitate protein studies by obviating the need to isolate new antibodies directed against them. Nanobodies have several advantages over conventional antibodies, as they can be expressed and used as tools for visualization and manipulation of proteins in vivo. Here, we characterize two short (<15 aa) NanoTag epitopes, 127D01 and VHH05, and their corresponding high-affinity nanobodies. We demonstrate their use in Drosophila for in vivo protein detection and re-localization, direct and indirect immunofluorescence, immunoblotting, and immunoprecipitation. We further show that CRISPR-mediated gene targeting provides a straightforward approach to tagging endogenous proteins with the NanoTags. Single copies of the NanoTags, regardless of their location, suffice for detection. This versatile and validated toolbox of tags and nanobodies will serve as a resource for a wide array of applications, including functional studies in Drosophila and beyond.
Stem cells constantly divide and differentiate to maintain adult tissue homeostasis, and uncontrolled stem cell proliferation leads to severe diseases such as cancer. How stem cell proliferation is precisely controlled remains poorly understood. Here, from an RNA interference (RNAi) screen in adult Drosophila intestinal stem cells (ISCs), we identify a factor, Yun, required for proliferation of normal and transformed ISCs. Yun is mainly expressed in progenitors; our genetic and biochemical evidence suggests that it acts as a scaffold to stabilize the Prohibitin (PHB) complex previously implicated in various cellular and developmental processes and diseases. We demonstrate that the Yun/PHB complex is regulated by and acts downstream of EGFR/MAPK signaling. Importantly, the Yun/PHB complex interacts with and positively affects the levels of the transcription factor E2F1 to regulate ISC proliferation. In addition, we find that the role of the PHB complex in cell proliferation is evolutionarily conserved. Thus, our study uncovers a Yun/PHB-E2F1 regulatory axis in stem cell proliferation.
Multicellular organisms rely on cell-cell communication to exchange information necessary for developmental processes and metabolic homeostasis. Cell-cell communication pathways can be inferred from transcriptomic datasets based on ligand-receptor expression. Recently, data generated from single-cell RNA sequencing have enabled ligand-receptor interaction predictions at an unprecedented resolution. While computational methods are available to infer cell-cell communication in vertebrates, such a tool does not yet exist for Drosophila. Here, we generated a high-confidence list of ligand-receptor pairs for the major fly signaling pathways and developed FlyPhoneDB, a quantification algorithm that calculates interaction scores to predict ligand-receptor interactions between cells. At the FlyPhoneDB user interface, results are presented in a variety of tabular and graphical formats to facilitate biological interpretation. To illustrate that FlyPhoneDB can effectively identify active ligands and receptors to uncover cell-cell communication events, we applied FlyPhoneDB to Drosophila single-cell RNA sequencing data sets from adult midgut, abdomen, and blood, and demonstrate that FlyPhoneDB can readily identify previously characterized cell-cell communication pathways. Altogether, FlyPhoneDB is an easy-to-use framework that can be used to predict cell-cell communication between cell types from single-cell RNA sequencing data in Drosophila.
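As a rough illustration of how a ligand-receptor interaction score of this kind can be computed, the sketch below scores a sender-receiver cluster pair and assesses it by label permutation. The scoring rule (mean ligand expression in the sender cluster multiplied by mean receptor expression in the receiver cluster), the permutation test, and all function names and toy values are assumptions made for this example. The abstract does not specify the formula, so this should not be read as FlyPhoneDB's actual algorithm.

```python
import random
from statistics import mean


def interaction_score(ligand, receptor, labels, sender, receiver):
    """Assumed score: mean ligand in sender cells x mean receptor in receiver cells."""
    lig = [x for x, lab in zip(ligand, labels) if lab == sender]
    rec = [x for x, lab in zip(receptor, labels) if lab == receiver]
    return mean(lig) * mean(rec)


def permutation_p_value(ligand, receptor, labels, sender, receiver,
                        n_perm=1000, seed=0):
    """Share of label shuffles whose score is at least the observed score."""
    rng = random.Random(seed)
    observed = interaction_score(ligand, receptor, labels, sender, receiver)
    shuffled = list(labels)  # shuffling keeps every cluster's size unchanged
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if interaction_score(ligand, receptor, shuffled, sender, receiver) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy per-cell expression values for one hypothetical ligand-receptor pair.
    ligand = [4.0, 6.0, 0.0, 0.0]
    receptor = [0.0, 0.0, 2.0, 4.0]
    labels = ["A", "A", "B", "B"]
    print(interaction_score(ligand, receptor, labels, "A", "B"))
    print(permutation_p_value(ligand, receptor, labels, "A", "B", n_perm=200))
```

With only four toy cells the permutation p-value is not meaningful; real single-cell data sets contain many cells per cluster, which is what makes a shuffle-based null distribution informative.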