ABSTRACT
Background
Endometriosis is a chronic, estrogen-dependent inflammatory disorder that affects a significant proportion of women of reproductive age. Although the pathophysiology of the disease remains incompletely understood, genetic and hormonal factors are believed to play key roles. Two genes of particular interest in this context are Estrogen Receptor 1 (ESR1) and Growth Regulation by Estrogen in Breast Cancer 1 (GREB1), both of which are integral to estrogen signaling and cell proliferation. This study aimed to investigate the potential contribution of missense Single Nucleotide Polymorphisms (SNPs) in the ESR1 and GREB1 genes to the pathogenesis of endometriosis using an in silico approach.
Materials and Methods
Publicly available data from the National Center for Biotechnology Information and its Single Nucleotide Polymorphism database were used to identify missense variants in ESR1 and GREB1. The functional impact of each variant was predicted using six bioinformatics tools: Sorting Intolerant From Tolerant, Polymorphism Phenotyping v2, Protein Variation Effect Analyzer, SNPs and Gene Ontology, Protein Analysis Through Evolutionary Relationships, and PredictSNP. Protein-protein interaction networks were constructed via the Search Tool for the Retrieval of Interacting Genes/Proteins and Gene Multiple Association Network Integration Algorithm platforms, and disease and pathway associations were analyzed using the Kyoto Encyclopedia of Genes and Genomes and DISEASES databases.
Results
ESR1 was found to be a central node in estrogen signaling, with strong predicted interactions with GREB1 and other hormone-regulated genes. Several SNPs in both genes were consistently classified as deleterious across all predictive tools. Disease enrichment analysis further linked these genes to endometriosis, as well as to other estrogen-responsive conditions such as breast and ovarian cancers.
Conclusion
This study identifies potentially high-risk ESR1 and GREB1 variants and highlights their involvement in key estrogen-regulated pathways. These findings support the role of genetic variation in the molecular pathogenesis of endometriosis and lay the groundwork for future experimental validation.
Introduction
Endometriosis is a chronic, estrogen-dependent inflammatory disorder characterized by the presence of functional endometrial tissue outside the uterine cavity. Although the ectopic endometrial lesions are most frequently located within the pelvic region, affecting structures such as the ovaries, pouch of Douglas, sacrouterine ligaments, pelvic peritoneum, rectovaginal septum, and cervix, there are documented cases of extra-pelvic involvement. Rarely, endometriotic foci have been identified in organs including the lungs, pleura, diaphragm, intestines, gallbladder, kidneys, ureters, umbilicus, skin, central nervous system, and extremities (1, 2).
The prevalence of endometriosis among women of reproductive age ranges from 3% to 37%, and despite its high frequency and significant impact on quality of life and fertility, the pathogenesis of the disease remains incompletely understood (3). One of the major contributing factors to this knowledge gap is the complex nature of its genetic background. Current evidence suggests a polygenic and multifactorial inheritance pattern, wherein disease development results from a combination of genetic predisposition and environmental influences (4).
Identifying specific genetic contributors is complicated by several factors. The necessity for invasive procedures, such as laparoscopy or laparotomy, for definitive diagnosis limits early detection and may result in underdiagnosis (5).
Furthermore, endometriosis is now considered a heterogeneous condition encompassing multiple subtypes such as superficial peritoneal lesions, ovarian endometriomas, and deeply infiltrating endometriosis, each with potentially distinct genetic and molecular characteristics. Environmental exposures, particularly to endocrine-disrupting chemicals like dioxins, may further influence disease development and expression (6, 7).
In this context, genes such as Estrogen Receptor 1 (ESR1) and Growth Regulation by Estrogen in Breast Cancer 1 (GREB1) have gained attention due to their pivotal roles in estrogen signaling, cell proliferation, and endometrial receptivity, all of which are relevant to the etiology and progression of endometriosis (7-11). This study aims to explore the potential contribution of missense Single Nucleotide Polymorphisms (SNPs) in the ESR1 and GREB1 genes to the pathogenesis of endometriosis using a comprehensive in silico bioinformatics approach. By evaluating the functional impact of these genetic variants, mapping protein-protein interactions (PPIs), and analyzing disease-associated pathways, we seek to identify high-risk mutations and elucidate possible molecular mechanisms through which these genes may influence the development and progression of endometriosis.
Materials and Methods
Retrieval of Protein Sequences and Missense Variants for ESR1 and GREB1 Genes
Publicly available data from the National Center for Biotechnology Information (NCBI) and the NCBI Single Nucleotide Polymorphism database (dbSNP) were used to investigate the ESR1 and GREB1 genes associated with endometriosis. Protein sequences and known SNPs for both genes were retrieved and analyzed. The focus was on missense mutations, as these variants result in amino acid changes that may alter the protein’s structure and impair its normal biological function. Such changes can affect processes like hormone binding or gene regulation, which are critical in the pathogenesis of endometriosis. Bioinformatics tools were then applied to evaluate the potential effects of these mutations on protein function (12, 13).
Interaction Analysis of GREB1 and ESR1
To explore the functional and physical interactions involving the GREB1 and ESR1 genes, the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database (version 11.5) was employed using a medium confidence interaction score threshold (≥0.4). This platform was used to build a comprehensive PPI network and to predict associations based on known and predicted interactions. In parallel, the Gene Multiple Association Network Integration Algorithm (GeneMANIA) tool (version 3.5.2) was used to further investigate gene-gene relationships and to identify additional genes functionally linked to GREB1 and ESR1. This analysis included co-expression, shared pathways, co-localization, and physical interaction data. The results obtained from GeneMANIA were cross-referenced with the STRING analysis to confirm the consistency and biological relevance of the predicted interactions. All computational analyses were conducted between February 2 and 8, 2025, ensuring up-to-date and reliable data integration (14, 15).
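The thresholding step described above can be illustrated with a minimal Python sketch. It assumes that network edges are available as pairs of protein names with a combined score on a 0 to 1 scale; the Edge type, the helper names, and the example scores are hypothetical placeholders and are not values retrieved from STRING.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    protein_a: str
    protein_b: str
    combined_score: float  # assumed to be on a 0-1 scale


MEDIUM_CONFIDENCE = 0.4  # threshold stated in the Methods


def filter_edges(edges, threshold=MEDIUM_CONFIDENCE):
    """Keep only edges whose combined score is at or above the threshold."""
    return [e for e in edges if e.combined_score >= threshold]


def neighbours(edges, protein):
    """Return the set of partners connected to `protein` in `edges`."""
    partners = set()
    for e in edges:
        if e.protein_a == protein:
            partners.add(e.protein_b)
        elif e.protein_b == protein:
            partners.add(e.protein_a)
    return partners


# Placeholder scores for illustration only; not retrieved from STRING.
example_edges = [
    Edge("ESR1", "GREB1", 0.90),
    Edge("ESR1", "PGR", 0.95),
    Edge("GREB1", "TFF1", 0.35),
]

kept = filter_edges(example_edges)
esr1_partners = neighbours(kept, "ESR1")
print(sorted(esr1_partners))
```

Applying the same filter to edge lists from both platforms would be one simple way to compare the partner sets they report for ESR1 and GREB1.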
Identifying the Most Deleterious SNPs
To assess the potential functional consequences of non-synonymous SNPs identified in the ESR1 and GREB1 genes, six independent in silico prediction tools were employed: Sorting Intolerant From Tolerant (SIFT) (https://sift.jcvi.org), Protein ANalysis THrough Evolutionary Relationships (PANTHER) (https://www.pantherdb.org/tools), Polymorphism Phenotyping v2 (PolyPhen-2) (https://genetics.bwh.harvard.edu/pph2/), SNPs&Gene Ontology (GO) (https://snps.biofold.org/snps-and-go/), Protein Variation Effect Analyzer (PROVEAN) (https://provean.jcvi.org), and PredictSNP (https://loschmidt.chemi.muni.cz/predictsnp). These tools were used to evaluate the likelihood of deleterious effects caused by each amino acid substitution. Variants that were consistently classified as damaging by all six tools were considered to be high-risk mutations with strong potential to impair protein function. Each tool applies a different algorithm to determine the pathogenicity of SNPs. SIFT utilizes sequence homology to determine whether an amino acid change is tolerated, flagging substitutions with a probability score below 0.05 as deleterious. PANTHER evaluates evolutionary conservation and functional domains to estimate the effect of substitutions. PolyPhen-2 predicts the potential structural and functional consequences of amino acid changes based on multiple sequence alignments and protein structure features. SNPs&GO integrates gene ontology data with machine learning (support vector machine-based) models to associate mutations with disease. PROVEAN applies a sequence-based approach to assess whether amino acid substitutions are functionally disruptive, using a cutoff score of -2.5 to classify variants. Lastly, PredictSNP combines predictions from several algorithms (including SIFT, PolyPhen-2, Multivariate Analysis of Protein Polymorphism, Screening for Non-Acceptable Polymorphisms, and Predictor of Human Deleterious-SNP) to generate a consensus assessment of each SNP’s deleterious potential.
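The consensus rule described above (a variant is high-risk only when all six tools agree) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: each tool's output has already been reduced to a deleterious/tolerated call, the SIFT and PROVEAN cutoffs are those given in the text (treating PROVEAN scores at or below -2.5 as deleterious is an assumption about how the boundary is handled), and the function names and example scores are hypothetical.

```python
SIFT_CUTOFF = 0.05     # scores below this are flagged as deleterious (per the text)
PROVEAN_CUTOFF = -2.5  # scores at or below this are treated as deleterious here

TOOLS = ("SIFT", "PolyPhen-2", "PROVEAN", "SNPs&GO", "PANTHER", "PredictSNP")


def sift_call(score):
    """True if a SIFT score indicates a deleterious substitution."""
    return score < SIFT_CUTOFF


def provean_call(score):
    """True if a PROVEAN score indicates a deleterious substitution."""
    return score <= PROVEAN_CUTOFF


def is_high_risk(calls):
    """Return True only if every one of the six tools calls the variant deleterious.

    `calls` maps tool name -> bool (True = deleterious). A missing tool raises
    ValueError so that an incomplete prediction set is never mistaken for consensus.
    """
    missing = [tool for tool in TOOLS if tool not in calls]
    if missing:
        raise ValueError(f"missing predictions for: {', '.join(missing)}")
    return all(calls[tool] for tool in TOOLS)


# Hypothetical scores and calls for a single variant, for illustration only.
example_calls = {
    "SIFT": sift_call(0.01),
    "PolyPhen-2": True,
    "PROVEAN": provean_call(-3.1),
    "SNPs&GO": True,
    "PANTHER": True,
    "PredictSNP": True,
}
print(is_high_risk(example_calls))
```

Requiring unanimity is deliberately conservative: a single tolerated call is enough to remove a variant from the high-risk set.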
Pathway and Disease Association Analysis of GREB1 and ESR1
Pathway and disease analyses for the GREB1 and ESR1 genes were performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database to explore their roles in essential molecular pathways, particularly those associated with hormone signaling and estrogen-responsive mechanisms relevant to endometriosis. Access to the KEGG pathway data was facilitated through the KEGG application programming interface, allowing systematic mapping of gene functions in biological processes such as estrogen signaling, cell proliferation, and transcriptional regulation.
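As an illustration of programmatic access of this kind, the sketch below builds a request URL for the KEGG REST "link" operation and parses its tab-separated response into a gene-to-pathway mapping. It is not the exact procedure used in this study. The gene identifier hsa:2099 is assumed to be the KEGG entry for human ESR1, and the sample response is hand-written in the format the endpoint returns rather than retrieved output; no network request is made.

```python
KEGG_REST = "https://rest.kegg.jp"


def kegg_link_url(gene_id):
    """Build the URL that lists the pathways linked to a KEGG gene entry."""
    return f"{KEGG_REST}/link/pathway/{gene_id}"


def parse_link_response(text):
    """Parse tab-separated 'gene<TAB>path:ID' lines into {gene: {pathway IDs}}."""
    mapping = {}
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        gene, pathway = line.split("\t")
        # Drop the 'path:' prefix so only the pathway identifier is stored.
        mapping.setdefault(gene, set()).add(pathway.split(":", 1)[-1])
    return mapping


# Hand-written sample in the expected format; not an actual KEGG response.
sample_response = "hsa:2099\tpath:hsa04915\nhsa:2099\tpath:hsa05224\n"

url = kegg_link_url("hsa:2099")
pathways = parse_link_response(sample_response)
print(url)
print(sorted(pathways["hsa:2099"]))
```

Keeping URL construction separate from parsing allows the parsing step to be checked offline before any live query is issued.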
To complement these findings, disease associations were extracted from the DISEASES database (JensenLab, 2024 version), which provided insight into the clinical relevance of GREB1 and ESR1 in endometriosis and other hormone-related disorders. Additionally, the STRING database was used to construct PPI networks, further validating the involvement of these genes in interconnected regulatory systems. This integrated bioinformatics approach revealed key functional pathways and disease links associated with GREB1 and ESR1 (16-18).
Statistical Analysis
All bioinformatics and in silico statistical analyses were conducted using integrated online platforms and computational tools. Functional predictions of missense variants were obtained from SIFT, PolyPhen-2, PROVEAN, PANTHER, SNPs&GO, and PredictSNP web servers. Protein-protein interaction networks were analyzed via STRING (version 11.5; European Molecular Biology Laboratory, Heidelberg, Germany) and GeneMANIA (version 3.5.2; University of Toronto, Toronto, Canada). Pathway and disease enrichment analyses were performed using the KEGG database (KEGG, Kyoto University, Kyoto, Japan) and DISEASES database (JensenLab, Copenhagen, Denmark). All analyses were performed between February 2 and February 10, 2025, and descriptive statistics were automatically calculated by the respective bioinformatics servers.
Results
Identifying the Most Deleterious SNPs
The initial step of our analysis involved the identification and curation of SNPs within the GREB1 and ESR1 genes, both of which are implicated in estrogen signaling and have been associated with hormone-dependent conditions including endometriosis. Table 1 presents the complete list of selected variants, annotated with reference SNP cluster IDs, allelic composition, ancestral alleles, Human Genome Variation Society-compliant transcript-based nomenclature, chromosomal positions, and minor allele frequencies (MAFs). Importantly, all variants listed under ESR1 are exonic and classified as missense mutations, and are thus eligible for functional prediction analysis via in silico tools such as SIFT, PolyPhen-2, and PROVEAN. In contrast, all GREB1 variants in our dataset are located in intronic regions, rendering them non-coding and outside the scope of classical missense-based prediction algorithms; they were therefore excluded from the functional prediction analyses. Nevertheless, although this study primarily focused on missense variants, these GREB1 variants were retained in Table 1 due to their high population frequency and potential regulatory roles, as suggested by previous genome-wide association and transcriptomic studies linking GREB1 expression to estrogen-mediated proliferation in endometrial tissues.
Among the ESR1 variants, rs753014570 (c.728G>A) and rs779180038 (c.727C>T) occur in close proximity within the coding sequence, possibly affecting the same functional domain, and may act in tandem as a multi-nucleotide polymorphism in certain haplotypes. Variant rs773500294 also appears as a duplicated entry in public databases, with different reported alternative alleles (C>A and C>G), which requires cautious interpretation due to possible annotation inconsistencies. The low MAFs (<0.01) of several ESR1 variants suggest they may represent rare, potentially pathogenic alterations with relevance to disease susceptibility. These prioritized SNPs served as the foundation for downstream analyses, including PPI mapping and disease association profiling.
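The proximity of c.727C>T and c.728G>A can be made explicit with a short calculation. The Python sketch below converts a coding-sequence position into a codon number and a position within that codon, assuming the usual convention that c.1 is the first base of the start codon; the function name is hypothetical, and only simple substitutions are handled.

```python
import re

# Matches simple coding-sequence substitutions such as "c.728G>A".
_SUBSTITUTION = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")


def codon_of(hgvs_c):
    """Return (codon number, position within codon), both 1-based."""
    match = _SUBSTITUTION.match(hgvs_c)
    if match is None:
        raise ValueError(f"not a simple coding substitution: {hgvs_c!r}")
    position = int(match.group(1))
    codon = (position - 1) // 3 + 1
    offset = (position - 1) % 3 + 1
    return codon, offset


print(codon_of("c.727C>T"))  # (243, 1)
print(codon_of("c.728G>A"))  # (243, 2)
```

Under this numbering, both variants fall in codon 243 (first and second base, respectively), which is consistent with the possibility that they act together as a multi-nucleotide polymorphism affecting a single residue.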
Interaction Analysis of GREB1 and ESR1
PPI analysis revealed that ESR1 occupies a central position within the interaction network, engaging in numerous functional associations with other proteins relevant to estrogen signaling and transcriptional regulation. Notably, GREB1 and its paralog GREB1L demonstrated strong connectivity with ESR1, supporting their known roles as estrogen-responsive genes. The presence of thick interaction lines indicates high-confidence associations, suggesting a direct regulatory relationship. Similarly, a prominent interaction was observed between ESR1 and progesterone receptor (PGR), highlighting the interplay between estrogen and progesterone pathways in hormone-regulated tissues (Figure 1).
The corresponding interaction network is presented in Figure 1. In the GeneMANIA-derived visualization, different edge colors represent distinct types of functional associations: pink lines indicate co-expression, blue lines denote physical interactions, green lines correspond to co-localization, and orange lines reflect predicted interactions. These integrated networks provide evidence for the functional linkage between ESR1 and GREB1, particularly within estrogen-responsive signaling pathways.
Disease association analysis performed using the DISEASES database (JensenLab) revealed that both ESR1 and GREB1 are strongly linked to a variety of hormone-dependent and estrogen-responsive conditions. ESR1 showed high-confidence associations with several diseases, most notably breast cancer (Z: 9.0), carcinoma (Z: 7.4), endometriosis (Z: 7.1), and ovarian cancer (Z: 6.6). These associations reflect ESR1’s pivotal role in estrogen signaling, transcriptional regulation, and reproductive tissue homeostasis.
Similarly, GREB1, a gene regulated by ESR1 and known to mediate estrogen-stimulated cell proliferation, also demonstrated associations with estrogen-sensitive pathologies. The strongest connections were observed with breast cancer (Z: 5.3), endometriosis (Z: 4.7), and amelogenesis imperfecta type 1G (Z: 4.6), as well as various gynecologic malignancies such as uterine cancer, ovarian cancer, and uterine fibroids (Figures 2 and 3).
Collectively, these findings reinforce the functional interplay between ESR1 and GREB1 in estrogen-regulated pathways and highlight their shared involvement in the pathogenesis of endometriosis and other hormone-related disorders.
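The overlap between the two disease profiles can be tabulated directly from the Z-scores reported above. The following Python sketch uses only those reported values; the variable names are illustrative, and ranking shared diseases by the lower of the two scores is one possible convention rather than a method used in this study.

```python
# Z-scores as reported in the text for each gene.
esr1_z = {
    "breast cancer": 9.0,
    "carcinoma": 7.4,
    "endometriosis": 7.1,
    "ovarian cancer": 6.6,
}
greb1_z = {
    "breast cancer": 5.3,
    "endometriosis": 4.7,
    "amelogenesis imperfecta type 1G": 4.6,
}

# Diseases with a reported Z-score for both genes, ranked by the weaker score.
shared = sorted(
    set(esr1_z) & set(greb1_z),
    key=lambda disease: min(esr1_z[disease], greb1_z[disease]),
    reverse=True,
)

for disease in shared:
    print(f"{disease}: ESR1 Z={esr1_z[disease]}, GREB1 Z={greb1_z[disease]}")
```

Among the conditions with a reported score for both genes, breast cancer and endometriosis are the two shared associations.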
Figure 4 shows the representation of the estrogen signaling pathway based on the KEGG pathway map. The pathway includes both membrane-initiated and nuclear-initiated steroid signaling mechanisms. ESR1 acts as a central transcription factor activated by estrogen, leading to downstream signaling events including activation of MAPK/ERK and PI3K/AKT pathways. GREB1, indicated as a target gene, is transcriptionally regulated by ESR1 upon estrogen binding, suggesting its role as a downstream effector in estrogen-dependent biological processes such as cell proliferation, differentiation, and survival.
Discussion
In this study, a comprehensive in silico analysis was performed to investigate the potential contribution of missense SNPs in the ESR1 and GREB1 genes to the pathogenesis of endometriosis. These genes were selected due to their critical roles in estrogen signaling, cell proliferation, and reproductive tissue regulation, all of which are highly relevant to the etiology of endometriosis (7-11). By integrating data from multiple bioinformatics platforms—including SNP prediction tools, PPI networks, and disease association databases—we sought to identify high-risk variants that may influence disease susceptibility and progression.
Our PPI analysis revealed that ESR1 serves as a central hub within the estrogen signaling network, demonstrating strong associations with GREB1 and other key genes such as PGR, CYP1B1, and CTNNB1 (14, 15). These interactions support previous findings that ESR1 and GREB1 are not only co-expressed but also functionally interlinked in hormone-responsive pathways (8, 10, 11).
Further connections between ESR1 and components of the RNA polymerase II complex (including POLR2A, POLR2F, POLR2J, among others) emphasize its role in the transcriptional activation of downstream target genes. Additionally, interactions with genes such as CYP1B1, TFF1, CTNNB1, and SAFB reflect ESR1’s broad involvement in cellular processes including hormone metabolism, cell proliferation, and chromatin remodeling (14, 16).
Beyond the molecular pathway relevance of these genes, we examined the clinical significance of the identified variants by exploring existing literature and variant databases to determine whether these polymorphisms have previously been associated with endometriosis or other estrogen-dependent conditions. While none of the ESR1 or GREB1 variants listed in Table 1 has been directly linked to endometriosis in large genome-wide association studies, some, such as ESR1 rs753014570 (c.728G>A), have been implicated in hormone-responsive cancers including breast and ovarian cancer, where dysregulated estrogen signaling is a common pathological feature (19, 20). This overlap is noteworthy, given the shared molecular mechanisms between these diseases and endometriosis, including estrogen-driven proliferation, progesterone resistance, and inflammatory microenvironment remodeling. Additionally, the low-frequency variants identified in ESR1 (e.g., rs779180038, rs746521050) may represent rare, potentially functional mutations that could alter receptor conformation, DNA binding affinity, or cofactor recruitment, ultimately influencing downstream gene transcription. Although the GREB1 variants identified in this study are intronic and have not been directly associated with endometriosis, prior evidence suggests that regulatory SNPs in intronic regions can affect gene expression via splicing efficiency, enhancer disruption, or transcription factor binding site modulation (21, 22). Therefore, these variants may contribute to altered GREB1 expression levels in estrogen-responsive tissues. Future experimental validation and population-based association studies are required to assess the biological significance of these candidate variants in endometriosis pathogenesis (23, 24).
The functional link between ESR1 and GREB1, in particular, underscores a shared role in estrogen-mediated gene expression, suggesting that genetic variants affecting these proteins may contribute to the molecular pathology of endometriosis (10, 11). The rationale for selecting ESR1 and GREB1 in this study stems from their well-established roles in estrogen signaling, which is central to the pathogenesis of endometriosis (25, 26). ESR1 encodes Estrogen Receptor α (ERα), a nuclear hormone receptor that regulates the transcription of estrogen-responsive genes upon ligand binding (27, 28). GREB1 is one such early response gene directly upregulated by ESR1 via estrogen-bound ERα complexes (29). Multiple studies have demonstrated that GREB1 expression is tightly correlated with estrogen stimulation in hormone-responsive tissues including the endometrium and that it functions as a key mediator of estrogen-driven cellular proliferation and differentiation (30-32). Specifically, chromatin immunoprecipitation assays have shown that ERα binds to enhancer regions within the GREB1 gene locus, activating its transcription (33). This regulatory axis is critical in endometrial biology, as dysregulation of estrogen signaling is known to promote the ectopic growth and invasiveness characteristic of endometriotic lesions. Therefore, the functional interplay between ESR1 and GREB1 reflects a direct transcriptional hierarchy, wherein polymorphisms in either gene may disrupt normal hormonal responses, leading to altered gene expression patterns that favor the development or persistence of endometriosis (8-34,35).
Several missense variants in ESR1 were identified, some of which were predicted to be deleterious across multiple algorithms; the GREB1 variants, being intronic, were not subjected to missense-based prediction. Variants such as rs779180038 and rs753014570, although classified as multi-nucleotide variants with ambiguous impact, highlight the complexity of interpreting in silico predictions and the necessity for future experimental validation. These findings suggest that specific SNPs may alter protein structure or function, potentially disrupting ER activity or its downstream gene targets (12, 13).
Pathway and disease enrichment analyses supported these observations, linking ESR1 and GREB1 not only to endometriosis but also to other estrogen-dependent conditions such as breast cancer, ovarian cancer, and uterine fibroids (16, 17). These overlapping associations underline the shared molecular mechanisms underlying these diseases and reinforce the importance of studying ESR1 and GREB1 in a broader hormonal context (7-9).
Collectively, our results emphasize the value of integrated bioinformatics approaches in identifying candidate variants for further investigation. While in silico predictions provide important insights, they should be followed by functional assays and population-based studies to validate the clinical relevance of the identified mutations. Understanding how these genes and their variants contribute to estrogen signaling and endometrial pathophysiology may ultimately aid in the development of more personalized diagnostic and therapeutic strategies for endometriosis.
Conclusion
Although in silico approaches cannot fully replace experimental validation, they serve as valuable tools for prioritizing candidate variants for further functional and clinical research. The integration of these results with future laboratory and population-level studies may enhance our understanding of endometriosis and facilitate the development of targeted diagnostic and therapeutic strategies.