We present HerbKG, a knowledge graph that bridges herbal and molecular medicine. The core bio-entities of HerbKG include herbs, chemicals extracted from the herbs, genes that are affected by the chemicals, and diseases treated by herbs due to the functions of genes. We have developed a learning framework to automate the process of HerbKG construction. The resulting knowledge graph provides extensive herbal-molecular domain knowledge in support of downstream applications.
The Herb Ontology
The herb ontology consists of four entity types (Herb, Chemical, Disease, and Gene) and five coarse-grained relation types: HerbHasCompoundChemical (HHC), HerbTreatsDisease (HTD), ChemicalActsOnDisease (CAD), ChemicalAssociatesGene (CAG), and GeneInfluencesDisease (GID). A description of the entity types is as follows.
Herbs in this study are parts of a plant, or products made from those parts (either fresh or dried), including the leafy green or flowering parts, seeds, bark, roots, and fruits. Examples include "abrus precatorius", "ginkgo biloba", "salvia officinalis", and "cinnamomum cassia".
Chemicals refer to chemical compounds that can be used as medicine. In our study, we mainly focus on the chemicals extracted from herbs. Examples include "essential amino acids", "isoflavanquinones", "diphenhydramine", and "abruquinone A".
Disease refers to a particular abnormal condition that negatively affects the structure or function of all or part of an organism, and that is not due to any immediate external injury. Examples include "anemia", "otoconia", "hypoosmotic swelling", and "gastric ulcer".
Gene refers to a basic unit of heredity and a sequence of nucleotides in DNA or RNA that encodes the synthesis of a gene product, either RNA or protein. Examples include "caspase-3", "AP-1", "Bax", and "cytochrome c".
We also provide a description of the relation types below:
HerbHasCompoundChemical describes a containment relation between a herb and a chemical that is extracted from the herb. A herb may contain one or more chemicals that can be used for medical purposes. For example, cassia bark contains cinnamaldehyde.
HerbTreatsDisease indicates that a herb has a positive effect on the treatment of a disease.
ChemicalActsOnDisease refers to a relation between a chemical and a disease. The effect of the chemical on the disease can be either positive or negative. CAD allows us to understand which chemical extracted from a herb causes which effect on the disease.
ChemicalAssociatesGene describes an association between a chemical and a gene. For example, a study (DOI: 10.3892/or.2015.4493) shows that cinnamaldehyde (a chemical) can inhibit the PI3K/Akt (a gene) signaling pathway, inducing apoptosis and affecting the biological behavior of human colorectal cancer cells.
GeneInfluencesDisease indicates a connection between a gene and a disease. For example, a study (DOI: 10.1254/jphs.FP0061204) shows that AP-1 (a gene) inactivation can inhibit SW620 colon cancer (a disease) cell growth.
https://github.com/FeiYee/HerbKG
A proof-of-concept system is hosted at: http://43.129.228.234:7474/browser/
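To make the ontology concrete, the four entity types and five relation types described above can be written down as a small typed schema. The following is a minimal illustrative sketch, not the HerbKG implementation: the names `EntityType`, `RELATION_SCHEMA`, `Entity`, `Triple`, and `is_valid` are hypothetical, and the head and tail types of each relation are an assumption read off the relation names. The example triples reuse only the entities mentioned in the relation descriptions.

```python
from dataclasses import dataclass
from enum import Enum


class EntityType(Enum):
    HERB = "Herb"
    CHEMICAL = "Chemical"
    DISEASE = "Disease"
    GENE = "Gene"


# Relation abbreviation -> (head type, tail type).
# Assumption: direction is inferred from the relation names in the text.
RELATION_SCHEMA = {
    "HHC": (EntityType.HERB, EntityType.CHEMICAL),     # HerbHasCompoundChemical
    "HTD": (EntityType.HERB, EntityType.DISEASE),      # HerbTreatsDisease
    "CAD": (EntityType.CHEMICAL, EntityType.DISEASE),  # ChemicalActsOnDisease
    "CAG": (EntityType.CHEMICAL, EntityType.GENE),     # ChemicalAssociatesGene
    "GID": (EntityType.GENE, EntityType.DISEASE),      # GeneInfluencesDisease
}


@dataclass(frozen=True)
class Entity:
    name: str
    kind: EntityType


@dataclass(frozen=True)
class Triple:
    head: Entity
    relation: str
    tail: Entity


def is_valid(triple: Triple) -> bool:
    """Return True if the triple's entity types match its relation type."""
    expected = RELATION_SCHEMA.get(triple.relation)
    if expected is None:
        return False  # unknown relation abbreviation
    return (triple.head.kind, triple.tail.kind) == expected


# Entities taken from the examples in the relation descriptions.
cassia_bark = Entity("cassia bark", EntityType.HERB)
cinnamaldehyde = Entity("cinnamaldehyde", EntityType.CHEMICAL)
pi3k_akt = Entity("PI3K/Akt", EntityType.GENE)
ap1 = Entity("AP-1", EntityType.GENE)
colon_cancer = Entity("SW620 colon cancer", EntityType.DISEASE)

EXAMPLE_TRIPLES = [
    Triple(cassia_bark, "HHC", cinnamaldehyde),
    Triple(cinnamaldehyde, "CAG", pi3k_akt),
    Triple(ap1, "GID", colon_cancer),
]

if __name__ == "__main__":
    for t in EXAMPLE_TRIPLES:
        print(t.head.name, t.relation, t.tail.name, is_valid(t))
```

A type check of this kind rejects ill-formed triples, such as an HHC edge whose head is a chemical, before they are added to a graph. Finer-grained attributes, such as whether a CAD effect is positive or negative, are not modeled in this sketch.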