[sider, chemogenomics_hub, bindingdb_ligand, bindingdb_protein, uniprot_hub, gi2uniprot, gi, pubchem_bioassay]

[side effects --> sider --> chemogenomics_hub --> bindingdb_ligand --> bindingdb_protein --> uniprot_hub --> gi2uniprot --> gi --> pubchem_bioassay --> biological/cellular assay]

[Terminus(User Input) --> DATASOURCE1 --> DATASOURCE2 --> DATASOURCE3 --> ... --> DATASOURCE7 --> Terminus(User Input)]

Step 1: Identify the inlink/outlink for each data source


DrugBankDrug.Inlink: CID

DrugBankDrug.Outlink: DBID


DrugBankTarget.Inlink: DBID

DrugBankTarget.Outlink: GeneSymbol
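
The inlink/outlink table from Step 1 could be held in a small lookup structure. The sketch below is only an illustration: the names Links, LINKS and can_chain are hypothetical, the last entry above is read as the outlink of DrugBankTarget, and the chaining rule (the upstream outlink must equal the downstream inlink) is an assumption drawn from the DBID example.

```python
from typing import Dict, NamedTuple


class Links(NamedTuple):
    inlink: str   # property shared with the previous data source
    outlink: str  # property shared with the next data source


# Entries copied from the two examples in Step 1; other sources would be
# registered the same way (hypothetical structure, not the actual implementation).
LINKS: Dict[str, Links] = {
    "DrugBankDrug": Links(inlink="CID", outlink="DBID"),
    "DrugBankTarget": Links(inlink="DBID", outlink="GeneSymbol"),
}


def can_chain(upstream: str, downstream: str) -> bool:
    """Assumed rule: two sources join when outlink(upstream) == inlink(downstream)."""
    return LINKS[upstream].outlink == LINKS[downstream].inlink
```

With these entries, DrugBankDrug can be followed by DrugBankTarget because both share DBID, but not the other way round.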


Step 2: Generate Partial SPARQL for Each Data Source

?PrimaryKey DataSource:InLink ?InLink
?PrimaryKey DataSource:OutLink ?OutLink
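
A minimal sketch of how the two triple patterns might be generated for one data source in the general case. The function name partial_sparql and the variable naming scheme (one ?<source>_pk variable per source, link variables named after the link itself) are assumptions made for illustration, chosen so that a link shared by two sources ends up bound to the same variable.

```python
from typing import List


def partial_sparql(source: str, inlink: str, outlink: str) -> List[str]:
    """Return the two triple patterns for one data source (general case)."""
    pk = f"?{source}_pk"  # hypothetical naming: one primary-key variable per source
    return [
        f"{pk} {source}:{inlink} ?{inlink} .",
        f"{pk} {source}:{outlink} ?{outlink} .",
    ]


# Example using the DrugBankDrug entry from Step 1.
for line in partial_sparql("DrugBankDrug", "CID", "DBID"):
    print(line)
```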

Exceptional Case
The PrimaryKey is itself either the InLink or the OutLink
?PrimaryKey DataSource: [InLink, Outlink]/?PrimaryKey ? [InLink, Outlink]/?PrimaryKey

Step 3: Iterate through the Link

Exceptional Case
  1. The first data source only contains outlink
  2. Similarly, the last one only contains inlink
  3. Two terminal data points are determined by user inputs
  • ?PrimaryKey DataSourceStart:Terminus(User Input) ?Terminus(User Input)
  • Prepend to the generated SPARQL

  • ?PrimaryKey DataSourceEnd:Terminus(User Input) ?Terminus(User Input)
  • Append to the generated SPARQL
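
The iteration in Step 3, including the terminal handling, can be sketched roughly as follows. This is an assumption-laden illustration rather than the implemented code: chain_patterns, the tuple layout, and the ?<source>_pk variable names are hypothetical, and the terminus names passed in stand for whatever the user supplies.

```python
from typing import List, Optional, Tuple

# (source name, inlink or None, outlink or None)
Source = Tuple[str, Optional[str], Optional[str]]


def chain_patterns(sources: List[Source], start_term: str, end_term: str) -> List[str]:
    """Walk the chain and emit triple patterns, adding the two user-given termini."""
    body: List[str] = []
    for name, inlink, outlink in sources:
        pk = f"?{name}_pk"
        if inlink is not None:   # the first source has no inlink
            body.append(f"{pk} {name}:{inlink} ?{inlink} .")
        if outlink is not None:  # the last source has no outlink
            body.append(f"{pk} {name}:{outlink} ?{outlink} .")
    first, last = sources[0][0], sources[-1][0]
    head = f"?{first}_pk {first}:{start_term} ?{start_term} ."  # prepended
    tail = f"?{last}_pk {last}:{end_term} ?{end_term} ."        # appended
    return [head] + body + [tail]


# Two-source chain from Step 1, with CID and GeneSymbol as the user-chosen termini.
chain = [("DrugBankDrug", None, "DBID"), ("DrugBankTarget", "DBID", None)]
print("\n".join(chain_patterns(chain, "CID", "GeneSymbol")))
```

Because both sources bind the shared link to the same variable (?DBID here), the join between them falls out of the generated patterns without any extra step.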

Steps 1 to 3 have been implemented

Step 4: Add the Prefix (SELECT, WHERE, FILTER)
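
One possible way to wrap the generated triple patterns in SELECT, WHERE and an optional FILTER is sketched below. The function wrap_query and its parameters are hypothetical, and the CID value in the example is purely illustrative. PREFIX declarations are left out because the namespace URIs of the data sources are not given here.

```python
from typing import List, Optional


def wrap_query(patterns: List[str], select_vars: List[str],
               filter_expr: Optional[str] = None) -> str:
    """Assemble a SPARQL query string around already generated triple patterns."""
    lines = [f"SELECT {' '.join(select_vars)}", "WHERE {"]
    lines += [f"  {p}" for p in patterns]
    if filter_expr:
        # Only a single plain FILTER expression is handled in this sketch.
        lines.append(f"  FILTER ({filter_expr})")
    lines.append("}")
    return "\n".join(lines)


patterns = [
    "?DrugBankDrug_pk DrugBankDrug:CID ?CID .",
    "?DrugBankDrug_pk DrugBankDrug:DBID ?DBID .",
]
print(wrap_query(patterns, ["?CID", "?DBID"], "?CID = 2244"))
```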

Crucial Implementation Issues
  1. Need to build more interface capability for event handling
  2. Whether to select all the intermediate relevant information and present it back to the user
  3. FILTER and COUNT are more complicated to generate