[sider, chemogenomics_hub, bindingdb_ligand, bindingdb_protein, uniprot_hub, gi2uniprot, gi, pubchem_bioassay]

[side effects --> sider --> chemogenomics_hub --> bindingdb_ligand --> bindingdb_protein --> uniprot_hub --> gi2uniprot --> gi --> pubchem_bioassay --> biological/cellular assay]


[Terminus(User Input) --> DATASOURCE1 --> DATASOURCE2 --> DATASOURCE3 --> ... --> DATASOURCE7 --> Terminus(User Input)]

Step 1: Identify the inlink/outlink for each data source

EXAMPLE 1

DrugBankDrug.Inlink: CID

DrugBankDrug.Outlink: DBID

EXAMPLE 2

DrugBankTarget.Inlink: DBID

DrugBankTarget.Outlink: GeneSymbol
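
A minimal Python sketch of how the inlink/outlink metadata from the two examples could be recorded. The `DataSource` class and the `can_follow` helper are hypothetical names introduced for illustration, and the sketch reads GeneSymbol as the outlink of DrugBankTarget. Two sources can be chained when the outlink of the first matches the inlink of the second.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataSource:
    """Hypothetical record of one data source and its two link fields."""
    name: str
    inlink: Optional[str]   # field used to enter this source
    outlink: Optional[str]  # field used to leave this source


# Values taken from EXAMPLE 1 and EXAMPLE 2.
drugbank_drug = DataSource("DrugBankDrug", inlink="CID", outlink="DBID")
drugbank_target = DataSource("DrugBankTarget", inlink="DBID", outlink="GeneSymbol")


def can_follow(prev: DataSource, nxt: DataSource) -> bool:
    """True when prev's outlink is the field that nxt expects as its inlink."""
    return prev.outlink is not None and prev.outlink == nxt.inlink


print(can_follow(drugbank_drug, drugbank_target))  # True: both share DBID
```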

SCHEMA.GIF

Step 2: Generate Partial SPARQL for Each Data Source


Format
?PrimaryKey DataSource:InLink ?InLink
?PrimaryKey DataSource:OutLink ?OutLink

Exceptional Case
The PrimaryKey is itself either the InLink or the OutLink
?PrimaryKey DataSource:[InLink, OutLink]/?PrimaryKey ?[InLink, OutLink]/?PrimaryKey
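
The format and its exceptional case might be generated along the following lines. This is a sketch under stated assumptions, not the implemented code: `partial_sparql` is a hypothetical helper, each source gets its own subject variable (`?<Source>_pk`) so that different sources do not collide, and the exceptional case is read as "the link that doubles as the primary key becomes the subject, so only the other link needs a triple".

```python
from typing import List, Optional


def partial_sparql(source: str,
                   inlink: Optional[str],
                   outlink: Optional[str],
                   primary_key: Optional[str] = None) -> List[str]:
    """Return the triple patterns for one data source.

    primary_key names the link that is also the primary key, if any
    (assumed interpretation of the exceptional case).
    """
    # Normal case: a per-source subject variable.
    # Exceptional case: the key link's own variable is the subject.
    subject = f"?{primary_key}" if primary_key else f"?{source}_pk"
    patterns = []
    for link in (inlink, outlink):
        if link is None or link == primary_key:
            continue  # nothing to emit for a missing link or for the key itself
        patterns.append(f"{subject} {source}:{link} ?{link} .")
    return patterns


for line in partial_sparql("DrugBankDrug", "CID", "DBID"):
    print(line)
# Exceptional case, assuming DBID were the primary key of the source:
for line in partial_sparql("DrugBankDrug", "CID", "DBID", primary_key="DBID"):
    print(line)
```

Because the variable name is taken from the link name, two adjacent sources that share a link (for example DBID) end up sharing the variable `?DBID`, which is what joins their patterns.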

Step 3: Iterate through the Links

Exceptional Cases
  1. The first data source contains only an outlink
  2. Similarly, the last data source contains only an inlink
  3. The two terminal data points are determined by user input
  • ?PrimaryKey DataSourceStart:Terminus(User Input) ?Terminus(User Input)
  • Prepend to the generated SPARQL

  • ?PrimaryKey DataSourceEnd:Terminus(User Input) ?Terminus(User Input)
  • Append to the generated SPARQL
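
The iteration and the three exceptional cases could look roughly like this in Python. It is an illustrative sketch only: `chain_patterns` and the `?<Source>_pk` variable naming are assumptions, and the termini used in the example (CID at the start, GeneSymbol at the end) are merely plausible user choices borrowed from the Step 1 examples.

```python
from typing import List, Optional, Tuple

# (name, inlink, outlink); a missing link is None
Source = Tuple[str, Optional[str], Optional[str]]


def chain_patterns(sources: List[Source],
                   start_terminus: str,
                   end_terminus: str) -> List[str]:
    """Walk the chain, emit one triple per available link, then add the termini."""
    lines = []
    for name, inlink, outlink in sources:
        for link in (inlink, outlink):
            if link is not None:  # first source has no inlink, last has no outlink
                lines.append(f"?{name}_pk {name}:{link} ?{link} .")
    first, last = sources[0][0], sources[-1][0]
    # Prepend the user-chosen start terminus, append the user-chosen end terminus.
    lines.insert(0, f"?{first}_pk {first}:{start_terminus} ?{start_terminus} .")
    lines.append(f"?{last}_pk {last}:{end_terminus} ?{end_terminus} .")
    return lines


chain = [("DrugBankDrug", None, "DBID"), ("DrugBankTarget", "DBID", None)]
for line in chain_patterns(chain, "CID", "GeneSymbol"):
    print(line)
```

With these inputs the sketch prints four patterns: the CID terminus, the two DBID patterns that join the sources through the shared `?DBID` variable, and the GeneSymbol terminus.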

Steps 1 to 3 have been implemented



Step 4: Add the Prefix (SELECT, WHERE, FILTER)
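
One possible shape for this step, sketched in Python. Everything here is an assumption made for illustration: the `build_query` helper, the `http://example.org/` namespace IRI, and the placeholder CID value are not taken from the actual system. The sketch wraps already generated triple patterns in PREFIX declarations, a SELECT clause, a WHERE block, and an optional FILTER.

```python
from typing import Dict, List, Optional


def build_query(patterns: List[str],
                select_vars: List[str],
                prefixes: Dict[str, str],
                filter_expr: Optional[str] = None) -> str:
    """Wrap triple patterns in PREFIX / SELECT / WHERE, with an optional FILTER."""
    header = [f"PREFIX {name}: <{iri}>" for name, iri in prefixes.items()]
    select = "SELECT DISTINCT " + " ".join(f"?{v}" for v in select_vars)
    body = [f"  {p}" for p in patterns]
    if filter_expr:
        body.append(f"  FILTER ({filter_expr})")
    return "\n".join(header + [select, "WHERE {"] + body + ["}"])


patterns = [
    "?DrugBankDrug_pk DrugBankDrug:CID ?CID .",
    "?DrugBankDrug_pk DrugBankDrug:DBID ?DBID .",
]
query = build_query(
    patterns,
    select_vars=["DBID"],
    prefixes={"DrugBankDrug": "http://example.org/DrugBankDrug/"},  # hypothetical IRI
    filter_expr='STR(?CID) = "12345"',  # placeholder user input
)
print(query)
```

Selecting only the final variable keeps the result small; listing the intermediate variables in `select_vars` as well would return the whole path to the user.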

Crucial Implementation Issues
  1. More interface capability is needed for event handling
  2. Decide whether to select all relevant intermediate information and present it back to the user
  3. FILTER and COUNT are more complicated to generate