Knowledge-based Expert Systems in Chemistry

Philip Judson

(2019)

Additional Information

Abstract

There have been significant developments in the use of knowledge-based expert systems in chemistry since the first edition of this book was published in 2009. This new edition has been thoroughly revised and updated to reflect the advances.

The underlying theme of the book is still the need for computer systems that work with uncertain or qualitative data to support decision-making based on reasoned judgements. With the continuing evolution of regulations for the assessment of chemical hazards, and changes in thinking about how scientific decisions should be made, that need is ever greater. Knowledge-based expert systems are well established in chemistry, especially in relation to toxicology, and they are used routinely to support regulatory submissions. The effectiveness and continued acceptance of computer prediction depend on our ability to assess the trustworthiness of predictions and the validity of the models on which they are based.

Written by a pioneer in the field, this book provides an essential reference for anyone interested in the uses of artificial intelligence for decision making in chemistry.


The author studied chemistry at the University of Manchester before working on the synthesis of novel herbicides and fungicides for Fisons Ltd at Chesterford Park Research Station near Saffron Walden. His PhD at the University of Surrey was on chemical synthesis. He took an interest in knowledge-based computer systems and became Head of Chemical Information and Computing for Schering Agrochemicals Ltd. He was one of the founders of Lhasa Limited, a not-for-profit company specialising in knowledge-based expert systems in chemistry, including the widely used Derek, Meteor, and Zeneth systems for predicting chemical toxicity, metabolism, and chemical degradation. Although semi-retired, he continues to contribute to research and development work at Lhasa Limited in his role as Scientific Advisor and is working on a synthetic accessibility project led by scientists at the US National Institutes of Health. He developed and maintains software for chemical hazard classification and chemical safety data sheet management, Harmoneus and Prometheus, which are supplied by Hibiscus plc. He has published over eighty scientific papers, posters and book chapters. His hobbies include climbing and caving, and he has published articles about international caving expeditions in which he has taken part.

Table of Contents

Section Title Page
Cover Cover
Preface v
Contents vii
Chapter 1 Artificial Intelligence - Making Use of Reasoning 1
References 5
Chapter 2 Synthesis Planning by Computer 6
References 14
Chapter 3 Other Programs to Support Chemical Synthesis Planning 15
3.1 Programs That Are Similar to LHASA in Their Approach 15
3.1.1 SECS 15
3.1.2 PASCOP 16
3.1.3 SYNLMA 16
3.1.4 SYNCHEM and SYNCHEM2 17
3.1.5 SYNGEN 19
3.1.6 SYNSUP-MB and CAOSP 21
3.1.7 RESYN 21
3.1.8 SOS, MARSEIL, CONAN, HOLOWin and GRAAL 22
3.1.9 AIPHOS, SOPHIA and KOSP 23
3.1.10 Chiron 24
3.1.11 PSYCHO 24
3.1.12 COMPASS 25
3.1.13 Wipke and Rogers SST 25
3.1.14 SESAM 26
3.2 CICLOPS, EROS and WODCA - A Different Approach 26
3.3 PIRExS 28
3.4 COSYMA 29
3.5 Work by Wilcox and Levinson - Automated Rule Discovery 29
3.6 Predicting Reactions 31
3.6.1 CAMEO 31
3.6.2 Work by Chen and Baldi 31
3.7 What Happened to Synthesis Planning by Computer? 32
References 35
Chapter 4 International Repercussions of the Harvard LHASA Project 39
References 44
Chapter 5 Current Interest in Synthesis Planning by Computer 46
5.1 Retrosynthetic Analysis 46
5.1.1 ICSynth 46
5.1.2 ARChem, RouteDesigner and ChemPlanner 47
5.1.3 Chematica 48
5.1.4 Work by Segler, Waller and Preuss 49
5.1.5 Mining Electronic Laboratory Notebooks 50
5.1.6 RASA 51
5.1.7 Use of a Neural Network by Nam and Kim 52
5.1.8 RetroPath 52
5.2 Reducing Hazardous Impurities in Pharmaceuticals 53
5.3 Knowledge-based Systems for Synthetic Accessibility 53
5.3.1 SPROUT, HIPPO and CAESA 53
5.3.2 AllChem 54
5.3.3 RECAP 54
5.3.4 DOGS 54
5.3.5 Reactor 55
5.3.6 Work by Schürer et al. 55
5.3.7 SAVI 55
5.3.8 ROBIA 56
5.4 Other Systems for Synthetic Accessibility and Reaction Prediction 56
5.4.1 SYLVIA and Work by Boda et al. 56
5.4.2 SYNOPSIS 57
5.4.3 IADE 57
5.4.4 Using Neural Networks 58
5.4.5 Work by Fukushini et al. 59
5.4.6 Reaction Predictor 59
5.4.7 Work by Hristozov et al. 59
5.4.8 Work by Segler and Waller 59
References 60
Chapter 6 Structure Representation 64
6.1 Wiswesser Line-formula Notation 64
6.2 SMILES, SMARTS and SMIRKS 66
6.3 SYBYL Line Notation (SLN) 68
6.4 CHMTRN and PATRAN 69
6.5 ALCHEM 75
6.6 Molfiles, SDfiles and RDfiles 75
6.7 Mol2 Files 76
6.8 The Standard Molecular Data Format and Molecular Information File 77
6.9 Chemical Markup Language and CMLReact 77
6.10 CDX and CDXML 77
6.11 Molecular Query Language (MQL) 78
6.12 CSRML 79
6.13 Using Pictures 80
References 80
Chapter 7 Structure, Substructure and Superstructure Searching 84
7.1 Exact Structure Searching 84
7.1.1 Canonical SMILES Codes 85
7.1.2 Morgan Names and SEMA Names 88
7.1.3 MOLGEN-CID 92
7.1.4 The Method Described by Hendrickson and Toczko 93
7.1.5 InChI Code 94
7.1.6 CACTVS Hash Codes 95
7.2 Atom by Atom Matching 96
7.3 Substructure Searching 98
7.4 Set Reduction 100
7.5 Superstructure and Markush Structure Searching 104
7.6 Reaction Searching 105
7.7 Searching for Structures in Wikipedia 105
References 106
Chapter 8 Protons That Come and Go 108
8.1 Dealing with Tautomerism 108
8.2 Implicit and Explicit Hydrogen Atoms 111
References 115
Chapter 9 Aromaticity and Stereochemistry 116
9.1 Aromaticity 116
9.2 Stereochemistry 119
9.2.1 Tetrahedral Centres 119
9.2.2 Double Bonds 122
9.2.3 Other Kinds of Asymmetry 124
References 124
Chapter 10 DEREK - Predicting Toxicity 125
10.1 How DEREK Came About 125
10.2 The Alert-based Approach to Toxicity Prediction in DEREK 128
References 133
Chapter 11 Other Alert-based Toxicity Prediction Systems 134
11.1 TOX-MATCH and PHARM-MATCH 134
11.2 Oncologic 136
11.3 HazardExpert 138
11.4 BfR/BgVV System 139
11.5 ToxTree and Toxmatch 139
11.6 Leadscope Genetox Expert Alerts 140
11.7 Environmental Toxicity Prediction 140
References 141
Chapter 12 Rule Discovery 143
12.1 QSAR 143
12.2 TopKat 144
12.3 Multicase 145
12.4 Lazar 146
12.5 Sarah 147
12.6 Emerging Pattern Mining 147
12.7 Other Fragment-based Systems 149
12.7.1 REX 149
12.7.2 Using Atom-centred Fragments 151
12.8 Other Approaches in the Field of Toxicity Prediction 151
12.9 Discovering Reaction Rules 152
References 154
Chapter 13 The 2D-3D Debate 158
References 165
Chapter 14 Making Use of Reasoning: Derek for Windows 167
14.1 Moving on from Just Recognising Alerts in Structures 167
14.2 The Logic of Argumentation 169
14.3 Choosing Levels of Likelihood for a System Based on LA 176
14.4 Derek for Windows and Derek Nexus 178
14.5 The Derek Knowledge Editor 183
14.6 Making Improvements in the Light of Experience 187
References 192
Chapter 15 Predicting Metabolism 194
15.1 Predicting Primary Sites of Metabolism 196
15.1.1 COMPACT 196
15.1.2 MetaSite and Mass-MetaSite 197
15.1.3 SPORCalc and MetaPrint2D 197
15.1.4 SMARTCyp 197
15.1.5 FAME 198
15.2 Predicting Metabolic Trees 198
15.2.1 MetabolExpert 199
15.2.2 META 199
15.2.3 TIMES 200
15.2.4 Meteor 201
References 207
Chapter 16 Relative Reasoning 211
References 220
Chapter 17 Predicting Biodegradation 221
17.1 BESS 222
17.2 CATABOL 223
17.3 The UMBBD, PPS and Mepps 223
17.4 EnviPath 226
17.5 CRAFT 229
17.6 META 229
17.7 The Future for Prediction of Environmental Degradation 229
References 230
Chapter 18 Other Applications and Potential Applications of Knowledge-based Prediction in Chemistry 233
18.1 The Maillard Reaction 233
18.2 Recording Information about Useful Biological Activity 234
18.3 Proposing Structural Analogues for Drug Design 235
18.4 Predicting Product Degradation During Storage 235
18.5 Designing Production Synthesis Routes 236
18.6 Using Knowledge-based Systems for Teaching 237
References 238
Chapter 19 Combining Predictions 239
19.1 Introduction 239
19.2 The ICH M7 Guidelines 242
19.3 Giving Access to Multiple Models in a Single Package 243
19.3.1 The OECD (Q)SAR Toolbox 243
19.3.2 Prediction of Aquatic Toxicity by Gerrit Schüürmann’s Group 244
19.3.3 Leadscope Model Applier 245
19.3.4 eTOX and iPiE 245
19.3.5 Meteor and SMARTCyp 246
19.3.6 The NoMiracle Project - Mira 246
19.3.7 Eco-Derek 248
19.3.8 Derek and Sarah 249
19.3.9 Combining Predictions Using Dempster-Shafer Theory 249
19.4 Looking Ahead 249
References 251
Chapter 20 The Adverse Outcome Pathways Approach 253
References 257
Chapter 21 Evaluation of Knowledge-based Systems 258
21.1 The OECD (Q)SAR Guidelines 258
21.2 Defining Applicability Domain 259
21.3 Using Traditional Measures of Predictive Performance 261
21.4 A Different Way to Evaluate Predictive Performance 264
References 267
Chapter 22 Validation of Computer Predictions 269
References 272
Chapter 23 Artificial Intelligence Developments in Other Fields 273
References 274
Chapter 24 A Subjective View of the Future 276
References 278
Subject Index 279