Remove a paragraph that was moved to tip 1 in #241
Benjamin-Lee committed Oct 18, 2020
1 parent bc0cb73 commit 66167b8
Showing 1 changed file with 0 additions and 6 deletions.
content/04.baselines.md: 6 changes (0 additions, 6 deletions)
@@ -7,12 +7,6 @@ Such models include logistic regression, random forests, k-nearest neighbors, na
While performance baselines available from existing literature can also serve as helpful guides, an implementation of a simpler model that uses the same software framework as planned for deep learning can greatly help with assessing the correctness of data processing steps, performance evaluation pipelines, resource requirement estimates, and computational performance estimates.
Furthermore, in some cases, it can even be useful to combine simpler baseline models with deep neural networks, as such hybrid models can improve generalization performance, model interpretability, and confidence estimation [@arxiv:1803.04765; @arxiv:1805.11783].

-However, it is important to gauge the relative effectiveness of baseline and deep learning models by comparing them with established tools (for example, bioinformatics pipelines or image analysis workflows), as conventional methods (machine learning-based or not) can potentially perform equivalently to or better than newer machine or deep learning methods.
-While this seems unintuitive, it can in fact be the case when the available data are of limited size and/or atypical in nature.
-For example, Rajkomar et al. [@doi:10.1038/s41746-018-0029-1] found that simpler baseline models achieved performance comparable with that of deep learning in a number of clinical prediction tasks using electronic health records.
-Another example is provided by Koutsoukas et al., who benchmarked several traditional machine learning approaches against deep neural networks for modeling bioactivity data on moderately sized datasets [@doi:10.1186/s13321-017-0226-y].
-The researchers found that while well-tuned deep learning approaches generally outperform conventional classifiers, simpler conventional methods such as Naive Bayes classification tend to outperform deep learning as the noise in the dataset increases.
-
Another potential pitfall arises from comparing the performance of baseline conventional models trained with default settings with the performance of deep learning models that have undergone rigorous tuning and optimization.
Since conventional off-the-shelf machine learning algorithms (for example, support vector machines and random forests) are also likely to benefit from hyperparameter tuning, such incongruity prevents the comparison of equally optimized models and can lead to false conclusions about model efficacy.
Hu and Greene [@doi:10.1142/9789813279827_0033] discuss this under the umbrella of what they call the "Continental Breakfast Included" effect, and they describe how the unequal tuning of hyperparameters across different learning algorithms can especially skew evaluation when the performance of an algorithm varies substantially with modest changes to its hyperparameters.
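
As a concrete illustration of the baseline advice retained above: a minimal sketch of a simple-model baseline wired through the same split and scoring code a deep learning model would later reuse. The data, logistic regression, and AUROC metric here are illustrative assumptions, not prescribed by the manuscript.

```python
# Minimal baseline sketch (illustrative): fit a simple classifier
# through the same data split and evaluation code that the deep
# learning model will later share, so the two are directly comparable.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))    # placeholder feature matrix
y = rng.integers(0, 2, size=500)  # placeholder binary labels

# Reuse this exact split for the deep model later, so baseline and
# network see identical training and test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)
auroc = roc_auc_score(y_test, baseline.predict_proba(X_test)[:, 1])
print(f"logistic regression baseline AUROC: {auroc:.3f}")
```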
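The tuning pitfall in the final paragraph can be sketched the same way: give the conventional baseline and the neural network an identical search budget and cross-validation scheme before comparing them. The models, grids, and budget below are arbitrary illustrative choices, and the snippet reuses X_train and y_train from the baseline sketch above.

```python
# Equal-tuning sketch (illustrative): run the same budgeted search
# over both a conventional model and a neural network, so neither
# benefits from privileged hyperparameter optimization.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPClassifier

candidates = {
    "random forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [100, 300, 1000], "max_depth": [None, 5, 10]},
    ),
    "neural network": (
        MLPClassifier(max_iter=2000, random_state=0),
        {"hidden_layer_sizes": [(32,), (64, 64)], "alpha": [1e-4, 1e-3, 1e-2]},
    ),
}

for name, (model, grid) in candidates.items():
    # Identical budget (n_iter) and CV scheme for every candidate model.
    search = RandomizedSearchCV(
        model, grid, n_iter=6, cv=5, scoring="roc_auc", random_state=0
    )
    search.fit(X_train, y_train)  # same training split as the baseline above
    print(f"{name}: tuned CV AUROC = {search.best_score_:.3f}")
```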
