diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 07a502ff8..f0d6f37ec 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -24,7 +24,7 @@ All commits in a pull request will be squashed to a single commit with the origi
## Uploading datasets
-* Only datasets that allowed for public use for all purposes (including redistribution) can be uploaded to this repository.
+* Only datasets that are allowed for public use for all purposes (including redistribution) can be uploaded to this repository.
* To avoid the repository growing too large that it's not convenient to work with, the limit for an uploaded dataset file is 5 MB. Everything that is bigger should be downloaded programmatically on the first run of the app.
-* All datasets should be stored in [datasets](https://github.com/dotnet/machinelearning-samples/tree/master/datasets) folder to allow reusing them by other examples.
+* All datasets should be stored in the [datasets](https://github.com/dotnet/machinelearning-samples/tree/main/datasets) folder so that other samples can reuse them.
* If you are uploading a dataset, please add a section in datasets [README](datasets/README.md) file describing the original source and license.
diff --git a/README.md b/README.md
index 8596a48ea..1ad41b337 100644
--- a/README.md
+++ b/README.md
@@ -2,9 +2,9 @@
# ML.NET Samples
-[ML.NET](https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet) is a cross-platform open-source machine learning framework that makes machine learning accessible to .NET developers.
+[ML.NET](https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet) is a cross-platform open-source machine learning framework that makes machine learning accessible to .NET developers.
-In this GitHub repo, we provide samples which will help you get started with ML.NET and how to infuse ML into existing and new .NET apps.
+In this GitHub repo, we provide samples that will help you get started with ML.NET and show you how to infuse ML into existing and new .NET apps.
**Note:** Please open issues related to [ML.NET](https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet) framework in the [Machine Learning repository](https://github.com/dotnet/machinelearning/issues). Please create the issue in this repo only if you face issues with the samples in this repository.
@@ -16,7 +16,7 @@ There are two types of samples/apps in the repo:
The official ML.NET samples are divided in multiple categories depending on the scenario and machine learning problem/task, accessible through the following tables:
-
+
Binary classification |
@@ -24,12 +24,12 @@ The official ML.NET samples are divided in multiple categories depending on the

 Sentiment Analysis C# F# |

 Spam Detection C# F# |

 Credit Card Fraud Detection (Binary Classification) C# F# |
-
+

 Heart Disease Prediction C# |
|
|
-
+
Multi-class classification |
@@ -70,7 +70,7 @@ The official ML.NET samples are divided in multiple categories depending on the
C#

 Power Anomaly Detection C# |

 Credit Card Fraud Detection (Anomaly Detection) C# |
-
+
Clustering |
@@ -97,12 +97,12 @@ The official ML.NET samples are divided in multiple categories depending on the
 Image Classification Predictions (Pretrained TensorFlow model scoring)
C# F#    
C# |
 Image Classification Training (TensorFlow Featurizer Estimator)
C# F# |
-
+
 Object Detection (ONNX model scoring)
C#
C# |
-
+
@@ -131,7 +131,7 @@ The official ML.NET samples are divided in multiple categories depending on the
# Automate ML.NET models generation (Preview state)
-The previous samples show you how to use the ML.NET API 1.0 (GA since May 2019).
+The previous samples show you how to use the ML.NET API 1.0 (GA since May 2019).
However, we're also working on simplifying ML.NET usage with additional technologies that automate the creation of the model for you so you don't need to write the code by yourself to train a model, you simply need to provide your datasets. The "best" model and the code for running it will be generated for you.
@@ -166,7 +166,7 @@ ML.NET AutoML API is basically a set of libraries packaged as a NuGet package yo
# Additional ML.NET Community Samples
In addition to the ML.NET samples provided by Microsoft, we're also highlighting samples created by the community showcased in this separated page:
-[ML.NET Community Samples](https://github.com/dotnet/machinelearning-samples/blob/master/docs/COMMUNITY-SAMPLES.md)
+[ML.NET Community Samples](https://github.com/dotnet/machinelearning-samples/blob/main/docs/COMMUNITY-SAMPLES.md)
Those Community Samples are not maintained by Microsoft but by their owners.
If you have created any cool ML.NET sample, please, add its info into this [REQUEST issue](https://github.com/dotnet/machinelearning-samples/issues/86) and we'll publish its information in the mentioned page, eventually.
diff --git a/docs/DATASETS.md b/docs/DATASETS.md
index 089dc8284..b12548ee0 100644
--- a/docs/DATASETS.md
+++ b/docs/DATASETS.md
@@ -7,21 +7,21 @@ The datasets are provided under the original terms that .NET FOUNDATION received
| Dataset name | Original Dataset | Processed Dataset | Sample using the Dataset | Approval Status |
|-----------------|------------------|--------------------|----------------------------------------|--------|
-| Wikipedia Detox | [Original](https://meta.wikimedia.org/wiki/Research:Detox/Data_Release) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/BinaryClassification_SentimentAnalysis/datasets) | [BinaryClassification_SentimentAnalysis](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/BinaryClassification_SentimentAnalysis) | APPROVED |
-| Credit Card Fraud Detection | [Original](https://www.kaggle.com/mlg-ulb/creditcardfraud) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/BinaryClassification_CreditCardFraudDetection/CreditCardFraudDetection.Trainer/assets/input) | [BinaryClassification_CreditCardFraudDetection](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/BinaryClassification_CreditCardFraudDetection) | APPROVED |
-| Corefx Issues | [Original](https://github.com/dotnet/corefx/issues) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/end-to-end-apps/MulticlassClassification-GitHubLabeler/GitHubLabeler/Data) | [github-labeler](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/end-to-end-apps/github-labeler) | PENDING |
-| Iris flower data set | [Original](https://en.wikipedia.org/wiki/Iris_flower_data_set#Use_of_the_data_set) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/master/datasets) | [MulticlassClassification_Iris](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/MulticlassClassification_Iris) | APPROVED |
-| Iris flower data set | [Original](https://en.wikipedia.org/wiki/Iris_flower_data_set#Use_of_the_data_set) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/master/datasets) | [Clustering_Iris](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/Clustering_Iris) | APPROVED |
-| TLC Trip Record Data | [Original](http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/master/datasets) | [Regression_TaxiFarePrediction](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/Regression_TaxiFarePrediction) | APPROVED |
-| Online Retail Data Set | [Original](http://archive.ics.uci.edu/ml/datasets/online+retail) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/end-to-end-apps/eShopDashboardML/src/eShopForecastModelsTrainer/data) | [eShopDashboardML](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/end-to-end-apps/eShopDashboardML) | APPROVED |
+| Wikipedia Detox | [Original](https://meta.wikimedia.org/wiki/Research:Detox/Data_Release) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/BinaryClassification_SentimentAnalysis/datasets) | [BinaryClassification_SentimentAnalysis](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/BinaryClassification_SentimentAnalysis) | APPROVED |
+| Credit Card Fraud Detection | [Original](https://www.kaggle.com/mlg-ulb/creditcardfraud) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/BinaryClassification_CreditCardFraudDetection/CreditCardFraudDetection.Trainer/assets/input) | [BinaryClassification_CreditCardFraudDetection](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/BinaryClassification_CreditCardFraudDetection) | APPROVED |
+| Corefx Issues | [Original](https://github.com/dotnet/corefx/issues) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/end-to-end-apps/MulticlassClassification-GitHubLabeler/GitHubLabeler/Data) | [github-labeler](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/end-to-end-apps/github-labeler) | PENDING |
+| Iris flower data set | [Original](https://en.wikipedia.org/wiki/Iris_flower_data_set#Use_of_the_data_set) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/main/datasets) | [MulticlassClassification_Iris](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/MulticlassClassification_Iris) | APPROVED |
+| Iris flower data set | [Original](https://en.wikipedia.org/wiki/Iris_flower_data_set#Use_of_the_data_set) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/main/datasets) | [Clustering_Iris](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/Clustering_Iris) | APPROVED |
+| TLC Trip Record Data | [Original](http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/main/datasets) | [Regression_TaxiFarePrediction](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/Regression_TaxiFarePrediction) | APPROVED |
+| Online Retail Data Set | [Original](http://archive.ics.uci.edu/ml/datasets/online+retail) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/end-to-end-apps/eShopDashboardML/src/eShopForecastModelsTrainer/data) | [eShopDashboardML](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/end-to-end-apps/eShopDashboardML) | APPROVED |
| Online Retail Data Set | [Original](http://archive.ics.uci.edu/ml/datasets/online+retail) | [Processed](http://TBD) | [Product recommender](http://TBD) | APPROVED |
-| Bike Sharing Dataset | [Original](https://archive.ics.uci.edu/ml/datasets/bike+sharing+dataset) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/Regression_BikeSharingDemand/BikeSharingDemandConsoleApp/data) | [Regression_BikeSharingDemand](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/Regression_BikeSharingDemand) | APPROVED |
+| Bike Sharing Dataset | [Original](https://archive.ics.uci.edu/ml/datasets/bike+sharing+dataset) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/Regression_BikeSharingDemand/BikeSharingDemandConsoleApp/data) | [Regression_BikeSharingDemand](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/Regression_BikeSharingDemand) | APPROVED |
| TBD Movies-Dataset | [Original](http://TBD) | [Processed](http://TBD) | [Movie recommender](http://TBD) | PENDING |
-| WineKMC | [Original](https://media.wiley.com/product_ancillary/6X/11186614/DOWNLOAD/ch02.zip) [Related Download 1](http://blog.yhat.com/static/misc/data/WineKMC.xlsx ) [Related Info](http://blog.yhat.com/posts/customer-segmentation-using-python.html) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/Clustering_CustomerSegmentation/CustomerSegmentation.Train/assets/inputs) | [Clustering_CustomerSegmentation](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/Clustering_CustomerSegmentation) | APPROVED |
+| WineKMC | [Original](https://media.wiley.com/product_ancillary/6X/11186614/DOWNLOAD/ch02.zip) [Related Download 1](http://blog.yhat.com/static/misc/data/WineKMC.xlsx) [Related Info](http://blog.yhat.com/posts/customer-segmentation-using-python.html) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/Clustering_CustomerSegmentation/CustomerSegmentation.Train/assets/inputs) | [Clustering_CustomerSegmentation](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/Clustering_CustomerSegmentation) | APPROVED |
| WikiMedia photos | [Original](https://commons.wikimedia.org/wiki/Category:Images) | [Processed](https://github.com/CESARDELATORRE/MLNETTensorFlowScoringv06API/tree/features/dynamicApi/src/ImageClassification/assets/inputs/images) | [MLNETTensorFlowScoringv06API ](https://github.com/CESARDELATORRE/MLNETTensorFlowScoringv06API) | APPROVED |
| SMS Spam Collection Data Set | [Original](https://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection) | [Processed](http://TBD) | [Spam Filter TBD](http://TBD) | PENDING until de-identify, cleaned-up |
-| Heart Disease Data Set | [Original](https://archive.ics.uci.edu/ml/datasets/Heart+Disease) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/BinaryClassification_HeartDiseaseDetection/HeartDiseaseDetection/Data) | [Heart Disease Detection](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/BinaryClassification_HeartDiseaseDetection) | Approved |
-| Product Sales Data Set | sample data created | [Processed](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/AnomalyDetection_Sales/SpikeDetection/Data) | [Sales Spike Detection](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/AnomalyDetection_Sales) | Approved |
+| Heart Disease Data Set | [Original](https://archive.ics.uci.edu/ml/datasets/Heart+Disease) | [Processed](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/BinaryClassification_HeartDiseaseDetection/HeartDiseaseDetection/Data) | [Heart Disease Detection](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/BinaryClassification_HeartDiseaseDetection) | APPROVED |
+| Product Sales Data Set | sample data created | [Processed](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/AnomalyDetection_Sales/SpikeDetection/Data) | [Sales Spike Detection](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/AnomalyDetection_Sales) | APPROVED |
The datasets are provided under the original terms that Microsoft received such datasets. See below for more information about each dataset.
@@ -78,7 +78,7 @@ The datasets are provided under the original terms that Microsoft received such
>Redistributing the "Processed Dataset" datasets with attribution:
>
> Original source: https://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml
->
+>
> The dataset is provided under terms provided by City of New York: https://opendata.cityofnewyork.us/overview/#termsofuse.
>
>By accessing datasets and feeds available through NYC Open Data, the user agrees to all of the Terms of Use of NYC.gov as well as the Privacy Policy for NYC.gov. The user also agrees to any additional terms of use defined by the agencies, bureaus, and offices providing data. Public data sets made available on NYC Open Data are provided for informational purposes. The City does not warranty the completeness, accuracy, content, or fitness for any particular purpose or use of any public data set made available on NYC Open Data, nor are any such warranties to be implied or inferred with respect to the public data sets furnished therein.
@@ -143,13 +143,13 @@ The datasets are provided under the original terms that Microsoft received such
>
>institution = "University of California, Irvine, School of Information and Computer Sciences" }
>
->A few data sets have additional citation requests. These requests can be found on the bottom of each data set's web page.
+>A few data sets have additional citation requests. These requests can be found on the bottom of each data set's web page.
>
>https://archive.ics.uci.edu/ml/datasets/bike+sharing+dataset, in turn, contains the following notice:
>
>Citation Request:
>
->Fanaee-T, Hadi, and Gama, Joao, 'Event labeling combining ensemble detectors and background knowledge', Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin Heidelberg,
+>Fanaee-T, Hadi, and Gama, Joao, 'Event labeling combining ensemble detectors and background knowledge', Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin Heidelberg,
### WineKMC dataset
@@ -163,7 +163,7 @@ The datasets are provided under the original terms that Microsoft received such
> Copyright © 2000-2018 by John Wiley & Sons, Inc., or related companies. All rights reserved.
>
> https://media.wiley.com/product_ancillary/6X/11186614/DOWNLOAD/ch02.zip
->
+>
### WikiMedia photos
@@ -173,7 +173,7 @@ The datasets are provided under the original terms that Microsoft received such
> Original source: https://commons.wikimedia.org/wiki/Category:Images
>
> Specific licenses per images used in samples
-> https://github.com/dotnet/machinelearning-samples/blob/master/samples/csharp/getting-started/DeepLearning_ImageClassification_TensorFlow/ImageClassification/assets/inputs/images/wikimedia.md
+> https://github.com/dotnet/machinelearning-samples/blob/main/samples/csharp/getting-started/DeepLearning_ImageClassification_TensorFlow/ImageClassification/assets/inputs/images/wikimedia.md
>
@@ -202,14 +202,14 @@ The datasets are provided under the original terms that Microsoft received such
>
>Relevant Papers:
>
->We offer a comprehensive study of this corpus in the following paper. This work presents a number of statistics, studies and baseline results for several machine learning methods.
+>We offer a comprehensive study of this corpus in the following paper. This work presents a number of statistics, studies and baseline results for several machine learning methods.
>
>Almeida, T.A., Gómez Hidalgo, J.M., Yamakami, A. Contributions to the Study of SMS Spam Filtering: New Collection and Results. Proceedings of the 2011 ACM Symposium on Document Engineering (DOCENG'11), Mountain View, CA, USA, 2011.
>
>Citation Request:
>
->If you find this dataset useful, you make a reference to our paper and the web page.[http://www.dt.fee.unicamp.br/~tiago/smsspamcollection] in your papers, research, etc;
-Send us a message to talmeida ufscar.br or jmgomezh yahoo.es in case you make use of the corpus.
+>If you find this dataset useful, you make a reference to our paper and the web page.[http://www.dt.fee.unicamp.br/~tiago/smsspamcollection] in your papers, research, etc;
+>Send us a message to talmeida ufscar.br or jmgomezh yahoo.es in case you make use of the corpus.
>
>We would like to thank Min-Yen Kan and his team for making the NUS SMS Corpus available.
>
@@ -225,7 +225,7 @@ Send us a message to talmeida ufscar.br or jmgomezh yahoo.es in case you make us
>
>If you publish material based on databases obtained from this repository, then, in your acknowledgements, please note the assistance you received by using this repository. This will help others to obtain the same data sets and replicate your experiments. We suggest the following pseudo-APA reference format for referring to this repository:
>
-> Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
+> Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
>
> Here is a BiBTeX citation as well:
>
@@ -239,7 +239,7 @@ Send us a message to talmeida ufscar.br or jmgomezh yahoo.es in case you make us
>
> url = "http://archive.ics.uci.edu/ml",
>
-> institution = "University of California, Irvine, School of Information and Computer Sciences" }
+> institution = "University of California, Irvine, School of Information and Computer Sciences" }
>
>A few data sets have additional citation requests. These requests can be found on the bottom of each data set's web page.
>
@@ -247,11 +247,11 @@ Send us a message to talmeida ufscar.br or jmgomezh yahoo.es in case you make us
>
>Citation Request:
>
->The authors of the databases have requested that any publications resulting from the use of the data include the names of the principal investigator responsible for the data collection at each institution. They would be:
+>The authors of the databases have requested that any publications resulting from the use of the data include the names of the principal investigator responsible for the data collection at each institution. They would be:
>
->1. Hungarian Institute of Cardiology. Budapest: Andras Janosi, M.D.
->2. University Hospital, Zurich, Switzerland: William Steinbrunn, M.D.
->3. University Hospital, Basel, Switzerland: Matthias Pfisterer, M.D.
+>1. Hungarian Institute of Cardiology. Budapest: Andras Janosi, M.D.
+>2. University Hospital, Zurich, Switzerland: William Steinbrunn, M.D.
+>3. University Hospital, Basel, Switzerland: Matthias Pfisterer, M.D.
>4. V.A. Medical Center, Long Beach and Cleveland Clinic Foundation: Robert Detrano, M.D., Ph.D.
>
@@ -270,7 +270,7 @@ Send us a message to talmeida ufscar.br or jmgomezh yahoo.es in case you make us
>| ... | .... |
>
>
->The Product Sales dataset is based on the dataset “Shampoo Sales Over a Three Year Period” originally sourced from DataMarket and >provided by Time Series Data Library (TSDL), created by Rob Hyndman.
+>The Product Sales dataset is based on the dataset “Shampoo Sales Over a Three Year Period”, originally sourced from DataMarket and provided by the Time Series Data Library (TSDL), created by Rob Hyndman.
>
>“Shampoo Sales Over a Three Year Period” Dataset Licensed Under the DataMarket Default Open License:
>
diff --git a/docs/Samples-Backlog.md b/docs/Samples-Backlog.md
index 6c68c89a2..eefa5bc67 100644
--- a/docs/Samples-Backlog.md
+++ b/docs/Samples-Backlog.md
@@ -19,7 +19,7 @@ Refactoring of Samples: Here are the list of backlog items created on MachineLea
| 15 | Add a link in ReadMe file for the LargeDataset sample once icon is finalized | p1 | done | Prathyusha Korrapati(v-prkor)
| 16 | Use latest version of Inception 5h model in TensorFlow samples | p2 | | Prathyusha Korrapati(v-prkor)
| 17 | Create Image Classification/Clustering Sample w/ DnnImageFeaturizerTransform(check this issue https://github.com/dotnet/machinelearning-samples/issues/225) | p2 | | Prathyusha Korrapati(v-prkor)
-| 18 | Add a sample on Timeseries for sales Forecast to use a trainer based on Time Series instead of Regression - check this SAMPLE https://github.com/dotnet/machinelearning/blob/master/docs/samples/Microsoft.ML.Samples/Dynamic/Transforms/TimeSeries/Forecasting.cs | p0 | | TBD
+| 18 | Add a sample on Timeseries for sales Forecast to use a trainer based on Time Series instead of Regression - check this SAMPLE https://github.com/dotnet/machinelearning/blob/main/docs/samples/Microsoft.ML.Samples/Dynamic/Transforms/TimeSeries/Forecasting.cs | p0 | | TBD
| 19 | Migrate all the samples to ML.Net 1.1 | p0 | done | Prathyusha Korrapati(v-prkor)
| 20 | Create an UWP app to detection objects in camera -->Check the performance of ML.Net with Live streaming of images -->In future convert this UWP app to WPF as UWP does not support in production env| p0 | | Prathyusha Korrapati(v-prkor)
| 22 | Custom training the model with your own images instead of pretrained model and pretrained images: By using Cusrom Vision pretrained model you can train your own images and create a new model either a. TensorFlow model .pb or b. Onnx Model -->Use this model to predict the images. | p2 | | Prathyusha Korrapati(v-prkor)
diff --git a/samples/CLI/MulticlassClassification_CLI/README.md b/samples/CLI/MulticlassClassification_CLI/README.md
index d1575f0a8..1cc8b576a 100644
--- a/samples/CLI/MulticlassClassification_CLI/README.md
+++ b/samples/CLI/MulticlassClassification_CLI/README.md
@@ -1,14 +1,14 @@
# Auto-generate model training and C# code for a Multi-class Classification task (GitHub Issues classification scenario)
-In this example you are going to automatically train/create a model and related C# code by simply providing a dataset (The GitHub .NET Framework issues dataset in this case) to the ML.NET CLI tool.
+In this example you are going to automatically train a model and generate the related C# code simply by providing a dataset (the GitHub .NET Framework issues dataset, in this case) to the ML.NET CLI tool.
+
+*Note:* This CLI example is related to the [GitHub issues classification ML.NET sample](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/end-to-end-apps/MulticlassClassification-GitHubLabeler) but in this case the C# code is auto-generated by the CLI tool. You don't need to start coding in C# from scratch.
-*Note:* This CLI example is related to the [GitHub issues classification ML.NET sample](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/end-to-end-apps/MulticlassClassification-GitHubLabeler) but in this case the C# code is auto-generated by the CLI tool. You don't need to start coding in C# from scratch.
-
## What is the ML.NET CLI (Command-line Interface)
*The ML.NET **CLI** (command-line interface) is a tool you run on any command-prompt (Windows, Mac or Linux) for generating good quality ML.NET models and C# source code based on training datasets you provide.*
-The ML.NET CLI is part of ML.NET and its main purpose is to "democratize" ML.NET for .NET developers when learning ML.NET so it is very simple to generate a good quality ML.NET model (serialized model .zip file) plus the sample C# code to run/score that model. In addition, the C# code to create/train that model is also generated for you so you can research what algorithm and settings it is using for that generated "best model".
+The ML.NET CLI is part of ML.NET. Its main purpose is to "democratize" ML.NET for .NET developers who are learning it, making it simple to generate a good-quality ML.NET model (a serialized model .zip file) plus the sample C# code to run/score that model. In addition, the C# code used to create/train that model is also generated for you, so you can investigate which algorithm and settings the generated "best model" uses.
## Run the CLI command to generate the ML model and C# code for the GitHub .NET Framework issues dataset
@@ -31,7 +31,7 @@ You will get a similar command execution like the following:
This process is performing multiple training explorations trying multiple trainers/algorithms and multiple hyper-parameters with different combinations of configuration per each model.
-**IMPORTANT:** Note that in this case you are exploring multiple trainings with the CLI looking for "best models" only for 5 minutes. That's enough when you are just learning the CLI usage and the generated C# code for the model. But when trying to optimize the model to achieve high quality you might need to run the CLI 'auto-train' command for many more minutes or even hours depending on the size of the dataset.
+**IMPORTANT:** Note that in this case the CLI explores multiple trainings looking for the "best model" for only 5 minutes. That's enough while you are just learning how to use the CLI and the generated C# code, but when trying to optimize the model for high quality you might need to run the CLI 'auto-train' command for many more minutes, or even hours, depending on the size of the dataset.
As a rule of thumb, a high quality model might need hundreds of iterations (hundreds of models explored automatically performed by the CLI).
@@ -43,7 +43,7 @@ For undestanding the **'quality metrics'** read this doc: [Model evaluation metr
That command **generates** the following assets in a **new folder** (if no --name parameter was specified, its name is **'SampleMulticlassClassification'**):
-- A serialized **"best model"** (MLModel.zip) ready to use.
+- A serialized **"best model"** (MLModel.zip) ready to use.
- Sample **C# code** to **run/score** that generated model (To make predictions in your end-user apps with that model).
- Sample **C# code** with the **training code** used to generate that model (For learning purposes or direct training with the API).
@@ -51,9 +51,9 @@ The first two assets (.ZIP file model and C# code to run that model) can directl
The third asset, the training code, shows you what ML.NET API code was used by the CLI to train the generated model, so you can investigate what specific trainer/algorithm and hyper-paramenters were selected by the CLI.
-Go ahead and explore that generated C# projects code and compare it with the [GitHub issues classification ML.NET sample](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/end-to-end-apps/MulticlassClassification-GitHubLabeler) in this repo. The accuracy and performance coming from the model generated by the CLI should be better than the sample in the repo which has simpler ML.NET code with no additional hyper-parameters, etc.
+Go ahead and explore the generated C# projects' code and compare it with the [GitHub issues classification ML.NET sample](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/end-to-end-apps/MulticlassClassification-GitHubLabeler) in this repo. The accuracy and performance of the model generated by the CLI should be better than those of the sample in the repo, which has simpler ML.NET code with no additional hyper-parameters.
-For instance, the configuration for one of the trainers used in the [GitHub issues classification ML.NET sample](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/end-to-end-apps/MulticlassClassification-GitHubLabeler) (`SdcaMaximumEntropy`) is simplified for making easier to learn ML.NET (but might not be the most optimal model), so it is like the following code, with no hyper-parameters:
+For instance, the configuration for one of the trainers used in the [GitHub issues classification ML.NET sample](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/end-to-end-apps/MulticlassClassification-GitHubLabeler) (`SdcaMaximumEntropy`) is simplified to make ML.NET easier to learn (so it might not be the most optimal model); it looks like the following code, with no hyper-parameters:
```csharp
@@ -64,21 +64,21 @@ On the other hand, in **1 hour** exploration time with the CLI, the selected alg
```csharp
var trainer = mlContext.MulticlassClassification.Trainers.LightGbm(new LightGbmMulticlassTrainer.Options()
- { NumberOfIterations = 150,
- LearningRate = 0.1254156f,
- NumberOfLeaves = 9,
- MinimumExampleCountPerLeaf = 20,
- UseCategoricalSplit = false,
- HandleMissingValue = false,
- MinimumExampleCountPerGroup = 100,
- MaximumCategoricalSplitPointCount = 64,
- CategoricalSmoothing = 20,
- L2CategoricalRegularization = 0.1,
- UseSoftmax = true,
- Booster = new GradientBooster.Options()
+ { NumberOfIterations = 150,
+ LearningRate = 0.1254156f,
+ NumberOfLeaves = 9,
+ MinimumExampleCountPerLeaf = 20,
+ UseCategoricalSplit = false,
+ HandleMissingValue = false,
+ MinimumExampleCountPerGroup = 100,
+ MaximumCategoricalSplitPointCount = 64,
+ CategoricalSmoothing = 20,
+ L2CategoricalRegularization = 0.1,
+ UseSoftmax = true,
+ Booster = new GradientBooster.Options()
{ L2Regularization = 0.5,
- L1Regularization = 1 },
- LabelColumnName = "Area",
+ L1Regularization = 1 },
+ LabelColumnName = "Area",
FeatureColumnName = "Features" })
```
@@ -86,7 +86,7 @@ If you run the CLI for longer time exploring additional algorithms/trainers, the
Finding those hyper-parameters by yourself could be a very long and tedious trial process. With the CLI and AutoML this is very much simplified for you.
-# Next steps: Use your own dataset for creating models for your own scenarios
+# Next steps: Use your own dataset to create models for your own scenarios
You can generate those assets explained above from your own datasets without coding by yourself, so it also improves your productivity even if you already know ML.NET. Try your own dataset with the CLI!
diff --git a/samples/csharp/common/ConsoleHelper.cs b/samples/csharp/common/ConsoleHelper.cs
index 518670d67..6326f567f 100644
--- a/samples/csharp/common/ConsoleHelper.cs
+++ b/samples/csharp/common/ConsoleHelper.cs
@@ -79,7 +79,7 @@ public static void PrintMultiClassClassificationMetrics(string name, MulticlassC
Console.WriteLine($" LogLoss for class 3 = {metrics.PerClassLogLoss[2]:0.####}, the closer to 0, the better");
Console.WriteLine($"************************************************************");
}
-
+
public static void PrintRegressionFoldsAverageMetrics(string algorithmName, IReadOnlyList<TrainCatalogBase.CrossValidationResult<RegressionMetrics>> crossValidationResults)
{
var L1 = crossValidationResults.Select(r => r.Metrics.MeanAbsoluteError);
@@ -187,11 +187,11 @@ public static void PeekDataViewInConsole(MLContext mlContext, IDataView dataView
string msg = string.Format("Peek data in DataView: Showing {0} rows with the columns", numberOfRows.ToString());
ConsoleWriteHeader(msg);
- //https://github.com/dotnet/machinelearning/blob/master/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
+ //https://github.com/dotnet/machinelearning/blob/main/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
var transformer = pipeline.Fit(dataView);
var transformedData = transformer.Transform(dataView);
- // 'transformedData' is a 'promise' of data, lazy-loading. call Preview
+ // 'transformedData' is a 'promise' of data, lazy-loading. call Preview
//and iterate through the returned collection from preview.
var preViewTransformedData = transformedData.Preview(maxRows: numberOfRows);
@@ -230,7 +230,7 @@ public static void PeekVectorColumnDataInConsole(MLContext mlContext, string col
String concatColumn = String.Empty;
foreach (float f in row)
{
- concatColumn += f.ToString();
+ concatColumn += f.ToString();
}
Console.WriteLine();
diff --git a/samples/csharp/end-to-end-apps/DeepLearning_ImageClassification_TF/README.md b/samples/csharp/end-to-end-apps/DeepLearning_ImageClassification_TF/README.md
index f99aa3f0c..b9759d98d 100644
--- a/samples/csharp/end-to-end-apps/DeepLearning_ImageClassification_TF/README.md
+++ b/samples/csharp/end-to-end-apps/DeepLearning_ImageClassification_TF/README.md
@@ -6,7 +6,7 @@
## Problem
-The problem is how to run/score a TensorFlow model in a web app/service while using in-memory images.
+The problem is how to run/score a TensorFlow model in a web app/service while using in-memory images.
## Solution
The model (`model.pb`) is trained using TensorFlow as disscussed in the blogpost [Run with ML.NET C# code a TensorFlow model exported from Azure Cognitive Services Custom Vision](https://devblogs.microsoft.com/cesardelatorre/run-with-ml-net-c-code-a-tensorflow-model-exported-from-azure-cognitive-services-custom-vision/).
@@ -16,7 +16,7 @@ See the below architecture that shows how to run/score TensorFlow model in ASP.N

-The difference between the [getting started sample](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/DeepLearning_ImageClassification_TensorFlow) and this end-to-end sample is that the images are loaded from **file** in getting started sample whereas the images are loaded from **in-memory** in this end-to-end sample.
+The difference between the [getting started sample](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/DeepLearning_ImageClassification_TensorFlow) and this end-to-end sample is that the images are loaded from a **file** in the getting started sample, whereas they are loaded **in-memory** in this end-to-end sample.
**Note:** this sample is trained using Custom images and it predicts the only specific images that are in [TestImages](./TestImages) Folder.
diff --git a/samples/csharp/end-to-end-apps/Model-Explainability/TaxiFarePrediction.Explainability/ConsoleHelper.cs b/samples/csharp/end-to-end-apps/Model-Explainability/TaxiFarePrediction.Explainability/ConsoleHelper.cs
index 990d2b468..0acb7d800 100644
--- a/samples/csharp/end-to-end-apps/Model-Explainability/TaxiFarePrediction.Explainability/ConsoleHelper.cs
+++ b/samples/csharp/end-to-end-apps/Model-Explainability/TaxiFarePrediction.Explainability/ConsoleHelper.cs
@@ -68,7 +68,7 @@ public static void PrintMultiClassClassificationMetrics(string name, MulticlassC
Console.WriteLine($" LogLoss for class 3 = {metrics.PerClassLogLoss[2]:0.####}, the closer to 0, the better");
Console.WriteLine($"************************************************************");
}
-
+
public static void PrintRegressionFoldsAverageMetrics(string algorithmName, IReadOnlyList<TrainCatalogBase.CrossValidationResult<RegressionMetrics>> crossValidationResults)
{
var L1 = crossValidationResults.Select(r => r.Metrics.MeanAbsoluteError);
@@ -174,11 +174,11 @@ public static void PeekDataViewInConsole(MLContext mlContext, IDataView dataView
string msg = string.Format("Peek data in DataView: Showing {0} rows with the columns", numberOfRows.ToString());
ConsoleWriteHeader(msg);
- //https://github.com/dotnet/machinelearning/blob/master/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
+ //https://github.com/dotnet/machinelearning/blob/main/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
var transformer = pipeline.Fit(dataView);
var transformedData = transformer.Transform(dataView);
- // 'transformedData' is a 'promise' of data, lazy-loading. call Preview
+ // 'transformedData' is a 'promise' of data, lazy-loading. call Preview
//and iterate through the returned collection from preview.
var preViewTransformedData = transformedData.Preview(maxRows: numberOfRows);
@@ -194,7 +194,7 @@ public static void PeekDataViewInConsole(MLContext mlContext, IDataView dataView
Console.WriteLine(lineToPrint + "\n");
}
}
-
+
public static List<float[]> PeekVectorColumnDataInConsole(MLContext mlContext, string columnName, IDataView dataView, IEstimator<ITransformer> pipeline, int numberOfRows = 4)
{
string msg = string.Format("Peek data in DataView: : Show {0} rows with just the '{1}' column", numberOfRows, columnName );
@@ -212,7 +212,7 @@ public static List PeekVectorColumnDataInConsole(MLContext mlContext, s
String concatColumn = String.Empty;
foreach (float f in row)
{
- concatColumn += f.ToString();
+ concatColumn += f.ToString();
}
Console.WriteLine(concatColumn);
});
diff --git a/samples/csharp/end-to-end-apps/Model-Explainability/TaxiFarePrediction/TaxiFarePredictionConsoleApp/ConsoleHelper.cs b/samples/csharp/end-to-end-apps/Model-Explainability/TaxiFarePrediction/TaxiFarePredictionConsoleApp/ConsoleHelper.cs
index 990d2b468..0acb7d800 100644
--- a/samples/csharp/end-to-end-apps/Model-Explainability/TaxiFarePrediction/TaxiFarePredictionConsoleApp/ConsoleHelper.cs
+++ b/samples/csharp/end-to-end-apps/Model-Explainability/TaxiFarePrediction/TaxiFarePredictionConsoleApp/ConsoleHelper.cs
@@ -68,7 +68,7 @@ public static void PrintMultiClassClassificationMetrics(string name, MulticlassC
Console.WriteLine($" LogLoss for class 3 = {metrics.PerClassLogLoss[2]:0.####}, the closer to 0, the better");
Console.WriteLine($"************************************************************");
}
-
+
public static void PrintRegressionFoldsAverageMetrics(string algorithmName, IReadOnlyList<TrainCatalogBase.CrossValidationResult<RegressionMetrics>> crossValidationResults)
{
var L1 = crossValidationResults.Select(r => r.Metrics.MeanAbsoluteError);
@@ -174,11 +174,11 @@ public static void PeekDataViewInConsole(MLContext mlContext, IDataView dataView
string msg = string.Format("Peek data in DataView: Showing {0} rows with the columns", numberOfRows.ToString());
ConsoleWriteHeader(msg);
- //https://github.com/dotnet/machinelearning/blob/master/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
+ //https://github.com/dotnet/machinelearning/blob/main/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
var transformer = pipeline.Fit(dataView);
var transformedData = transformer.Transform(dataView);
- // 'transformedData' is a 'promise' of data, lazy-loading. call Preview
+ // 'transformedData' is a 'promise' of data, lazy-loading. call Preview
//and iterate through the returned collection from preview.
var preViewTransformedData = transformedData.Preview(maxRows: numberOfRows);
@@ -194,7 +194,7 @@ public static void PeekDataViewInConsole(MLContext mlContext, IDataView dataView
Console.WriteLine(lineToPrint + "\n");
}
}
-
+
public static List<float[]> PeekVectorColumnDataInConsole(MLContext mlContext, string columnName, IDataView dataView, IEstimator<ITransformer> pipeline, int numberOfRows = 4)
{
string msg = string.Format("Peek data in DataView: : Show {0} rows with just the '{1}' column", numberOfRows, columnName );
@@ -212,7 +212,7 @@ public static List PeekVectorColumnDataInConsole(MLContext mlContext, s
String concatColumn = String.Empty;
foreach (float f in row)
{
- concatColumn += f.ToString();
+ concatColumn += f.ToString();
}
Console.WriteLine(concatColumn);
});
diff --git a/samples/csharp/end-to-end-apps/Model-Explainability/TaxiFarePrediction/common/ConsoleHelper.cs b/samples/csharp/end-to-end-apps/Model-Explainability/TaxiFarePrediction/common/ConsoleHelper.cs
index 990d2b468..0acb7d800 100644
--- a/samples/csharp/end-to-end-apps/Model-Explainability/TaxiFarePrediction/common/ConsoleHelper.cs
+++ b/samples/csharp/end-to-end-apps/Model-Explainability/TaxiFarePrediction/common/ConsoleHelper.cs
@@ -68,7 +68,7 @@ public static void PrintMultiClassClassificationMetrics(string name, MulticlassC
Console.WriteLine($" LogLoss for class 3 = {metrics.PerClassLogLoss[2]:0.####}, the closer to 0, the better");
Console.WriteLine($"************************************************************");
}
-
+
public static void PrintRegressionFoldsAverageMetrics(string algorithmName, IReadOnlyList<TrainCatalogBase.CrossValidationResult<RegressionMetrics>> crossValidationResults)
{
var L1 = crossValidationResults.Select(r => r.Metrics.MeanAbsoluteError);
@@ -174,11 +174,11 @@ public static void PeekDataViewInConsole(MLContext mlContext, IDataView dataView
string msg = string.Format("Peek data in DataView: Showing {0} rows with the columns", numberOfRows.ToString());
ConsoleWriteHeader(msg);
- //https://github.com/dotnet/machinelearning/blob/master/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
+ //https://github.com/dotnet/machinelearning/blob/main/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
var transformer = pipeline.Fit(dataView);
var transformedData = transformer.Transform(dataView);
- // 'transformedData' is a 'promise' of data, lazy-loading. call Preview
+ // 'transformedData' is a 'promise' of data, lazy-loading. call Preview
//and iterate through the returned collection from preview.
var preViewTransformedData = transformedData.Preview(maxRows: numberOfRows);
@@ -194,7 +194,7 @@ public static void PeekDataViewInConsole(MLContext mlContext, IDataView dataView
Console.WriteLine(lineToPrint + "\n");
}
}
-
+
public static List<float[]> PeekVectorColumnDataInConsole(MLContext mlContext, string columnName, IDataView dataView, IEstimator<ITransformer> pipeline, int numberOfRows = 4)
{
string msg = string.Format("Peek data in DataView: : Show {0} rows with just the '{1}' column", numberOfRows, columnName );
@@ -212,7 +212,7 @@ public static List PeekVectorColumnDataInConsole(MLContext mlContext, s
String concatColumn = String.Empty;
foreach (float f in row)
{
- concatColumn += f.ToString();
+ concatColumn += f.ToString();
}
Console.WriteLine(concatColumn);
});
diff --git a/samples/csharp/end-to-end-apps/ObjectDetection-Onnx/README.md b/samples/csharp/end-to-end-apps/ObjectDetection-Onnx/README.md
index 5af52a057..45dce1295 100644
--- a/samples/csharp/end-to-end-apps/ObjectDetection-Onnx/README.md
+++ b/samples/csharp/end-to-end-apps/ObjectDetection-Onnx/README.md
@@ -37,7 +37,7 @@ This sample defaults to use the pre-trained Tiny YOLOv2 model described above.
### To use your own model, use the following steps
-1. [Create and train](https://docs.microsoft.com/en-us/azure/cognitive-services/custom-vision-service/get-started-build-detector) an object detector with the Custom Vision. To export the model, make sure to select a **compact** domain such as **General (compact)**. To export an existing object detector, convert the domain to compact by selecting the gear icon at the top right. In _**Settings**_, choose a compact model, save, and train your project.
+1. [Create and train](https://docs.microsoft.com/en-us/azure/cognitive-services/custom-vision-service/get-started-build-detector) an object detector with Custom Vision. To export the model, make sure to select a **compact** domain such as **General (compact)**. To export an existing object detector, convert the domain to compact by selecting the gear icon at the top right. In _**Settings**_, choose a compact model, save, and train your project.
2. [Export your model](https://docs.microsoft.com/azure/cognitive-services/custom-vision-service/export-your-model) by going to the _**Performance**_ tab. Select an iteration trained with a compact domain, an "Export" button will appear. Select _Export_, _ONNX_, _ONNX1.2_, and then _Export_. Once the file is ready, select the *Download* button.
3. The export will a zip file containing several files, including some sample code, a list of labels, and the ONNX model. Drop the .zip file into the [**OnnxModels**](./OnnxObjectDetection/ML/OnnxModels) folder in the [OnnxObjectDetection](./OnnxObjectDetection) project.
4. In Solutions Explorer, right-click the [OnnxModels](./OnnxObjectDetection/ML/OnnxModels) folder and select _Add Existing Item_. Select the .zip file you just added.
@@ -135,7 +135,7 @@ _Note, if the model were trained to detect a different number of classes this va
## Code Walkthrough
-_This sample differs from the [getting-started object detection sample](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/DeepLearning_ObjectDetection_Onnx) in that here we load/process the images **in-memory** whereas the getting-started sample loads the images from a **file**._
+_This sample differs from the [getting-started object detection sample](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/DeepLearning_ObjectDetection_Onnx) in that here we load/process the images **in-memory** whereas the getting-started sample loads the images from a **file**._
Create a class that defines the data schema to use while loading data into an `IDataView`. ML.NET supports the `Bitmap` type for images, so we'll specify `Bitmap` property decorated with the `ImageTypeAttribute` and pass in the height and width dimensions we got by [inspecting the model](#model-input-and-output), as shown below.
@@ -187,7 +187,7 @@ var model = pipeline.Fit(dataView);
## Load model and create PredictionEngine
-After the model is configured, we need to save the model, load the saved model, create a `PredictionEngine`, and then pass the image to the engine to detect objects using the model. This is one place that the **Web** app and the **WPF** app differ slightly.
+After the model is configured, we need to save the model, load the saved model, create a `PredictionEngine`, and then pass the image to the engine to detect objects using the model. This is one place that the **Web** app and the **WPF** app differ slightly.
The **Web** app uses a `PredictionEnginePool` to efficiently manage and provide the service with a `PredictionEngine` to use to make predictions. Internally, it is optimized so the object dependencies are cached and shared across Http requests with minimum overhead when creating those objects.
@@ -271,10 +271,10 @@ When deploying this application on Azure via App Service, you may encounter some
1. One reason why you may get a 5xx code after deploying the application is the platform. The web application only runs on 64-bit architectures. In Azure, change the **Platform** setting in the your respective App Service located in the **Settings > Configuration > General Settings** menu.
- 1. Another reason for a 5xx code after deploying the application is the target framework for the web application is .NET Core 3.0, which is currently in preview. You can either revert the application and the referenced project to .NET Core 2.x or add an extension to your App Service.
+  1. Another reason for a 5xx code after deploying the application is that the target framework for the web application is .NET Core 3.0, which is currently in preview. You can either revert the application and the referenced project to .NET Core 2.x or add an extension to your App Service.
- - To add .NET Core 3.0 support in the Azure Portal, select the **Add** button in the **Development Tools > Extensions** section of your respective App Service.
- - Then, select **Choose Extension** and select **ASP.NET Core 3.0 (x64) Runtime** from the list of extensions and accept the Legal Terms to proceed with adding the extension to your App Service.
+ - To add .NET Core 3.0 support in the Azure Portal, select the **Add** button in the **Development Tools > Extensions** section of your respective App Service.
+ - Then, select **Choose Extension** and select **ASP.NET Core 3.0 (x64) Runtime** from the list of extensions and accept the Legal Terms to proceed with adding the extension to your App Service.
1. Relative paths
diff --git a/samples/csharp/end-to-end-apps/Recommendation-MovieRecommender/README.md b/samples/csharp/end-to-end-apps/Recommendation-MovieRecommender/README.md
index 104cc1a8f..7ccc57aff 100644
--- a/samples/csharp/end-to-end-apps/Recommendation-MovieRecommender/README.md
+++ b/samples/csharp/end-to-end-apps/Recommendation-MovieRecommender/README.md
@@ -1,47 +1,47 @@
-# Movie Recommender
+# Movie Recommender
| ML.NET version | API type | Status | App Type | Data sources | Scenario | ML Task | Algorithms |
|----------------|-------------------|-------------------------------|-------------|-----------|---------------------|---------------------------|-----------------------------|
|v1.4 | Dynamic API | up-to-date | End-End app | .csv | Movie Recommendation | Recommendation | Field Aware Factorization Machines |
-
+
## Overview
-MovieRecommender is a simple application which both builds and consumes a recommendation model.
+MovieRecommender is a simple application which both builds and consumes a recommendation model.
-This is an end-end sample on how you can enhance your existing ASP.NET apps with recommendations.
+This is an end-to-end sample showing how you can enhance your existing ASP.NET apps with recommendations.
-The sample takes insipiration from the popular Netflix application and even though this sample focuses on movie recommendations, learnings can be easily applied to any style of product recommendations.
+The sample takes inspiration from the popular Netflix application, and even though this sample focuses on movie recommendations, the learnings can easily be applied to any style of product recommendations.
## Features
-* Wep app
- * This is an end-end ASP.NET app which presents three user profiles 'Ankit', 'Cesar', 'Amy'. It then provides these three users
- recommendations using a ML.NET recommendation model.
+* Web app
+  * This is an end-to-end ASP.NET app which presents three user profiles 'Ankit', 'Cesar', 'Amy'. It then provides these three users
+    recommendations using an ML.NET recommendation model.
-* Recommendation Model
- * The application builds a recommendation model using the MovieLens dataset. The model training code shows
- uses collaborative filtering based recommendation approach.
+* Recommendation Model
+  * The application builds a recommendation model using the MovieLens dataset. The model training code
+    uses a collaborative filtering-based recommendation approach.
## How does it work?
-## Model Training
+## Model Training
-Movie Recommender uses Collaborative Filtering for recommendations.
+Movie Recommender uses Collaborative Filtering for recommendations.
-The underlying assumption with Collaborative filtering is that if a person A (e.g. Amy) has the same opinion as a person B (e.g. Cesar) on an issue, A (Amy) is more likely to have B’s (Cesar) opinion on a different issue than that of a random person.
+The underlying assumption with Collaborative filtering is that if a person A (e.g. Amy) has the same opinion as a person B (e.g. Cesar) on an issue, A (Amy) is more likely to have B’s (Cesar) opinion on a different issue than that of a random person.
-For this sample we make use of the http://files.grouplens.org/datasets/movielens/ml-latest-small.zip dataset.
+For this sample we make use of the http://files.grouplens.org/datasets/movielens/ml-latest-small.zip dataset.
-The model training code can be found in the [MovieRecommender_Model](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/end-to-end-apps/Recommendation-MovieRecommender/MovieRecommender_Model).
+The model training code can be found in the [MovieRecommender_Model](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/end-to-end-apps/Recommendation-MovieRecommender/MovieRecommender_Model).
-Model training follows the following four steps for building the model. You can traverse the code and follow along.
+Model training follows these four steps for building the model. You can traverse the code and follow along.
-
+
## Model Consumption
-The trained model is consumed in the [Controller](https://github.com/dotnet/machinelearning-samples/blob/master/samples/csharp/end-to-end-apps/Recommendation-MovieRecommender/MovieRecommender/movierecommender/Controllers/MoviesController.cs#L60) using the following steps.
+The trained model is consumed in the [Controller](https://github.com/dotnet/machinelearning-samples/blob/main/samples/csharp/end-to-end-apps/Recommendation-MovieRecommender/MovieRecommender/movierecommender/Controllers/MoviesController.cs#L60) using the following steps.
### 1. Create the ML.NET environment and load the already trained model
@@ -49,19 +49,19 @@ The trained model is consumed in the [Controller](https://github.com/dotnet/mach
// 1. Create the ML.NET environment and load the already trained model
MLContext mlContext = new MLContext();
-
+
ITransformer trainedModel;
using (var stream = new FileStream(_movieService.GetModelPath(), FileMode.Open, FileAccess.Read, FileShare.Read))
{
trainedModel = mlContext.Model.Load(stream);
}
```
-### 2. Create a prediction function to predict a set of movie recommendations
+### 2. Create a prediction function to predict a set of movie recommendations
```CSharp
//2. Create a prediction function
var predictionEngine = mlContext.Model.CreatePredictionEngine<MovieRating, MovieRatingPrediction>(trainedModel);
-
+
List<(int movieId, float normalizedScore)> ratings = new List<(int movieId, float normalizedScore)>();
var MovieRatings = _profileService.GetProfileWatchedMovies(id);
List<Movie> WatchedMovies = new List<Movie>();
@@ -70,9 +70,9 @@ The trained model is consumed in the [Controller](https://github.com/dotnet/mach
{
WatchedMovies.Add(_movieService.Get(movieId));
}
-
+
MovieRatingPrediction prediction = null;
-
+
foreach (var movie in _movieService.GetTrendingMovies)
{
//Call the Rating Prediction for each movie prediction
@@ -97,14 +97,14 @@ The trained model is consumed in the [Controller](https://github.com/dotnet/mach
return View(activeprofile);
```
-## Alternate Approaches
-This sample shows one of many recommendation approaches that can be used with ML.NET. Depending upon your specific scenario you can choose any of the following approaches which best fit your usecase.
+## Alternate Approaches
+This sample shows one of many recommendation approaches that can be used with ML.NET. Depending upon your specific scenario, you can choose whichever of the following approaches best fits your use case.
| Scenario | Algorithm | Link To Sample
-| --- | --- | --- |
-| You want to use attributes (features) like UserId, ProductId, Ratings, Product Description, Product Price etc. for your recommendation engine. In such a scenario Field Aware Factorization Machine is a generalized approach you can use to build your recommendation engine | Field Aware Factorization Machines | This sample |
-| You have UserId, ProductId and Ratings available to you for what users bought and rated. For this scenario you should use the Matrix Factorization approach | Matrix Factorization | [Matrix Factorization - Recommendation](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/MatrixFactorization_MovieRecommendation)|
-| You only have UserId and ProductId's the user bought available to you but not ratings. This is common in datasets from online stores where you might only have access to purchase history for your customers. With this style of recommendation you can build a recommendation engine which recommends frequently bought items. | One Class Matrix Factorization | [Product Recommender](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/MatrixFactorization_ProductRecommendation) |
+| --- | --- | --- |
+| You want to use attributes (features) like UserId, ProductId, Ratings, Product Description, Product Price, etc. for your recommendation engine. In such a scenario, Field Aware Factorization Machines is a generalized approach you can use to build your recommendation engine | Field Aware Factorization Machines | This sample |
+| You have UserId, ProductId and Ratings available to you for what users bought and rated. For this scenario you should use the Matrix Factorization approach | Matrix Factorization | [Matrix Factorization - Recommendation](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/MatrixFactorization_MovieRecommendation)|
+| You only have the UserId and ProductIds the user bought available to you, but not ratings. This is common in datasets from online stores where you might only have access to purchase history for your customers. With this style of recommendation you can build a recommendation engine which recommends frequently bought items. | One Class Matrix Factorization | [Product Recommender](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/MatrixFactorization_ProductRecommendation) |
diff --git a/samples/csharp/end-to-end-apps/ScalableMLModelOnWebAPI-Custom/README.md b/samples/csharp/end-to-end-apps/ScalableMLModelOnWebAPI-Custom/README.md
index b97759ab9..bb04b7de6 100644
--- a/samples/csharp/end-to-end-apps/ScalableMLModelOnWebAPI-Custom/README.md
+++ b/samples/csharp/end-to-end-apps/ScalableMLModelOnWebAPI-Custom/README.md
@@ -10,7 +10,7 @@
**IMPORTANT NOTE: This sample uses an older approach by implementing all the 'plumbing' related to the PredictionEngine Object Pool. This custom implementation is no longer required since the release of the PredictionEnginePool API provided since May 2019.
Check this other sample for the preferred and much simpler approach:**
-https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/end-to-end-apps/ScalableMLModelOnWebAPI-IntegrationPkg
+https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/end-to-end-apps/ScalableMLModelOnWebAPI-IntegrationPkg
---
@@ -31,7 +31,7 @@ For a much more detailed explanation, including design diagrams, read the follow
SamplePrediction prediction = _modelEngine.Predict(sampleData);
```
-As simple as a single line. The object _modelEngine will be injected in the controller's constructor or into your custom class.
+As simple as a single line. The `_modelEngine` object will be injected into the controller's constructor or into your custom class.
Internally, it is optimized so the object dependencies are cached and shared across Http requests with minimum overhead when creating those objects.
@@ -41,7 +41,7 @@ The problem running/scoring an ML.NET model in multi-threaded applications comes
# Solution
-## Use Object Pooling for PredictionEngine objects
+## Use Object Pooling for PredictionEngine objects
Since a PredictionEngine object cannot be singleton because it is not 'thread safe', a good solution for being able to have 'ready to use' PredictionEngine objects is to use an object pooling-based approach.
@@ -49,6 +49,6 @@ When it is necessary to work with a number of objects that are particularly expe
An object pool design pattern can be very effective in such cases.
-The [object pool pattern](https://en.wikipedia.org/wiki/Object_pool_pattern) is a design pattern that uses a set of initialized objects kept ready to use (a 'pool') rather than allocating and destroying them on demand.
+The [object pool pattern](https://en.wikipedia.org/wiki/Object_pool_pattern) is a design pattern that uses a set of initialized objects kept ready to use (a 'pool') rather than allocating and destroying them on demand.
This solution's implementation is based on a higher-level custom class (named **MLModelEngine**) which is instantiated as singleton and creates the needed infrastructure for such an object pool solution.
diff --git a/samples/csharp/end-to-end-apps/Unity-HelloMLNET/README.md b/samples/csharp/end-to-end-apps/Unity-HelloMLNET/README.md
index b6f2b392e..5ee791b20 100644
--- a/samples/csharp/end-to-end-apps/Unity-HelloMLNET/README.md
+++ b/samples/csharp/end-to-end-apps/Unity-HelloMLNET/README.md
@@ -1,14 +1,14 @@
-# Loading ML.NET Models in Unity
+# Loading ML.NET Models in Unity
## Overview
This Unity Package shows you how to load a ML.NET model into your Unity Game using a simple form based UI application.
-
+
## Features
-* Unity Package File
- * Simple Unity Package File which has a simple form based UI component which predicts Toxicity of input sentiments using a ML.NET model
+* Unity Package File
+  * Simple Unity Package File which has a simple form-based UI component which predicts Toxicity of input sentiments using an ML.NET model
## Known Workarounds
* Create a plugins folder in assets and add core ML.NET Nuget binaries along with all nested
diff --git a/samples/csharp/getting-started/DatabaseIntegration/README.md b/samples/csharp/getting-started/DatabaseIntegration/README.md
index d584f369d..d89d0c03a 100644
--- a/samples/csharp/getting-started/DatabaseIntegration/README.md
+++ b/samples/csharp/getting-started/DatabaseIntegration/README.md
@@ -1,12 +1,12 @@
# Using LoadFromEnumerable and Entity Framework with a relational database as a data source for training and validating a model
This sample demonstrates how to use a database as a data source for an ML.NET pipeline by using an IEnumerable. Since a database is treated as any other datasource, it is possible to query the database and use the resulting data for training and prediction scenarios.
-**Update (Sept. 2nd 2019): If you want to load data from a relational database, there's a simpler approach in ML.NET by using the DatabaseLoader. Check the [DatabaseLoader sample](/samples/csharp/getting-started/DatabaseLoader)**.
+**Update (Sept. 2nd 2019): If you want to load data from a relational database, there's a simpler approach in ML.NET by using the DatabaseLoader. Check the [DatabaseLoader sample](/samples/csharp/getting-started/DatabaseLoader)**.
Note that you could also implement a similar approach using **LoadFromEnumerable** with a **NoSQL** database or any other data source instead of a relational database. However, this example uses a relational database accessed through Entity Framework.
## Problem
-Enterprise users have a need to use their existing data set that is in their company's database to train and predict with ML.NET.
+Enterprise users need to train and predict with ML.NET using the existing datasets stored in their company's databases.
Even though in most cases data needs to be cleaned up and prepared before training a machine learning model, many enterprises are familiar with databases for transforming and preparing data and prefer to keep data centralized and secured in database servers instead of working with exported plain-text files.
@@ -16,7 +16,7 @@ Note that the process for preparing your dataset is out of scope for this sample
### Why Data preparation is important
-Why can't you simply create a join query against your transational tables? - Even when tecnically you could create the IEnumerable from any join query, in most real-world situations that won't work for the ML algorithms/trainers.
+Why can't you simply create a join query against your transactional tables? Even though technically you could create the IEnumerable from any join query, in most real-world situations that won't work for the ML algorithms/trainers.
Data preparation is important because most machine learning trainers/algorithms require data to be formatted in a very specific way, or input feature columns to be of very specific data types, so datasets generally require some preparation before you can train a model. You also need to clean up the data: some data sources might have missing values (null/NaN) or invalid values (data might need to be on a different scale, you might need to upsample or normalize numeric feature values, etc.), causing the training process to either break or produce a less accurate or even misleading model.
@@ -30,15 +30,15 @@ https://machinelearningmastery.com/how-to-prepare-data-for-machine-learning/
This sample shows how to use Entity Framework Core to connect to a database, query and feed the resulting data into an ML.NET pipeline through an IEnumerable.
-This sample uses SQLite to help demonstrate the database integration, but any database (such as SQL Server, Oracle, MySQL, PostgreSQL, etc.) that is supported by Entity Framwork Core can be used. As ML.NET can create an IDataView from an IEnumerable, this sample will use the IEnumerable that is returned from a query to feed the data into the ML.NET pipeline.
+This sample uses SQLite to help demonstrate the database integration, but any database (such as SQL Server, Oracle, MySQL, PostgreSQL, etc.) that is supported by Entity Framework Core can be used. As ML.NET can create an IDataView from an IEnumerable, this sample will use the IEnumerable that is returned from a query to feed the data into the ML.NET pipeline.
## Important considerations and workarounds
-1. To prevent the Entity Framework Core from loading all the data in from a result, **a no tracking query is used**.
+1. To prevent Entity Framework Core from loading all of the result's data at once, **a no-tracking query is used**.
2. It is important to highlight that **the IEnumerable you provide needs to be thread-safe**. This example shows you how to create an IEnumerable with Entity Framework that won't cause issues for LoadFromEnumerable() because it ensures that each new enumeration of the data happens on a separate DbContext and DbConnection, by creating a new database context in your code each time an IEnumerable is requested.
-Specifically, the code showing you how to create a database context each time a IEnumerable is requested plus using a 'no tracking query' is here: https://github.com/dotnet/machinelearning-samples/blob/master/samples/csharp/getting-started/DatabaseIntegration/DatabaseIntegration/Program.cs#L44
+Specifically, the code showing how to create a database context each time an IEnumerable is requested, combined with a 'no tracking query', is here: https://github.com/dotnet/machinelearning-samples/blob/main/samples/csharp/getting-started/DatabaseIntegration/DatabaseIntegration/Program.cs#L44
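A minimal sketch of such a thread-safe enumerable (the `ClickContext` and `UrlClick` names are hypothetical, standing in for the sample's EF Core context and entity); a fresh `DbContext` is created on each enumeration and a no-tracking query keeps EF Core from caching entities:

```csharp
// Sketch only: ClickContext and UrlClick are hypothetical EF Core types.
public static IEnumerable<UrlClick> QueryData()
{
    // C# iterator semantics: every new enumeration re-runs this body,
    // so each one gets its own DbContext (and DbConnection), which is
    // what makes the IEnumerable safe for LoadFromEnumerable().
    using (var db = new ClickContext())
    {
        // AsNoTracking() prevents EF Core from tracking every entity it reads.
        foreach (var click in db.UrlClicks.AsNoTracking())
        {
            yield return click;
        }
    }
}
```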
## High level process performed by this sample
@@ -48,7 +48,7 @@ The sample implements the following:
- Creates and populates the database
- Query database for the dataset
- Converts the IEnumerable to IDataView
-- Trains a LightGBM Binary Classification model
+- Trains a LightGBM Binary Classification model
- Queries the database for a test dataset
- Runs predictions
- Evaluates the prediction metrics
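The steps above can be sketched as follows (a hedged sketch: `QueryTrainingData`/`QueryTestData` are hypothetical helpers returning an `IEnumerable` of a row class with numeric `Feature1`/`Feature2` columns and a boolean `Clicked` label; the LightGBM trainer requires the `Microsoft.ML.LightGbm` package):

```csharp
var mlContext = new MLContext();

// Convert the thread-safe IEnumerable from the database query into an IDataView.
IDataView trainData = mlContext.Data.LoadFromEnumerable(QueryTrainingData());

// Train a LightGBM binary classification model.
var pipeline = mlContext.Transforms.Concatenate("Features", "Feature1", "Feature2")
    .Append(mlContext.BinaryClassification.Trainers.LightGbm(labelColumnName: "Clicked"));
ITransformer model = pipeline.Fit(trainData);

// Query the database for the test dataset, run predictions, and evaluate.
IDataView testData = mlContext.Data.LoadFromEnumerable(QueryTestData());
var metrics = mlContext.BinaryClassification.Evaluate(
    model.Transform(testData), labelColumnName: "Clicked");
Console.WriteLine($"AUC: {metrics.AreaUnderRocCurve}");
```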
diff --git a/samples/csharp/getting-started/DatabaseLoader/DatabaseLoaderConsoleApp/Common/ConsoleHelper.cs b/samples/csharp/getting-started/DatabaseLoader/DatabaseLoaderConsoleApp/Common/ConsoleHelper.cs
index 2039c6f3a..db7139242 100644
--- a/samples/csharp/getting-started/DatabaseLoader/DatabaseLoaderConsoleApp/Common/ConsoleHelper.cs
+++ b/samples/csharp/getting-started/DatabaseLoader/DatabaseLoaderConsoleApp/Common/ConsoleHelper.cs
@@ -69,7 +69,7 @@ public static void PrintMultiClassClassificationMetrics(string name, MulticlassC
Console.WriteLine($" LogLoss for class 3 = {metrics.PerClassLogLoss[2]:0.####}, the closer to 0, the better");
Console.WriteLine($"************************************************************");
}
-
+
public static void PrintRegressionFoldsAverageMetrics(string algorithmName, IReadOnlyList> crossValidationResults)
{
var L1 = crossValidationResults.Select(r => r.Metrics.MeanAbsoluteError);
@@ -177,11 +177,11 @@ public static void PeekDataViewInConsole(MLContext mlContext, IDataView dataView
string msg = string.Format("Peek data in DataView: Showing {0} rows with the columns", numberOfRows.ToString());
ConsoleWriteHeader(msg);
- //https://github.com/dotnet/machinelearning/blob/master/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
+ //https://github.com/dotnet/machinelearning/blob/main/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
var transformer = pipeline.Fit(dataView);
var transformedData = transformer.Transform(dataView);
- // 'transformedData' is a 'promise' of data, lazy-loading. call Preview
+ // 'transformedData' is a 'promise' of data, lazy-loading. call Preview
//and iterate through the returned collection from preview.
var preViewTransformedData = transformedData.Preview(maxRows: numberOfRows);
@@ -217,7 +217,7 @@ public static void PeekVectorColumnDataInConsole(MLContext mlContext, string col
String concatColumn = String.Empty;
foreach (float f in row)
{
- concatColumn += f.ToString();
+ concatColumn += f.ToString();
}
Console.WriteLine(concatColumn);
});
diff --git a/samples/csharp/getting-started/DatabaseLoader/README.md b/samples/csharp/getting-started/DatabaseLoader/README.md
index 3c721f7b7..d5de8ad57 100644
--- a/samples/csharp/getting-started/DatabaseLoader/README.md
+++ b/samples/csharp/getting-started/DatabaseLoader/README.md
@@ -11,11 +11,11 @@ This sample shows you how you can use the native database loader to directly tra
## Problem
-In the enterprise and many organizations in general, data is organized and stored as relational databases to be used by enterprise applications. Many of those organizations also prepare their ML model training/evaluation data in relational databases which is also where the new data is being collected and prepared. Therefore, many of those users would also like to directly train/evaluate ML models directly agaist that data stored in relational databases.
+In the enterprise, and in many organizations in general, data is organized and stored in relational databases to be used by enterprise applications. Many of those organizations also prepare their ML model training/evaluation data in relational databases, which is also where new data is being collected and prepared. Therefore, many of those users would also like to train/evaluate ML models directly against the data stored in relational databases.
## Background
-In previous [ML.NET](https://dot.net/ml) releases, since [ML.NET](https://dot.net/ml) 1.0, you could also train against a relational database by providing data through an IEnumerable collection by using the [LoadFromEnumerable()](https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.dataoperationscatalog.loadfromenumerable?view=ml-dotnet) API where the data could be coming from a relational database or any other source. However, when using that approach, you as a developer are responsible for the code reading from the relational database (such as using Entity Framework or any other approach) which needs to be implemented properly so you are streaming data while training the ML model, as in this [previous sample using LoadFromEnumerable()](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/DatabaseIntegration).
+In previous [ML.NET](https://dot.net/ml) releases, since [ML.NET](https://dot.net/ml) 1.0, you could also train against a relational database by providing data through an IEnumerable collection by using the [LoadFromEnumerable()](https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.dataoperationscatalog.loadfromenumerable?view=ml-dotnet) API where the data could be coming from a relational database or any other source. However, when using that approach, you as a developer are responsible for the code reading from the relational database (such as using Entity Framework or any other approach) which needs to be implemented properly so you are streaming data while training the ML model, as in this [previous sample using LoadFromEnumerable()](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/DatabaseIntegration).
## Solution
@@ -27,8 +27,8 @@ Here’s example code on how easily you can now configure your code to load data
var mlContext = new MLContext();
-// The following is a connection string using a localdb SQL database,
-// but you can also use connection strings against on-premises SQL Server, Azure SQL Database
+// The following is a connection string using a localdb SQL database,
+// but you can also use connection strings against on-premises SQL Server, Azure SQL Database
// or any other relational database (Oracle, SQLite, PostgreSQL, MySQL, Progress, IBM DB2, etc.)
// localdb SQL database connection string using a filepath to attach the database file into localdb
@@ -38,11 +38,11 @@ string connectionString = $"Data Source = (LocalDB)\\MSSQLLocalDB;AttachDbFilena
string commandText = "SELECT * from URLClicks";
DatabaseLoader loader = mlContext.Data.CreateDatabaseLoader();
-
-DatabaseSource dbSource = new DatabaseSource(SqlClientFactory.Instance,
- connectionString,
+
+DatabaseSource dbSource = new DatabaseSource(SqlClientFactory.Instance,
+ connectionString,
commandText);
-
+
IDataView dataView = loader.Load(dbSource);
// From this point you can use the IDataView for training and validating an ML.NET model as in any other sample
diff --git a/samples/csharp/getting-started/DeepLearning_ImageClassification_Training/ImageClassification.Train/Common/ConsoleHelper.cs b/samples/csharp/getting-started/DeepLearning_ImageClassification_Training/ImageClassification.Train/Common/ConsoleHelper.cs
index fb8f8b930..28999f876 100644
--- a/samples/csharp/getting-started/DeepLearning_ImageClassification_Training/ImageClassification.Train/Common/ConsoleHelper.cs
+++ b/samples/csharp/getting-started/DeepLearning_ImageClassification_Training/ImageClassification.Train/Common/ConsoleHelper.cs
@@ -83,7 +83,7 @@ public static void PrintMultiClassClassificationMetrics(string name, MulticlassC
}
Console.WriteLine($"************************************************************");
}
-
+
public static void PrintRegressionFoldsAverageMetrics(string algorithmName, IReadOnlyList> crossValidationResults)
{
var L1 = crossValidationResults.Select(r => r.Metrics.MeanAbsoluteError);
@@ -191,11 +191,11 @@ public static void PeekDataViewInConsole(MLContext mlContext, IDataView dataView
string msg = string.Format("Peek data in DataView: Showing {0} rows with the columns", numberOfRows.ToString());
ConsoleWriteHeader(msg);
- //https://github.com/dotnet/machinelearning/blob/master/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
+ //https://github.com/dotnet/machinelearning/blob/main/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
var transformer = pipeline.Fit(dataView);
var transformedData = transformer.Transform(dataView);
- // 'transformedData' is a 'promise' of data, lazy-loading. call Preview
+ // 'transformedData' is a 'promise' of data, lazy-loading. call Preview
//and iterate through the returned collection from preview.
var preViewTransformedData = transformedData.Preview(maxRows: numberOfRows);
@@ -234,7 +234,7 @@ public static void PeekVectorColumnDataInConsole(MLContext mlContext, string col
String concatColumn = String.Empty;
foreach (float f in row)
{
- concatColumn += f.ToString();
+ concatColumn += f.ToString();
}
Console.WriteLine();
diff --git a/samples/csharp/getting-started/DeepLearning_ImageClassification_Training/README.md b/samples/csharp/getting-started/DeepLearning_ImageClassification_Training/README.md
index 12f3b6b6e..455354694 100644
--- a/samples/csharp/getting-started/DeepLearning_ImageClassification_Training/README.md
+++ b/samples/csharp/getting-started/DeepLearning_ImageClassification_Training/README.md
@@ -4,12 +4,12 @@
|----------------|-------------------|-------------------------------|-------------|-----------|---------------------|---------------------------|-----------------------------|
| Microsoft.ML 1.5.0 | Dynamic API | Up-to-date | Console apps and Web App | Image files | Image classification | Image classification with TensorFlow model retrain based on transfer learning | DNN architectures: ResNet, InceptionV3, MobileNet, etc. |
-## Problem
+## Problem
Image classification is a common deep learning problem. This sample shows how to create your own custom image classifier by training your model with the transfer learning approach, which basically means retraining a pre-trained model (an architecture such as InceptionV3 or ResNet) so you get a custom model trained on your own images.
In this sample app you create your own custom image classifier model by natively training a TensorFlow model from ML.NET API with your own images.
-*Image classifier scenario – Train your own custom deep learning model with ML.NET*
+*Image classifier scenario – Train your own custom deep learning model with ML.NET*

@@ -17,7 +17,7 @@ In this sample app you create your own custom image classifier model by natively
> *Image set license*
>
-> This sample's dataset is based on the 'flower_photos imageset' available from Tensorflow at [this URL](http://download.tensorflow.org/example_images/flower_photos.tgz).
+> This sample's dataset is based on the 'flower_photos imageset' available from Tensorflow at [this URL](http://download.tensorflow.org/example_images/flower_photos.tgz).
> All images in this archive are licensed under the Creative Commons By-Attribution License, available at:
https://creativecommons.org/licenses/by/2.0/
>
@@ -25,7 +25,7 @@ https://creativecommons.org/licenses/by/2.0/
By default, the image set downloaded by the sample has 200 images evenly distributed across 5 flower classes:
- Images --> flower_photos_small_set -->
+ Images --> flower_photos_small_set -->
|
daisy
|
@@ -37,7 +37,7 @@ The by default imageset being downloaded by the sample has 200 images evenly dis
|
tulips
-The name of each sub-folder is important because that'll be the name of each class/label the model is going to use to classify the images.
+The name of each sub-folder is important because that'll be the name of each class/label the model is going to use to classify the images.
## ML Task - Image Classification
@@ -64,7 +64,7 @@ Sample references screenshot in training project using **CPU**:
When using **GPU**, your project has to reference the following redist library (*and remove the CPU version reference*):
-- `SciSharp.TensorFlow.Redist-Windows-GPU` (GPU training on Windows)
+- `SciSharp.TensorFlow.Redist-Windows-GPU` (GPU training on Windows)
- `SciSharp.TensorFlow.Redist-Linux-GPU` (GPU training on Linux)
@@ -78,7 +78,7 @@ Building the model includes the following steps:
* Loading the image files (file paths in this case) into an IDataView
* Image classification using the ImageClassification estimator (high level API)
-Define the schema of data in a class type and refer that type while loading the images from the files folder.
+Define the data schema in a class type and reference that type while loading the images from the files folder.
```csharp
public class ImageData
@@ -130,7 +130,7 @@ IDataView shuffledFullImageFilePathsDataset = mlContext.Data.ShuffleRows(fullIma
Once it's loaded into the IDataView, the rows are shuffled so the dataset is better balanced before splitting it into the training/test datasets.
Now, this next step is very important. Since we want the ML model to work with in-memory images, we need to load the images into the dataset, which is actually done by calling Fit() and Transform().
-This step needs to be done in a initial and seggregated pipeline in the first place so the filepaths won't be used by the pipeline and model to create when training.
+This step needs to be done in an initial, segregated pipeline so that the file paths won't be used by the pipeline and model created during training.
```csharp
// 3. Load Images with in-memory type within the IDataView and Transform Labels to Keys (Categorical)
@@ -181,15 +181,15 @@ There’s another overloaded method for advanced users where you can also specif
The following is how you use the advanced DNN parameters:
-```csharp
+```csharp
// 5.1 (OPTIONAL) Define the model's training pipeline by using explicit hyper-parameters
var options = new ImageClassificationTrainer.Options()
{
FeatureColumnName = "Image",
LabelColumnName = "LabelAsKey",
- // Just by changing/selecting InceptionV3/MobilenetV2/ResnetV250
- // you can try a different DNN architecture (TensorFlow pre-trained model).
+ // Just by changing/selecting InceptionV3/MobilenetV2/ResnetV250
+ // you can try a different DNN architecture (TensorFlow pre-trained model).
Arch = ImageClassificationTrainer.Architecture.MobilenetV2,
Epoch = 50, //100
BatchSize = 10,
@@ -207,14 +207,14 @@ var pipeline = mlContext.MulticlassClassification.Trainers.ImageClassification(o
### 3. Train model
In order to begin the training process you run `Fit` on the built pipeline:
-```csharp
+```csharp
// 4. Train/create the ML model
ITransformer trainedModel = pipeline.Fit(trainDataView);
```
### 4. Evaluate model
-After the training, we evaluate the model's quality by using the test dataset.
+After the training, we evaluate the model's quality by using the test dataset.
The `Evaluate` function needs an `IDataView` with the predictions generated from the test dataset by calling `Transform()`.
@@ -242,9 +242,9 @@ You should proceed as follows in order to train a model your model:
#### GPU vs. CPU for consuming/scoring the model
-When consuming/scoring the model you can also choose between CPU/GPU, however, if using GPU you also need to make sure that the machine/server running the model supports a GPU.
+When consuming/scoring the model you can also choose between CPU and GPU; however, if using a GPU, you also need to make sure that the machine/server running the model supports a GPU.
-The way you set up the scoring/consumption project to use GPU is the same way explained at the begining of this readme.md by simply using one or the other redist library.
+The way you set up the scoring/consumption project to use the GPU is the same as explained at the beginning of this README, by simply using one or the other redist library.
#### Sample Console app for scoring
@@ -286,14 +286,14 @@ float maxScore = prediction.Score.Max();
Console.WriteLine($"Image Filename : [{imageToPredict.ImageFileName}], " +
$"Predicted Label : [{prediction.PredictedLabel}], " +
- $"Probability : [{maxScore}] "
+ $"Probability : [{maxScore}] "
);
```
-The prediction engine receives as parameter an object of type `InMemoryImageData` (containing 2 properties: `Image` and `ImageFileName`).
+The prediction engine receives as parameter an object of type `InMemoryImageData` (containing 2 properties: `Image` and `ImageFileName`).
The `ImageFileName` is not used by the model. You simply have it there so you can print the filename when showing the prediction. The prediction only uses the image's bits in the `byte[] Image` field.
-Then the model returns and object of type `ImagePrediction`, which holds the `PredictedLabel` and all the `Scores` for all the classes/types of images.
+Then the model returns an object of type `ImagePrediction`, which holds the `PredictedLabel` and the `Scores` for all the classes/types of images.
Since the `PredictedLabel` is already a string it'll be shown in the console.
Regarding the score for the predicted label, we just need to take the highest score, which is the probability of the predicted label.
@@ -307,7 +307,7 @@ You should proceed as follows in order to train a model your model:
#### Sample ASP.NET Core web app for scoring/inference
-In the sample's solution there's another project named *ImageClassification.WebApp* which is an ASP.NET Core web app that allows the user to submit an image through HTTP and score/predict with that in-memory image.
+In the sample's solution there's another project named *ImageClassification.WebApp* which is an ASP.NET Core web app that allows the user to submit an image through HTTP and score/predict with that in-memory image.
This sample also uses the `PredictionEnginePool` which is recommended for multi-threaded and scalable applications.
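A minimal registration sketch of the `PredictionEnginePool` mentioned above (assuming the `Microsoft.Extensions.ML` package; the model name and file path below are hypothetical examples):

```csharp
// In Startup.ConfigureServices of the ASP.NET Core app
// (requires the Microsoft.Extensions.ML NuGet package):
services.AddPredictionEnginePool<InMemoryImageData, ImagePrediction>()
    .FromFile(modelName: "FlowerModel", filePath: "MLModels/imageClassifier.zip");

// In a controller, inject the pool and score thread-safely:
// public MyController(PredictionEnginePool<InMemoryImageData, ImagePrediction> pool) { ... }
// var prediction = pool.Predict(modelName: "FlowerModel", example: imageData);
```

The pool keeps a set of `PredictionEngine` objects and hands them out per request, avoiding the cost and thread-safety issues of creating one per call.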
@@ -318,7 +318,7 @@ Below you can see a screenshot of the app:
# TensorFlow DNN Transfer Learning background information
-This sample app is retraining a TensorFlow model for image classification. As a user, you could think it is pretty similar to this other sample [Image classifier using the TensorFlow Estimator featurizer](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/DeepLearning_TensorFlowEstimator). However, the internal implementation is very different under the covers. In that mentioned sample, it is using a 'model composition approach' where an initial TensorFlow model (i.e. InceptionV3 or ResNet) is only used to featurize the images and produce the binary information per image to be used by another ML.NET classifier trainer added on top (such as `LbfgsMaximumEntropy`). Therefore, even when that sample is using a TensorFlow model, you are training only with a ML.NET trainer, you don't retrain a new TensorFlow model but train an ML.NET model. That's why the output of that sample is only an ML.NET model (.zip file).
+This sample app retrains a TensorFlow model for image classification. As a user, you might think it is pretty similar to this other sample, [Image classifier using the TensorFlow Estimator featurizer](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/DeepLearning_TensorFlowEstimator). However, the internal implementation is very different under the covers. That sample uses a 'model composition approach' where an initial TensorFlow model (e.g. InceptionV3 or ResNet) is only used to featurize the images and produce the binary information per image to be used by another ML.NET classifier trainer added on top (such as `LbfgsMaximumEntropy`). Therefore, even though that sample uses a TensorFlow model, you train only with an ML.NET trainer; you don't retrain a new TensorFlow model but train an ML.NET model. That's why the output of that sample is only an ML.NET model (.zip file).
In contrast, this sample is natively retraining a new TensorFlow model based on a Transfer Learning approach but training a new TensorFlow model derived from the specified pre-trained model (Inception V3 or ResNet).
@@ -330,7 +330,7 @@ In the screenshot below you can see how you can see that retrained TensorFlow mo

-**Benefits:**
+**Benefits:**
- **Train and inference using GPU:**
When using this native DNN approach based on TensorFlow you can either use the CPU or GPU (if available) for a better performance (less time needed for training and scoring).
diff --git a/samples/csharp/getting-started/LargeDatasets/LargeDatasets/Common/ConsoleHelper.cs b/samples/csharp/getting-started/LargeDatasets/LargeDatasets/Common/ConsoleHelper.cs
index 76db01b45..6e1a21f51 100644
--- a/samples/csharp/getting-started/LargeDatasets/LargeDatasets/Common/ConsoleHelper.cs
+++ b/samples/csharp/getting-started/LargeDatasets/LargeDatasets/Common/ConsoleHelper.cs
@@ -68,7 +68,7 @@ public static void PrintMultiClassClassificationMetrics(string name, MulticlassC
Console.WriteLine($" LogLoss for class 3 = {metrics.PerClassLogLoss[2]:0.####}, the closer to 0, the better");
Console.WriteLine($"************************************************************");
}
-
+
public static void PrintRegressionFoldsAverageMetrics(string algorithmName, IReadOnlyList> crossValidationResults)
{
var L1 = crossValidationResults.Select(r => r.Metrics.MeanAbsoluteError);
@@ -174,11 +174,11 @@ public static void PeekDataViewInConsole(MLContext mlContext, IDataView dataView
string msg = string.Format("Peek data in DataView: Showing {0} rows with the columns", numberOfRows.ToString());
ConsoleWriteHeader(msg);
- //https://github.com/dotnet/machinelearning/blob/master/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
+ //https://github.com/dotnet/machinelearning/blob/main/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
var transformer = pipeline.Fit(dataView);
var preparedData = transformer.Transform(dataView);
- // 'preparedData' is a 'promise' of data, lazy-loading. call Preview
+ // 'preparedData' is a 'promise' of data, lazy-loading. call Preview
//and iterate through the returned collection from preview.
var preViewpreparedData = preparedData.Preview(maxRows: numberOfRows);
@@ -194,7 +194,7 @@ public static void PeekDataViewInConsole(MLContext mlContext, IDataView dataView
Console.WriteLine(lineToPrint + "\n");
}
}
-
+
public static List PeekVectorColumnDataInConsole(MLContext mlContext, string columnName, IDataView dataView, IEstimator pipeline, int numberOfRows = 4)
{
string msg = string.Format("Peek data in DataView: : Show {0} rows with just the '{1}' column", numberOfRows, columnName );
@@ -212,7 +212,7 @@ public static List PeekVectorColumnDataInConsole(MLContext mlContext, s
String concatColumn = String.Empty;
foreach (float f in row)
{
- concatColumn += f.ToString();
+ concatColumn += f.ToString();
}
Console.WriteLine(concatColumn);
});
diff --git a/samples/csharp/getting-started/MatrixFactorization_MovieRecommendation/README.md b/samples/csharp/getting-started/MatrixFactorization_MovieRecommendation/README.md
index 561b6d5c1..d9540f225 100644
--- a/samples/csharp/getting-started/MatrixFactorization_MovieRecommendation/README.md
+++ b/samples/csharp/getting-started/MatrixFactorization_MovieRecommendation/README.md
@@ -4,21 +4,21 @@
|----------------|-------------------|-------------------------------|-------------|-----------|---------------------|---------------------------|-----------------------------|
| Microsoft.ML.Recommender Preview v0.16.0 | Dynamic API | Up-to-date | Console app | .csv files | Recommendation | Matrix Factorization | MatrixFactorizationTrainer|
-In this sample, you can see how to use ML.NET to build a movie recommendation engine.
+In this sample, you can see how to use ML.NET to build a movie recommendation engine.
## Problem
-For this tutorial we will use the MovieLens dataset which comes with movie ratings, titles, genres and more. In terms of an approach for building our movie recommendation engine we will use Factorization Machines which uses a collaborative filtering approach.
+For this tutorial we will use the MovieLens dataset, which comes with movie ratings, titles, genres and more. As the approach for building our movie recommendation engine, we will use Factorization Machines, which take a collaborative filtering approach.
-‘Collaborative filtering’ operates under the underlying assumption that if a person A has the same opinion as a person B on an issue, A is more likely to have B’s opinion on a different issue than that of a randomly chosen person.
+‘Collaborative filtering’ operates under the underlying assumption that if a person A has the same opinion as a person B on an issue, A is more likely to have B’s opinion on a different issue than that of a randomly chosen person.
-With ML.NET we support the following three recommendation scenarios, depending upon your scenario you can pick either of the three from the list below.
+With ML.NET we support the following three recommendation scenarios; depending on your scenario, you can pick one of the three from the list below.
| Scenario | Algorithm | Link To Sample
-| --- | --- | --- |
-| You have UserId, ProductId and Ratings available to you for what users bought and rated.| Matrix Factorization | This sample |
-| You only have UserId and ProductId's the user bought available to you but not ratings. This is common in datasets from online stores where you might only have access to purchase history for your customers. With this style of recommendation you can build a recommendation engine which recommends frequently bought items. | One Class Matrix Factorization | [Product Recommender](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/MatrixFactorization_ProductRecommendation) |
-| You want to use more attributes (features) beyond UserId, ProductId and Ratings like Product Description, Product Price etc. for your recommendation engine | Field Aware Factorization Machines | [Movie Recommender with Factorization Machines](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/end-to-end-apps/Recommendation-MovieRecommender/MovieRecommender_Model) |
+| --- | --- | --- |
+| You have UserId, ProductId and Ratings available to you for what users bought and rated.| Matrix Factorization | This sample |
+| You only have UserId and ProductId's the user bought available to you but not ratings. This is common in datasets from online stores where you might only have access to purchase history for your customers. With this style of recommendation you can build a recommendation engine which recommends frequently bought items. | One Class Matrix Factorization | [Product Recommender](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/MatrixFactorization_ProductRecommendation) |
+| You want to use more attributes (features) beyond UserId, ProductId and Ratings like Product Description, Product Price etc. for your recommendation engine | Field Aware Factorization Machines | [Movie Recommender with Factorization Machines](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/end-to-end-apps/Recommendation-MovieRecommender/MovieRecommender_Model) |
## DataSet
@@ -27,7 +27,7 @@ http://files.grouplens.org/datasets/movielens/ml-latest-small.zip
## Algorithm - [Matrix Factorization (Recommendation)](https://docs.microsoft.com/en-us/dotnet/machine-learning/resources/tasks#recommendation)
-The algorithm for this recommendation task is Matrix Factorization, which is a supervised machine learning algorithm performing collaborative filtering.
+The algorithm for this recommendation task is Matrix Factorization, which is a supervised machine learning algorithm performing collaborative filtering.
## Solution
@@ -37,29 +37,29 @@ To solve this problem, you build and train an ML model on existing training data
### 1. Build model
-Building a model includes:
+Building a model includes:
* Define the data schema mapped to the datasets to read (`recommendation-ratings-train.csv` and `recommendation-ratings-test.csv`) with a TextLoader
* Matrix Factorization requires the two features userId, movieId to be encoded
-* Matrix Factorization trainer then takes these two encoded features (userId, movieId) as input
+* Matrix Factorization trainer then takes these two encoded features (userId, movieId) as input
Here's the code which will be used to build the model:
```CSharp
-
- //STEP 1: Create MLContext to be shared across the model creation workflow objects
+
+ //STEP 1: Create MLContext to be shared across the model creation workflow objects
MLContext mlcontext = new MLContext();
- //STEP 2: Read the training data which will be used to train the movie recommendation model
+ //STEP 2: Read the training data which will be used to train the movie recommendation model
//The schema for training data is defined by type 'TInput' in LoadFromTextFile() method.
IDataView trainingDataView = mlcontext.Data.LoadFromTextFile<MovieRating>(TrainingDataLocation, hasHeader: true, separatorChar:',');
-//STEP 3: Transform your data by encoding the two features userId and movieID. These encoded features will be provided as
+//STEP 3: Transform your data by encoding the two features userId and movieID. These encoded features will be provided as input
// to our MatrixFactorizationTrainer.
var dataProcessingPipeline = mlcontext.Transforms.Conversion.MapValueToKey(outputColumnName: userIdEncoded, inputColumnName: nameof(MovieRating.userId))
                        .Append(mlcontext.Transforms.Conversion.MapValueToKey(outputColumnName: movieIdEncoded, inputColumnName: nameof(MovieRating.movieId)));
-
+
//Specify the options for MatrixFactorization trainer
MatrixFactorizationTrainer.Options options = new MatrixFactorizationTrainer.Options();
options.MatrixColumnIndexColumnName = userIdEncoded;
@@ -68,37 +68,37 @@ Here's the code which will be used to build the model:
options.NumberOfIterations = 20;
options.ApproximationRank = 100;
-//STEP 4: Create the training pipeline
+//STEP 4: Create the training pipeline
var trainingPipeLine = dataProcessingPipeline.Append(mlcontext.Recommendation().Trainers.MatrixFactorization(options));
```
### 2. Train model
-Training the model is a process of running the chosen algorithm on a training data (with known movie and user ratings) to tune the parameters of the model. It is implemented in the `Fit()` method from the Estimator object.
+Training the model is a process of running the chosen algorithm on training data (with known movie and user ratings) to tune the parameters of the model. It is implemented in the `Fit()` method of the Estimator object.
To perform training you need to call the `Fit()` method while providing the training dataset (`recommendation-ratings-train.csv` file) in a DataView object.
-```CSharp
+```CSharp
ITransformer model = trainingPipeLine.Fit(trainingDataView);
```
Note that ML.NET loads data lazily, so no data is actually loaded into memory until you call the `.Fit()` method.
### 3. Evaluate model
-We need this step to conclude how accurate our model operates on new data. To do so, the model from the previous step is run against another dataset that was not used in training (`recommendation-ratings-test.csv`).
+We need this step to measure how accurately our model operates on new data. To do so, the model from the previous step is run against another dataset that was not used in training (`recommendation-ratings-test.csv`).
`Evaluate()` compares the predicted values for the test dataset and produces various metrics, such as RMSE and R-squared, that you can explore.
-```CSharp
+```CSharp
Console.WriteLine("=============== Evaluating the model ===============");
-IDataView testDataView = mlcontext.Data.LoadFromTextFile(TestDataLocation, hasHeader: true);
+IDataView testDataView = mlcontext.Data.LoadFromTextFile<MovieRating>(TestDataLocation, hasHeader: true);
var prediction = model.Transform(testDataView);
var metrics = mlcontext.Regression.Evaluate(prediction, labelColumnName: "Label", scoreColumnName: "Score");
```
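For a rating-prediction task like this one, the regression metrics reported by `Evaluate()` boil down to comparing predicted and actual ratings. As a tiny, self-contained illustration (the numbers below are invented, not from the sample's dataset), here is RMSE, the headline metric:

```python
import math

# RMSE over (actual, predicted) rating pairs -- root mean squared error,
# the usual headline metric for a rating-prediction model.
actual    = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.2, 4.6, 2.4]

mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
rmse = math.sqrt(mse)
print(f"RMSE = {rmse:.3f}")  # lower is better, in the same units as the ratings
```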
### 4. Consume model
-After the model is trained, you can use the `Predict()` API to predict the rating for a particular movie/user combination.
-```CSharp
+After the model is trained, you can use the `Predict()` API to predict the rating for a particular movie/user combination.
+```CSharp
var predictionengine = mlcontext.Model.CreatePredictionEngine<MovieRating, MovieRatingPrediction>(model);
var movieratingprediction = predictionengine.Predict(
new MovieRating()
@@ -108,11 +108,11 @@ var movieratingprediction = predictionengine.Predict(
movieId = predictionmovieId
}
);
- Console.WriteLine("For userId:" + predictionuserId + " movie rating prediction (1 - 5 stars) for movie:" +
+ Console.WriteLine("For userId:" + predictionuserId + " movie rating prediction (1 - 5 stars) for movie:" +
movieService.Get(predictionmovieId).movieTitle + " is:" + Math.Round(movieratingprediction.Score,1));
-
+
```
-Please note this is one approach for performing movie recommendations with Matrix Factorization. There are other scenarios for recommendation as well which we will build samples for as well.
+Please note this is one approach to performing movie recommendations with Matrix Factorization. There are other recommendation scenarios for which we will build samples as well.
#### Score in Matrix Factorization
diff --git a/samples/csharp/getting-started/MatrixFactorization_ProductRecommendation/Readme.md b/samples/csharp/getting-started/MatrixFactorization_ProductRecommendation/Readme.md
index 23d698f0f..2c888ade7 100644
--- a/samples/csharp/getting-started/MatrixFactorization_ProductRecommendation/Readme.md
+++ b/samples/csharp/getting-started/MatrixFactorization_ProductRecommendation/Readme.md
@@ -6,24 +6,24 @@
In this sample, you can see how to use ML.NET to build a product recommendation scenario.
-The style of recommendation in this sample is based upon the co-purchase scenario or products frequently
+The style of recommendation in this sample is based upon the co-purchase scenario or products frequently
bought together, which means it will recommend to customers a set of products based upon their purchase order
-history.
+history.
-
+
-In this example, the highlighted products are being recommended based upon a frequently bought together learning model.
+In this example, the highlighted products are being recommended based upon a frequently bought together learning model.
## Problem
-For this tutorial we will use the Amazon product co-purchasing network dataset.
+For this tutorial we will use the Amazon product co-purchasing network dataset.
-In terms of an approach for building our product recommender we will use One-Class Factorization Machines which uses a collaborative filtering approach.
+In terms of an approach for building our product recommender, we will use One-Class Matrix Factorization, which uses a collaborative filtering approach.
The difference between one-class Matrix Factorization and the other approaches we covered is that in this dataset we only have information on purchase order history.
-We do not have ratings or other details like product description etc. available to us.
+We do not have ratings or other details like product description etc. available to us.
Matrix Factorization relies on ‘Collaborative filtering’, which operates under the assumption that if person A has the same opinion as person B on an issue, A is more likely to share B’s opinion on a different issue than the opinion of a randomly chosen person.
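The one-class twist can be sketched as follows (illustrative Python, not ML.NET's implementation; the co-purchase pairs, weights, and hyperparameters are invented for the example): observed co-purchases are treated as target 1, and every unobserved pair as a weakly weighted target 0, so the model can learn latent factors without any ratings at all.

```python
import numpy as np

# Sketch of one-class matrix factorization: positives are observed co-purchases,
# and every unobserved pair acts as a weak negative. Illustrative only; the
# numbers do not mirror ML.NET's trainer options exactly.
rng = np.random.default_rng(1)
co_purchases = {(0, 1), (0, 2), (1, 2), (2, 3)}   # (product, co-purchased product)
n, k = 4, 3
P = rng.normal(scale=0.1, size=(n, k))            # product (row) factors
Q = rng.normal(scale=0.1, size=(n, k))            # co-product (column) factors

lr, neg_weight = 0.1, 0.05            # unobserved cells get a small weight
for _ in range(500):
    for i in range(n):
        for j in range(n):
            target = 1.0 if (i, j) in co_purchases else 0.0
            weight = 1.0 if target else neg_weight
            err = target - P[i] @ Q[j]
            P[i] += lr * weight * err * Q[j]
            Q[j] += lr * weight * err * P[i]

score = lambda i, j: float(P[i] @ Q[j])   # higher score => more likely co-purchase
```

Ranking the other products by this score for a given product yields the "frequently bought together" recommendations.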
@@ -35,7 +35,7 @@ DataSet's Citation information can be found [here](/ProductRecommender/Data/DATA
## Algorithm - [Matrix Factorization (Recommendation)](https://docs.microsoft.com/en-us/dotnet/machine-learning/resources/tasks#recommendation)
-The algorithm for this recommendation task is Matrix Factorization, which is a supervised machine learning algorithm performing collaborative filtering.
+The algorithm for this recommendation task is Matrix Factorization, which is a supervised machine learning algorithm performing collaborative filtering.
## Solution
@@ -45,19 +45,19 @@ To solve this problem, you build and train an ML model on existing training data
### 1. Build model
-Building a model includes:
+Building a model includes:
-* Download and copy the dataset Amazon0302.txt file from https://snap.stanford.edu/data/amazon0302.html.
+* Download and copy the dataset Amazon0302.txt file from https://snap.stanford.edu/data/amazon0302.html.
* Replace the column names with only these instead: ProductID ProductID_Copurchased
* Given that the reader already provides a KeyRange and the product IDs are already encoded, all we need to do is
- call the MatrixFactorizationTrainer with a few extra parameters.
+ call the MatrixFactorizationTrainer with a few extra parameters.
Here's the code which will be used to build the model:
```CSharp
-
- //STEP 1: Create MLContext to be shared across the model creation workflow objects
+
+ //STEP 1: Create MLContext to be shared across the model creation workflow objects
MLContext mlContext = new MLContext();
//STEP 2: Read the trained data using TextLoader by defining the schema for reading the product co-purchase dataset
@@ -66,14 +66,14 @@ Here's the code which will be used to build the model:
columns: new[]
{
new TextLoader.Column("Label", DataKind.Single, 0),
- new TextLoader.Column(name:nameof(ProductEntry.ProductID), dataKind:DataKind.UInt32, source: new [] { new TextLoader.Range(0) }, keyCount: new KeyCount(262111)),
+ new TextLoader.Column(name:nameof(ProductEntry.ProductID), dataKind:DataKind.UInt32, source: new [] { new TextLoader.Range(0) }, keyCount: new KeyCount(262111)),
new TextLoader.Column(name:nameof(ProductEntry.CoPurchaseProductID), dataKind:DataKind.UInt32, source: new [] { new TextLoader.Range(1) }, keyCount: new KeyCount(262111))
},
hasHeader: true,
separatorChar: '\t');
//STEP 3: Your data is already encoded so all you need to do is specify options for MatrixFactorizationTrainer with a few extra hyperparameters
- // LossFunction, Alpa, Lambda and a few others like K and C as shown below and call the trainer.
+                 // LossFunction, Alpha, Lambda and a few others like K and C as shown below and call the trainer.
MatrixFactorizationTrainer.Options options = new MatrixFactorizationTrainer.Options();
options.MatrixColumnIndexColumnName = nameof(ProductEntry.ProductID);
options.MatrixRowIndexColumnName = nameof(ProductEntry.CoPurchaseProductID);
@@ -89,11 +89,11 @@ Here's the code which will be used to build the model:
var est = mlContext.Recommendation().Trainers.MatrixFactorization(options);
```
-### 2. Train Model
+### 2. Train Model
-Once the estimator has been defined, you can train the estimator on the training data available to us.
+Once the estimator has been defined, you can train it on the available training data.
-This will return a trained model.
+This will return a trained model.
```CSharp
@@ -102,11 +102,11 @@ This will return a trained model.
ITransformer model = est.Fit(traindata);
```
-### 3. Consume Model
+### 3. Consume Model
We will perform predictions for this model by creating a prediction engine/function as shown below.
-The prediction engine creation takes in as input the following two classes.
+The prediction engine takes the following two classes as input.
```CSharp
public class Copurchase_prediction
@@ -124,11 +124,11 @@ The prediction engine creation takes in as input the following two classes.
}
```
-Once the prediction engine has been created you can predict scores of two products being co-purchased.
+Once the prediction engine has been created you can predict scores of two products being co-purchased.
```CSharp
//STEP 6: Create prediction engine and predict the score for Product 63 being co-purchased with Product 3.
- // The higher the score the higher the probability for this particular productID being co-purchased
+ // The higher the score the higher the probability for this particular productID being co-purchased
var predictionengine = mlContext.Model.CreatePredictionEngine<ProductEntry, Copurchase_prediction>(model);
var prediction = predictionengine.Predict(
new ProductEntry()
diff --git a/samples/fsharp/common/ConsoleHelper.fs b/samples/fsharp/common/ConsoleHelper.fs
index e2fd0eb0f..ff5269d1e 100644
--- a/samples/fsharp/common/ConsoleHelper.fs
+++ b/samples/fsharp/common/ConsoleHelper.fs
@@ -28,7 +28,7 @@ module ConsoleHelper =
printfn "* Squared loss: %.2f" metrics.MeanSquaredError
printfn "* RMS loss: %.2f" metrics.RootMeanSquaredError
printfn "*************************************************"
-
+
let printBinaryClassificationMetrics name (metrics : CalibratedBinaryClassificationMetrics) =
printfn"************************************************************"
printfn"* Metrics for %s binary classification model " name
@@ -37,7 +37,7 @@ module ConsoleHelper =
printfn"* Area Under Curve: %.2f%%" (metrics.AreaUnderRocCurve * 100.)
printfn"* Area under Precision recall Curve: %.2f%%" (metrics.AreaUnderPrecisionRecallCurve * 100.)
printfn"* F1Score: %.2f%%" (metrics.F1Score * 100.)
-
+
printfn"* LogLoss: %.2f%%" (metrics.LogLoss)
printfn"* LogLossReduction: %.2f%%" (metrics.LogLossReduction)
printfn"* PositivePrecision: %.2f" (metrics.PositivePrecision)
@@ -70,7 +70,7 @@ module ConsoleHelper =
confidenceInterval95
let printMulticlassClassificationFoldsAverageMetrics algorithmName (crossValResults : TrainCatalogBase.CrossValidationResult[]) =
-
+
let metricsInMultipleFolds = crossValResults |> Array.map(fun r -> r.Metrics)
let microAccuracyValues = metricsInMultipleFolds |> Array.map(fun m -> m.MicroAccuracy)
@@ -126,19 +126,19 @@ module ConsoleHelper =
let peekDataViewInConsole<'TObservation when 'TObservation : (new : unit -> 'TObservation) and 'TObservation : not struct> (mlContext : MLContext) (dataView : IDataView) (pipeline : IEstimator<ITransformer>) numberOfRows =
-
+
let msg = sprintf "Peek data in DataView: Showing %d rows with the columns" numberOfRows
consoleWriteHeader msg
- //https://github.com/dotnet/machinelearning/blob/master/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
+ //https://github.com/dotnet/machinelearning/blob/main/docs/code/MlNetCookBook.md#how-do-i-look-at-the-intermediate-data
let transformer = pipeline.Fit dataView
let transformedData = transformer.Transform dataView
- // 'transformedData' is a 'promise' of data, lazy-loading. call Preview
+ // 'transformedData' is a 'promise' of data, lazy-loading. call Preview
//and iterate through the returned collection from preview.
transformedData.Preview(numberOfRows).RowView
- |> Seq.iter
+ |> Seq.iter
(fun row ->
row.Values
|> Array.map (function KeyValue(k,v) -> sprintf "| %s:%O" k v)
@@ -155,21 +155,21 @@ module ConsoleHelper =
let transformedData = transformer.Transform dataView
// Extract the 'Features' column.
- let someColumnData =
+ let someColumnData =
transformedData.GetColumn(columnName)
|> Seq.take numberOfRows
|> Seq.toList
// print to console the peeked rows
someColumnData
- |> List.iter(fun row ->
- let concatColumn =
+ |> List.iter(fun row ->
+ let concatColumn =
row
|> Array.map string
|> String.concat ""
printfn "%s" concatColumn
)
-
+
someColumnData;
let consoleWriterSection (lines : string array) =
@@ -182,7 +182,7 @@ module ConsoleHelper =
let maxLength = lines |> Array.map(fun x -> x.Length) |> Array.max
printfn "%s" (new string('-', maxLength))
Console.ForegroundColor <- defaultColor
-
+
let consolePressAnyKey () =
let defaultColor = Console.ForegroundColor
Console.ForegroundColor <- ConsoleColor.Green
@@ -190,7 +190,7 @@ module ConsoleHelper =
printfn "Press any key to finish."
Console.ForegroundColor <- defaultColor
Console.ReadKey() |> ignore
-
+
let consoleWriteException (lines : string array) =
let defaultColor = Console.ForegroundColor
Console.ForegroundColor <- ConsoleColor.Red
diff --git a/samples/fsharp/getting-started/DeepLearning_ImageClassification_Training/README.md b/samples/fsharp/getting-started/DeepLearning_ImageClassification_Training/README.md
index 618ec0b3c..898c5f1d5 100644
--- a/samples/fsharp/getting-started/DeepLearning_ImageClassification_Training/README.md
+++ b/samples/fsharp/getting-started/DeepLearning_ImageClassification_Training/README.md
@@ -4,19 +4,19 @@
|----------------|-------------------|-------------------------------|-------------|-----------|---------------------|---------------------------|-----------------------------|
| Microsoft.ML 1.4 | Dynamic API | Up-to-date | Console app | Image files | Image classification | Image classification with TensorFlow model retrain based on transfer learning | DNN architecture: ResNet |
-## Problem
+## Problem
Image classification is a common Deep Learning problem. This sample shows how to create your own custom image classifier by training your model with the transfer learning approach, which is basically retraining a pre-trained model (an architecture such as InceptionV3 or ResNet) so you get a custom model trained on your own images.
In this sample app you create your own custom image classifier model by natively training a TensorFlow model with the ML.NET API on your own images.
-*Image classifier scenario – Train your own custom deep learning model with ML.NET*
+*Image classifier scenario – Train your own custom deep learning model with ML.NET*

## Dataset (Images)
> *Image set license*
>
-> This sample's dataset is based on the 'flower_photos imageset' available from Tensorflow at [this URL](http://download.tensorflow.org/example_images/flower_photos.tgz).
+> This sample's dataset is based on the 'flower_photos imageset' available from Tensorflow at [this URL](http://download.tensorflow.org/example_images/flower_photos.tgz).
> All images in this archive are licensed under the Creative Commons By-Attribution License, available at:
https://creativecommons.org/licenses/by/2.0/
>
@@ -24,7 +24,7 @@ https://creativecommons.org/licenses/by/2.0/
The image set downloaded by default has 200 images evenly distributed across 5 flower classes:
- Images --> flower_photos_small_set -->
+ Images --> flower_photos_small_set -->
|
daisy
|
@@ -36,7 +36,7 @@ The by default imageset being downloaded by the sample has 200 images evenly dis
|
tulips
-The name of each sub-folder is important because that'll be the name of each class/label the model is going to use to classify the images.
+The name of each sub-folder is important because that'll be the name of each class/label the model is going to use to classify the images.
## ML Task - Image Classification
@@ -63,7 +63,7 @@ Building the model includes the following steps:
* Loading the image files (file paths in this case) into an `IDataView`
* Image classification using the ImageClassification estimator (high level API)
-Define the schema of your data as Records. `ImageData` is the original data format and `ImagePrediction` is the output generated by the trained model which contains the original properties of the `ImageData` record as well as the `PredictedLabel`.
+Define the schema of your data as Records. `ImageData` is the original data format and `ImagePrediction` is the output generated by the trained model which contains the original properties of the `ImageData` record as well as the `PredictedLabel`.
```fsharp
// Define input and output schema
@@ -109,23 +109,23 @@ Before the model is trained, the data has to be preprocessed. The `ImageClassifi
```fsharp
// Define preprocessing pipeline
-let preprocessingPipeline =
+let preprocessingPipeline =
EstimatorChain()
.Append(mlContext.Transforms.Conversion.MapValueToKey("LabelAsKey","Label"))
.Append(mlContext.Transforms.LoadRawImageBytes("Image", null, "ImagePath"))
// Preprocess the data
-let preprocessedData =
- let processingTransformer = data |> preprocessingPipeline.Fit
+let preprocessedData =
+ let processingTransformer = data |> preprocessingPipeline.Fit
data |> processingTransformer.Transform
```
Now, using the preprocessed data, let's split the dataset into three datasets: one for training, a second to tune model performance, and finally a test set to make predictions.
```fsharp
-let train, validation, test =
+let train, validation, test =
preprocessedData
- |> ( fun originalData ->
+ |> ( fun originalData ->
let trainValSplit = mlContext.Data.TrainTestSplit(originalData, testFraction=0.7)
let testValSplit = mlContext.Data.TrainTestSplit(trainValSplit.TestSet)
(trainValSplit.TrainSet, testValSplit.TrainSet, testValSplit.TestSet))
@@ -145,7 +145,7 @@ classifierOptions.Arch <- ImageClassificationTrainer.Architecture.ResnetV2101
classifierOptions.MetricsCallback <- Action(fun x -> printfn "%s" (x.ToString()))
// Define model training pipeline
-let trainingPipeline =
+let trainingPipeline =
EstimatorChain()
.Append(mlContext.MulticlassClassification.Trainers.ImageClassification(classifierOptions))
.Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel","LabelAsKey"))
@@ -157,10 +157,10 @@ The `mlContext.MulticlassClassification.Trainers.ImageClassification` high level
In order to begin the training process you run `Fit` on the built pipeline:
-```fsharp
+```fsharp
// Train the model
-let trainedModel =
- train
+let trainedModel =
+ train
|> trainingPipeline.Fit
```
@@ -191,16 +191,16 @@ To get a sense of the predictions being made, create an `IEnumerable` from the `
```fsharp
mlContext.Data.CreateEnumerable<ImagePrediction>(predictions, reuseRowObject=true)
|> Seq.take 5
-|> Seq.iter(fun prediction -> printfn "Original: %s | Predicted: %s" prediction.Label prediction.PredictedLabel)
+|> Seq.iter(fun prediction -> printfn "Original: %s | Predicted: %s" prediction.Label prediction.PredictedLabel)
```
## TensorFlow DNN Transfer Learning background information
-This sample app is retraining a TensorFlow model for image classification. As a user, you could think it is pretty similar to this other sample [Image classifier using the TensorFlow Estimator featurizer](https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/DeepLearning_TensorFlowEstimator). However, the internal implementation is very different under the covers. In that mentioned sample, it is using a 'model composition approach' where an initial TensorFlow model (i.e. InceptionV3 or ResNet) is only used to featurize the images and produce the binary information per image to be used by another ML.NET classifier trainer added on top (such as `LbfgsMaximumEntropy`). Therefore, even when that sample is using a TensorFlow model, you are training only with a ML.NET trainer, you don't retrain a new TensorFlow model but train an ML.NET model. That's why the output of that sample is only an ML.NET model (.zip file).
+This sample app is retraining a TensorFlow model for image classification. As a user, you might think it is pretty similar to this other sample [Image classifier using the TensorFlow Estimator featurizer](https://github.com/dotnet/machinelearning-samples/tree/main/samples/csharp/getting-started/DeepLearning_TensorFlowEstimator). However, the internal implementation is very different under the covers. That sample uses a 'model composition approach' where an initial TensorFlow model (e.g. InceptionV3 or ResNet) is only used to featurize the images and produce the binary information per image to be used by an ML.NET classifier trainer added on top (such as `LbfgsMaximumEntropy`). Therefore, even though that sample uses a TensorFlow model, you train only with an ML.NET trainer: you don't retrain a new TensorFlow model but train an ML.NET model. That's why the output of that sample is only an ML.NET model (.zip file).
In contrast, this sample natively retrains a TensorFlow model based on a transfer learning approach, producing a new TensorFlow model derived from the specified pre-trained model (Inception V3 or ResNet).
-**Benefits:**
+**Benefits:**
- **Train and inference using GPU:**
When using this native DNN approach based on TensorFlow you can either use the CPU or GPU (if available) for a better performance (less time needed for training and scoring).
diff --git a/samples/fsharp/getting-started/MatrixFactorization_ProductRecommendation/Readme.md b/samples/fsharp/getting-started/MatrixFactorization_ProductRecommendation/Readme.md
index 16ae0c1b1..8a3d6cdad 100644
--- a/samples/fsharp/getting-started/MatrixFactorization_ProductRecommendation/Readme.md
+++ b/samples/fsharp/getting-started/MatrixFactorization_ProductRecommendation/Readme.md
@@ -6,24 +6,24 @@
In this sample, you can see how to use ML.NET to build a product recommendation scenario.
-The style of recommendation in this sample is based upon the co-purchase scenario or products frequently
+The style of recommendation in this sample is based upon the co-purchase scenario or products frequently
bought together, which means it will recommend to customers a set of products based upon their purchase order
-history.
+history.
-
+
-In this example, the highlighted products are being recommended based upon a frequently bought together learning model.
+In this example, the highlighted products are being recommended based upon a frequently bought together learning model.
## Problem
-For this tutorial we will use the Amazon product co-purchasing network dataset.
+For this tutorial we will use the Amazon product co-purchasing network dataset.
-In terms of an approach for building our product recommender we will use One-Class Factorization Machines which uses a collaborative filtering approach.
+In terms of an approach for building our product recommender, we will use One-Class Matrix Factorization, which uses a collaborative filtering approach.
The difference between one-class Matrix Factorization and the other approaches we covered is that in this dataset we only have information on purchase order history.
-We do not have ratings or other details like product description etc. available to us.
+We do not have ratings or other details like product description etc. available to us.
Matrix Factorization relies on ‘Collaborative filtering’, which operates under the assumption that if person A has the same opinion as person B on an issue, A is more likely to share B’s opinion on a different issue than the opinion of a randomly chosen person.
@@ -34,7 +34,7 @@ https://snap.stanford.edu/data/amazon0302.html
## ML task - [Matrix Factorization (Recommendation)](https://docs.microsoft.com/en-us/dotnet/machine-learning/resources/tasks#recommendation)
-The ML Task for this sample is Matrix Factorization, which is a supervised machine learning task performing collaborative filtering.
+The ML Task for this sample is Matrix Factorization, which is a supervised machine learning task performing collaborative filtering.
## Solution
@@ -44,34 +44,34 @@ To solve this problem, you build and train an ML model on existing training data
### 1. Build model
-Building a model includes:
+Building a model includes:
-* Download and copy the dataset Amazon0302.txt file from https://snap.stanford.edu/data/amazon0302.html.
+* Download and copy the dataset Amazon0302.txt file from https://snap.stanford.edu/data/amazon0302.html.
* Replace the column names with only these instead: ProductID ProductID_Copurchased
* Given that the reader already provides a KeyRange and the product IDs are already encoded, all we need to do is
- call the MatrixFactorizationTrainer with a few extra parameters.
+ call the MatrixFactorizationTrainer with a few extra parameters.
Here's the code which will be used to build the model:
```fsharp
-//STEP 1: Create MLContext to be shared across the model creation workflow objects
+//STEP 1: Create MLContext to be shared across the model creation workflow objects
let mlContext = new MLContext()
//STEP 2: Read the trained data using TextLoader by defining the schema for reading the product co-purchase dataset
// Do remember to replace amazon0302.txt with dataset from https://snap.stanford.edu/data/amazon0302.html
-let traindata =
- let columns =
+let traindata =
+ let columns =
[|
TextLoader.Column("Label", DataKind.Single, 0)
- TextLoader.Column("ProductID", DataKind.UInt32, source = [|TextLoader.Range(0)|], keyCount = KeyCount 262111UL)
- TextLoader.Column("CoPurchaseProductID", DataKind.UInt32, source = [|TextLoader.Range(1)|], keyCount = KeyCount 262111UL)
+ TextLoader.Column("ProductID", DataKind.UInt32, source = [|TextLoader.Range(0)|], keyCount = KeyCount 262111UL)
+ TextLoader.Column("CoPurchaseProductID", DataKind.UInt32, source = [|TextLoader.Range(1)|], keyCount = KeyCount 262111UL)
|]
mlContext.Data.LoadFromTextFile(trainDataPath, columns, hasHeader=true, separatorChar='\t')
//STEP 3: Your data is already encoded so all you need to do is specify options for MatrixFactorizationTrainer with a few extra hyperparameters
-// LossFunction, Alpa, Lambda and a few others like K and C as shown below and call the trainer.
-let options = MatrixFactorizationTrainer.Options(MatrixColumnIndexColumnName = "ProductID",
+// LossFunction, Alpha, Lambda and a few others like K and C as shown below and call the trainer.
+let options = MatrixFactorizationTrainer.Options(MatrixColumnIndexColumnName = "ProductID",
MatrixRowIndexColumnName = "CoPurchaseProductID",
LossFunction = MatrixFactorizationTrainer.LossFunctionType.SquareLossOneClass,
LabelColumnName = "Label",
@@ -86,11 +86,11 @@ let options = MatrixFactorizationTrainer.Options(MatrixColumnIndexColumnName = "
let est = mlContext.Recommendation().Trainers.MatrixFactorization(options)
```
-### 2. Train Model
+### 2. Train Model
-Once the estimator has been defined, you can train the estimator on the training data available to us.
+Once the estimator has been defined, you can train it on the available training data.
-This will return a trained model.
+This will return a trained model.
```fsharp
//STEP 5: Train the model fitting to the DataSet
@@ -98,15 +98,15 @@ This will return a trained model.
let model = est.Fit(traindata)
```
-### 3. Consume Model
+### 3. Consume Model
We will perform predictions for this model by creating a prediction engine/function as shown below.
-The prediction engine creation takes in as input the following two classes.
+The prediction engine takes the following two classes as input.
```fsharp
[<CLIMutable>]
-type ProductEntry =
+type ProductEntry =
{
[<KeyType(count=262111UL)>]
ProductID : uint32
@@ -115,16 +115,16 @@ type ProductEntry =
[<NoColumn>]
Label : float32
}
-
+
[<CLIMutable>]
type Prediction = {Score : float32}
```
-Once the prediction engine has been created you can predict scores of two products being co-purchased.
+Once the prediction engine has been created you can predict scores of two products being co-purchased.
```fsharp
//STEP 6: Create prediction engine and predict the score for Product 63 being co-purchased with Product 3.
-// The higher the score the higher the probability for this particular productID being co-purchased
+// The higher the score the higher the probability for this particular productID being co-purchased
let predictionengine = mlContext.Model.CreatePredictionEngine<ProductEntry, Prediction>(model)
let prediction = predictionengine.Predict {ProductID = 3u; CoPurchaseProductID = 63u; Label = 0.f}
```