CS Jou Blog: July 2018

Tuesday, July 17, 2018

ML.NET Tutorial 1-2-3, Translation 1

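README (via Google Translate)
1-1
# Sentiment Analysis for User Reviews
In this introductory sample, you'll see how to use [ML.NET](https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet) to predict the sentiment (positive or negative) of customer reviews.
In the world of machine learning, this type of prediction is known as **binary classification**.
## Problem
This problem is centered on predicting whether a customer's review has positive or negative sentiment. We will use IMDB (Internet Movie Database) and Yelp (a crowd-sourced review platform) reviews that were processed by humans, with a label assigned to each review:
* 0 - negative
* 1 - positive

Using these datasets, we will build a model that analyzes a string and predicts a sentiment value of 0 or 1.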

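## ML task - Binary classification
The generalized problem of **binary classification** is to classify items into one of two classes (classifying items into more than two classes is called **multiclass classification**). For example:

* Predicting whether an insurance claim is valid.
* Predicting whether a plane will be delayed or arrive on time.
* Predicting whether a face ID (photo) belongs to the owner of a device.

The common feature of all these examples is that the parameter we want to predict can take only one of two values. In other words, the value can be represented by the `boolean` type.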

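## Solution
To solve this problem, we first build an ML model. Then we train the model on existing data, evaluate how good it is, and finally consume the model to predict the sentiment of new reviews.

![Build -> Train -> Evaluate -> Consume](https://github.com/dotnet/machinelearning-samples/raw/master/samples/getting-started/shared_content/modelpipeline.png)

### 1. Build model

Building a model includes: loading the data (`sentiment-imdb-train.txt` with `TextLoader`), transforming the data so that it can be used effectively by an ML algorithm (with `TextFeaturizer`), and choosing a learning algorithm (`FastTreeBinaryClassifier`). All of these steps are stored in a `LearningPipeline`: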
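```CSHARP
// The LearningPipeline class holds all steps of the learning process: data, transforms, learners.
var pipeline = new LearningPipeline();
// TextLoader loads the dataset. The schema of the dataset is specified by passing a class
// that contains all the column names and their types.
pipeline.Add(new TextLoader(TrainDataPath).CreateFrom<SentimentData>());
// TextFeaturizer is a transform that featurizes an input column to format and clean the data.
pipeline.Add(new TextFeaturizer("Features", "SentimentText"));
// FastTreeBinaryClassifier is the algorithm used to train the model.
// It has three hyperparameters for tuning decision tree performance.
pipeline.Add(new FastTreeBinaryClassifier() { NumLeaves = 5, NumTrees = 5, MinDocumentsInLeafs = 2 });
```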
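To give a rough idea of what "featurizing" a text column means, here is a minimal bag-of-words sketch in plain JavaScript (the language used in the other post on this blog). It is only an illustration: it is not how `TextFeaturizer` is implemented, and the helper names `tokenize`, `buildVocabulary` and `featurize` are hypothetical.

```javascript
// Hypothetical helpers. This is NOT ML.NET's TextFeaturizer, only a
// bag-of-words illustration of turning text into a numeric vector.

// Lowercase a review and split it into word tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9']+/g) || [];
}

// Build a vocabulary (word -> column index) from the training reviews.
function buildVocabulary(reviews) {
  const vocab = new Map();
  for (const review of reviews) {
    for (const word of tokenize(review)) {
      if (!vocab.has(word)) vocab.set(word, vocab.size);
    }
  }
  return vocab;
}

// Turn one review into a fixed-length vector of word counts.
function featurize(text, vocab) {
  const features = new Array(vocab.size).fill(0);
  for (const word of tokenize(text)) {
    const index = vocab.get(word);
    if (index !== undefined) features[index] += 1; // words not in the vocabulary are ignored
  }
  return features;
}

const vocab = buildVocabulary(["a great film", "a very bad film"]);
console.log(featurize("great great film", vocab));
```

A learner such as a decision tree can then work on these numeric columns instead of on raw strings.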
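### 2. Train model
Training the model is the process of running the chosen algorithm on training data (with known sentiment values) to tune the model's parameters. It is implemented in the `Train()` API. To perform training, we just call the method and provide the types for our data object `SentimentData` and our prediction object `SentimentPrediction`.
```CSHARP
var model = pipeline.Train<SentimentData, SentimentPrediction>();
```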
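### 3. Evaluate model
We need this step to determine how accurate our model is on new data. To do so, the model from the previous step is run against another dataset that was not used in training (`sentiment-yelp-test.txt`). This dataset also contains known sentiments. `BinaryClassificationEvaluator` calculates the difference between the known sentiment values and the values predicted by the model, expressed in various metrics.
```CSHARP
var testData = new TextLoader(TestDataPath).CreateFrom<SentimentData>();

var evaluator = new BinaryClassificationEvaluator();
var metrics = evaluator.Evaluate(model, testData);
```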
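> *To learn more about how to understand the metrics, check out the machine learning glossary in the [ML.NET Guide](https://docs.microsoft.com/en-us/dotnet/machine-learning/) or use any available materials on data science and machine learning*.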
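
To make the idea of a metric concrete, here is a minimal sketch in plain JavaScript (not ML.NET) of how three common binary classification metrics can be computed from known labels and predicted labels. The function name `evaluateBinary` is hypothetical, and the set of metrics ML.NET's evaluator reports is its own; this sketch only covers accuracy, precision and recall.

```javascript
// Hypothetical helper: compares known labels with predicted labels (both 0 or 1).
function evaluateBinary(actual, predicted) {
  if (actual.length !== predicted.length) {
    throw new Error("actual and predicted must have the same length");
  }
  let tp = 0, tn = 0, fp = 0, fn = 0;
  for (let i = 0; i < actual.length; i++) {
    if (predicted[i] === 1 && actual[i] === 1) tp++;      // true positive
    else if (predicted[i] === 0 && actual[i] === 0) tn++; // true negative
    else if (predicted[i] === 1 && actual[i] === 0) fp++; // false positive
    else fn++;                                            // false negative
  }
  const total = actual.length;
  return {
    accuracy: total > 0 ? (tp + tn) / total : 0,      // share of correct predictions
    precision: tp + fp > 0 ? tp / (tp + fp) : 0,      // how many predicted positives were right
    recall: tp + fn > 0 ? tp / (tp + fn) : 0,         // how many real positives were found
  };
}

const metrics = evaluateBinary([1, 0, 1, 1], [1, 0, 0, 1]);
console.log(metrics);
```

Here the first array holds the known sentiments of four test reviews and the second holds the model's predictions for them.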

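If you are not satisfied with the quality of the model, there are a variety of ways to improve it, which are covered in the *examples* category.

> *Keep in mind that for this sample the quality is lower than it could be, because the datasets were reduced in size for performance purposes. You can use larger labeled sentiment datasets available online to significantly improve the quality.*

### 4. Consume model
After the model is trained, we can use the `Predict()` API to predict the sentiment of new reviews.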

```CSHARP
var predictions = model.Predict(TestSentimentData.Sentiments);
```
where `TestSentimentData.Sentiments` contains the new user reviews we want to analyze.

```CSHARP
internal static readonly IEnumerable<SentimentData> Sentiments = new[]
{
    new SentimentData
    {
        SentimentText = "Contoso's 11 is a wonderful experience",
        Sentiment = 0
    },
    new SentimentData
    {
        SentimentText = "The acting in this movie is very bad",
        Sentiment = 0
    },
    new SentimentData
    {
        SentimentText = "Joe versus the Volcano Coffee Company is a great film.",
        Sentiment = 0
    }
};
```

ML.NET Tutorials 1 2 3

ASUS X450J @ Windows 10 x64 + Visual Studio 2017
https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet/get-started/windows
1. Hello ML.NET
1-1 cmd
1-2 dotnet new console -o myApp
1-3 cd myApp
1-4 dotnet add package Microsoft.ML --version 0.3.0
1-6 Program.cs
using System;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Runtime.Api;
using Microsoft.ML.Trainers;
using Microsoft.ML.Transforms;

namespace mlex1
{
    class Program
    {
        // STEP 1: Define your data structures

        // IrisData is used to provide training data, and as
        // input for prediction operations
        // - First 4 properties are inputs/features used to predict the label
        // - Label is what you are predicting, and is only set when training
        public class IrisData
        {
            [Column("0")]
            public float SepalLength;

            [Column("1")]
            public float SepalWidth;

            [Column("2")]
            public float PetalLength;

            [Column("3")]
            public float PetalWidth;

            [Column("4")]
            [ColumnName("Label")]
            public string Label;
        }

        // IrisPrediction is the result returned from prediction operations
        public class IrisPrediction
        {
            [ColumnName("PredictedLabel")]
            public string PredictedLabels;
        }

        static void Main(string[] args)
        {
            // STEP 2: Create a pipeline and load your data
            var pipeline = new LearningPipeline();

            // If working in Visual Studio, make sure the 'Copy to Output Directory'
            // property of iris-data.txt is set to 'Copy always'
            string dataPath = "iris-data.txt";
            pipeline.Add(new TextLoader(dataPath).CreateFrom<IrisData>(separator: ','));

            // STEP 3: Transform your data
            // Assign numeric values to text in the "Label" column, because only
            // numbers can be processed during model training
            pipeline.Add(new Dictionarizer("Label"));

            // Puts all features into a vector
            pipeline.Add(new ColumnConcatenator("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth"));

            // STEP 4: Add learner
            // Add a learning algorithm to the pipeline.
            // This is a classification scenario (What type of iris is this?)
            pipeline.Add(new StochasticDualCoordinateAscentClassifier());

            // Convert the Label back into original text (after converting to number in step 3)
            pipeline.Add(new PredictedLabelColumnOriginalValueConverter() { PredictedLabelColumn = "PredictedLabel" });

            // STEP 5: Train your model based on the data set
            var model = pipeline.Train<IrisData, IrisPrediction>();

            // STEP 6: Use your model to make a prediction
            // You can change these numbers to test different predictions
            var prediction = model.Predict(new IrisData()
            {
                SepalLength = 3.3f,
                SepalWidth = 1.6f,
                PetalLength = 0.2f,
                PetalWidth = 5.1f,
            });

            Console.WriteLine($"Predicted flower type is: {prediction.PredictedLabels}");
        }
    }
}

1-7 Download iris-data.txt

1-8

Ref:https://docs.microsoft.com/zh-tw/dotnet/machine-learning/tutorials/sentiment-analysis

2. Sentiment analysis binary classification sample

https://github.com/dotnet/machinelearning-samples

2-1 download zip
2-2 unzip
2-3 copy datasets folder into project
2-4 run

3. Clustering_iris
3-1 Download (as previous case)
3-2 copy datasets folder into project
3-3 run

Wednesday, July 11, 2018

Javascript 3D + Audio - 1

Integrating p5.sound with A-Frame (aframe.io)
Ref: cdn p5.js and p5.sound.js
cdn : https://cdnjs.com/libraries/p5.js/
Ref: a-frame https://aframe.io/docs/0.8.0/introduction/
1. p5.js example
1-1
<script src="https://cdnjs.cloudflare.com/ajax/libs/p5.js/0.6.1/p5.js"></script>
1-2 Initialization
1-2-1 function setup(): environment setup
1-2-2 function draw(): redraws the frame periodically
1-3 p5.sound initialization
<script src="https://cdnjs.cloudflare.com/ajax/libs/p5.js/0.6.1/addons/p5.sound.min.js"></script>
1-3-1 function preload(): loads the mp3 file
1-3-2 Use p5.js setup() and draw() to keep the visuals in sync with the sound
1-4 A-Frame 3D framework
<script src="https://aframe.io/releases/0.8.0/aframe.min.js"></script>
1-4-1 3D scene
Build the <a-scene> scene inside <body>
1-4-2 In draw(), read the amplitude and update the 3D attributes to match

1-5 Example
<html>
<head>
<script src="https://cdnjs.cloudflare.com/ajax/libs/p5.js/0.6.1/p5.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/p5.js/0.6.1/addons/p5.sound.min.js"></script>
<script src="https://aframe.io/releases/0.8.0/aframe.min.js"></script>
<script>
// Declared up front so preload(), setup() and draw() share them explicitly
var sound;
var amplitude;

function preload() {
  // Load the mp3 file before setup() runs
  sound = loadSound('assets/aa1.mp3');
}

function setup() {
  amplitude = new p5.Amplitude();

  // Start / stop the sound when the page is clicked
  document.addEventListener("click", function () {
    if (sound.isPlaying()) {
      sound.stop();
    } else {
      sound.play();
    }
  });
}

function draw() {
  // Read the current volume level and scale the sphere with it
  var level = amplitude.getLevel();
  var sceneEl = document.querySelector('a-scene');
  var sphere1 = sceneEl.querySelector('#sphere1');
  sphere1.setAttribute('radius', level * 20);
}
</script>
</head>
<body>

<a-scene>
<a-box color="#4CC3D9" position="-1 0.5 -3" rotation="0 45 0"></a-box>
<a-sphere color="#EF2D5E" id="sphere1" position="0 1.25 -5" radius="1.25"></a-sphere>
<a-cylinder color="#FFC65D" height="1.5" position="1 0.75 -3" radius="0.5"></a-cylinder>
<a-plane color="#7BC8A4" height="4" position="0 0 -4" rotation="-90 0 0" width="4"></a-plane>
<a-sky color="#ECECEC"></a-sky>
</a-scene>
</body>
</html>
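
In the example, draw() sets the radius to `level * 20`, so the sphere shrinks to nothing during silence and can jump sharply between frames. As a possible refinement, here is a minimal sketch of clamping the level into a radius range and smoothing it over time. The helpers `levelToRadius` and `smooth` are hypothetical (they are not part of p5.js or A-Frame), and the sketch assumes the amplitude level lies between 0 and 1.

```javascript
// Hypothetical helpers, not part of p5.js or A-Frame.

// Map an amplitude level (assumed 0..1) to a radius between minRadius and maxRadius.
function levelToRadius(level, minRadius, maxRadius) {
  const clamped = Math.min(1, Math.max(0, level)); // guard against out-of-range input
  return minRadius + clamped * (maxRadius - minRadius);
}

// Exponential smoothing: move a fraction of the way from the previous value to the target.
function smooth(previous, target, factor) {
  return previous + (target - previous) * factor;
}

// Simulate three frames with different amplitude readings.
let radius = 0.5;
for (const level of [0, 0.5, 2]) {
  radius = smooth(radius, levelToRadius(level, 0.5, 4.5), 0.5);
  console.log(radius);
}
```

Inside draw(), the smoothed value could then be passed to `sphere1.setAttribute('radius', radius)` in place of `level * 20`.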
