You are given a training dataset, in which each entry is a feature vector (an array of 2 real numbers) and a label, 0 or 1, indicating the class to which this vector belongs.
Your goal is to use this dataset to train a quantum classification model that will accurately classify a validation dataset: a different dataset generated from the same distribution as the training one. The error rate of your model on the validation dataset (the percentage of incorrectly classified samples) should be less than 5%.
Your code will not be given any inputs. Instead, you should use the provided dataset file to train your model.
The training dataset is represented as a JSON file and consists of two arrays, "Features" and "Labels". Each array has exactly 400 elements. Each element of the "Features" array is an array with 2 elements, each of them a floating-point number. Each element of the "Labels" array is the label of the class to which the corresponding element of the "Features" array belongs, 0 or 1.
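For reference, the file has the following shape (the values here are illustrative only, not taken from the actual dataset):

{
    "Features" : [[0.42, -1.05], [1.33, 0.17], ...],
    "Labels" : [0, 1, ...]
}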
Your code should return the description of the model you'd like to use: a tuple of the classical preprocessing description (see below), the circuit geometry, and the circuit parameters. Your code should have the following signature:
namespace Solution {
    open Microsoft.Quantum.MachineLearning;

    operation Solve () : ((Int, Double[]), ControlledRotation[], (Double[], Double)) {
        // your code here
    }
}
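For illustration, a syntactically complete (but untrained) solution with a single-gate circuit might look like the sketch below. It assumes the QDK's ControlledRotation((target, controls), axis, parameterIndex) constructor from the Microsoft.Quantum.MachineLearning library; the method index, angle, and bias are placeholder values, not a trained model:

namespace Solution {
    open Microsoft.Quantum.MachineLearning;

    operation Solve () : ((Int, Double[]), ControlledRotation[], (Double[], Double)) {
        // Classical preprocessing: a method index and its Double[] parameters
        // (placeholders; see the next section).
        let preprocessing = (0, new Double[0]);
        // Circuit geometry: one uncontrolled Y-rotation on qubit 0 that reads
        // rotation angle number 0 from the angles array below.
        let geometry = [ControlledRotation((0, new Int[0]), PauliY, 0)];
        // Circuit parameters: the rotation angles and the classification bias.
        return (preprocessing, geometry, ([1.0], 0.0));
    }
}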
Classical preprocessing
This step allows you to add new features to the data before encoding it in the quantum state and feeding it into the classifier circuit. To do this, you need to pick one of the available preprocessing methods and return a tuple of its index and its parameters; the parameters of every method have type Double[].
After the preprocessing step the resulting data is encoded in a quantum state using amplitude encoding: element $$$j$$$ of the data is encoded in the amplitude of basis state $$$|j\rangle$$$. If the length of the data array is not a power of 2, it is right-padded with $$$0$$$s to the next power of two; the number of qubits used for encoding is the binary logarithm of the padded length.
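For example, a preprocessed vector $$$(x_0, x_1, x_2)$$$ is padded to $$$(x_0, x_1, x_2, 0)$$$ and encoded on $$$2$$$ qubits as the normalized state $$$\frac{1}{\|x\|}\left(x_0|0\rangle + x_1|1\rangle + x_2|2\rangle\right)$$$, where $$$\|x\| = \sqrt{x_0^2 + x_1^2 + x_2^2}$$$ (the amplitudes are normalized, as required for a valid quantum state).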
Note that the majority of the data analysis is going to happen "offline", before you submit the solution. The solution has to contain only the description of the trained model, not the training code itself; if you attempt to train the model "online" in your submitted code during the evaluation process, it will very likely time out.
Training your model offline is likely to involve loading the training dataset, choosing the circuit geometry and preprocessing method, fitting the rotation angles and the bias on the training data, and hardcoding the resulting model description in the submitted Solve() operation; a rough sketch of the training step is shown below.
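As a rough sketch of that training step (this code is not part of the submission), offline training with the QDK's Microsoft.Quantum.MachineLearning library might look like the following. It assumes a classical host program has already parsed the JSON file into trainingVectors and trainingLabels; the namespace name, the one-gate geometry, and the default training options are placeholders, and the exact library signatures may vary between QDK versions:

namespace Offline {
    open Microsoft.Quantum.Arrays;
    open Microsoft.Quantum.MachineLearning;

    // The same circuit geometry that will be hardcoded in the submitted Solve().
    function ClassifierStructure () : ControlledRotation[] {
        return [ControlledRotation((0, new Int[0]), PauliY, 0)];
    }

    // Fits the rotation angles and bias on the training data; the results are
    // then pasted as constants into the submitted solution.
    operation TrainModel (
        trainingVectors : Double[][],
        trainingLabels : Int[]
    ) : (Double[], Double) {
        let samples = Mapped(LabeledSample, Zipped(trainingVectors, trainingLabels));
        let (optimized, _) = TrainSequentialClassifier(
            [SequentialModel(ClassifierStructure(), [1.0], 0.0)],
            samples,
            DefaultTrainingOptions(),
            DefaultSchedule(trainingVectors)
        );
        return (optimized::Parameters, optimized::Bias);
    }
}

The returned angles and bias, together with the chosen geometry and preprocessing description, are what you then hardcode into Solve().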