%% Cell type:markdown id: tags:
# Automatic Quantum Feature Map Generation
***This demonstrator is part of the project [SEQUOIA End-to-End](https://websites.fraunhofer.de/sequoia/) of the [Competence Center Quantum Computing Baden-Württemberg](https://www.iaf.fraunhofer.de/en/networkers/KQC.html).***
Authors: Frederic Rapp, Fraunhofer IPA, frederic.rapp@ipa.fraunhofer.de, and David Kreplin, Fraunhofer IPA, david.kreplin@ipa.fraunhofer.de
---
<img src="images/sequoia_end_to_end_logo.svg" width="500">
<img src="./images/sequoia_end_to_end_logo.svg" width="500">
## Description
In this notebook, we demonstrate the ability to create problem-tailored quantum feature maps with a reinforcement learning algorithm.
When training Machine Learning (ML) models, it is well known that the model architecture has to be designed specifically for the problem at hand. Random models in general do not generalize well to unseen data.
When using a Quantum Machine Learning (QML) model, the essential building block of the model is the parameterized quantum circuit (PQC), also called the quantum feature map. In the contemporary literature, most feature map architectures are based on ad-hoc approaches. To gain better performance, we use a reinforcement learning algorithm to create a problem-specific circuit tailored to the given data problem.
## Method
We use the MuZero algorithm [1], a powerful reinforcement learning method, for the automatic generation of quantum feature maps.
In general, reinforcement learning algorithms are based on the interaction of an agent with its environment. The agent observes a state of the environment and takes an action based on its policy. For each action, the agent receives feedback in the form of a reward from the environment. The agent's goal is to optimize its policy in order to maximize the long-term reward.
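To make this loop concrete, here is a minimal toy sketch of the agent-environment interaction (the environment and the random policy are purely illustrative stand-ins, not the classes used in this demonstrator):
``` python
import random

class ToyEnv:
    """Toy environment: states 0..9, the goal is to reach state 9."""
    def __init__(self):
        self.state = 0
    def step(self, action):
        self.state = max(0, min(9, self.state + action))
        reward = 1.0 if self.state == 9 else 0.0  # reward only at the goal
        done = self.state == 9
        return self.state, reward, done  # observation, reward, episode end

env = ToyEnv()
done, total_reward = False, 0.0
while not done:
    action = random.choice([-1, 1])  # random policy; an RL agent learns this part
    state, reward, done = env.step(action)
    total_reward += reward
print("collected reward:", total_reward)
```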
The MuZero algorithm belongs to the category of model-based reinforcement learning algorithms; it combines a Monte Carlo Tree Search (MCTS) with a learned model. One great advantage of MuZero is that there are no direct constraints or requirements for the hidden state to capture all the information of the environment. This makes the algorithm especially suitable for scenarios in which the environment's dynamics are not fully known (such as for QML problems).
<img src="images/conceptual_layout.png" width="900">
<img src="./images/conceptual_layout.png" width="900">
The conceptual layout graphic shows the basic workflow of the automatic feature map generation framework.
The current feature map configuration is observed by the agent and transformed into a hidden state representation. This hidden state is processed by an MCTS that estimates hidden rewards, values, and policies with trained classical neural networks, updating the hidden state iteratively. After the MCTS has been expanded to a predefined depth, a final policy is created and a final action (the next gate of the quantum circuit) is sampled from this policy.
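To illustrate the core idea, the following minimal sketch mimics MuZero's three learned functions (representation, dynamics, and prediction) with toy numpy stand-ins and performs a greedy look-ahead through the learned model; the actual algorithm replaces this greedy roll-out with a full MCTS:
``` python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))  # toy weights of the representation network

def representation(observation):  # h: observation -> hidden state
    return np.tanh(observation @ W)

def dynamics(hidden_state, action):  # g: (state, action) -> (next state, reward)
    next_state = np.tanh(hidden_state + 0.1 * np.eye(4)[action])
    return next_state, float(next_state.sum())

def prediction(hidden_state):  # f: state -> (policy, value)
    logits = np.exp(hidden_state)
    return logits / logits.sum(), float(hidden_state.mean())

# Greedy look-ahead of fixed depth through the learned model
state = representation(rng.random(8))
for step in range(3):
    policy, value = prediction(state)
    action = int(np.argmax(policy))
    state, reward = dynamics(state, action)
    print(f"step {step}: action={action}, reward={reward:.3f}, value={value:.3f}")
```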
Each feature map configuration is plugged into a Projected Quantum Kernel (PQK) [2], and the kernel is used in a QML model. The performance of the QML model on the training data serves as the reward for the reinforcement learning algorithm.
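The following sketch shows how such a reward could be computed for a candidate feature map given as a gate string, combining sQUlearn with scikit-learn's cross-validation (the function and its defaults are our own illustration, not the exact implementation of this repository):
``` python
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score
from squlearn.encoding_circuit import LayeredEncodingCircuit
from squlearn.kernel.matrix import ProjectedQuantumKernel
from squlearn.util import Executor

def reward_for_feature_map(fm_str, X, y, num_qubits, num_features):
    """Reward = cross-validated negative MSE of a kernel SVR built from fm_str."""
    fm = LayeredEncodingCircuit.from_string(
        encoding_circuit_str=fm_str, num_qubits=num_qubits, num_features=num_features
    )
    kernel = ProjectedQuantumKernel(fm, executor=Executor(), gamma=0.5)
    gram = kernel.evaluate(X, X)  # precomputed kernel matrix on the training data
    svr = SVR(kernel="precomputed")  # scikit-learn slices the Gram matrix per fold
    return cross_val_score(svr, gram, y, cv=3, scoring="neg_mean_squared_error").mean()
```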
NOTE: Reinforcement learning in general can incur quite high computational costs, and it is best to run the MuZero algorithm on GPUs if available.
## Further information and contact
[Fraunhofer IPA Team Quantum Computing](https://www.ipa.fraunhofer.de/quantum)
## General Imports
%% Cell type:code id: tags:
``` python
import numpy as np
# Get the MuZero and the Automatic Feature Map generator environment
from muzero.muzero import MuZero
from games.the_env_score_california_housing import StorageManager, ScoreEnv, create_game
from games.config_file_chousing import MuZeroConfig
import sys
from muzero import self_play
# Register self_play under its top-level module name so that pickled
# objects (e.g., the replay buffer) can be deserialized
sys.modules["self_play"] = self_play
from helper.demo_helper import get_animation
# Make ffmpeg available for the animations and set up the matplotlib environment
from static_ffmpeg import add_paths
add_paths()
import matplotlib.pyplot as plt
%matplotlib notebook
plt.ioff()
```
%% Output
<matplotlib.pyplot._IoffContext at 0x1a3e30f02b0>
%% Cell type:markdown id: tags:
# Data Description
In this demonstrator, we perform the automatic quantum feature map generation on the well-known California Housing dataset.
The dataset poses a regression problem: the prediction of the respective house value.
The dataset contains 8 features with information such as the age of the house, the average number of rooms, the geographical position, etc.
In total, the dataset consists of 20,640 samples. Due to a restriction in computational resources for QML, we use a subsample of 1,000 data points.
We split the dataset into 700 training points and 300 testing points. Additionally, due to the nature of QML models, we scale all the data to the range [0, 1].
%% Cell type:code id: tags:
``` python
from sklearn.datasets import fetch_california_housing
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
# Load data
housing = fetch_california_housing()
X = housing.data
y = housing.target
# Number of considered samples (reduce this to speed up the demo)
desired_num_samples = 1000
# Randomly sample data points
# set the seed to ensure reproducibility
np.random.seed(42)
random_indices = np.random.choice(len(X), desired_num_samples, replace=False)
X_subset = X[random_indices]
y_subset = y[random_indices]
# Split the subsampled data into training and testing sets (scaling is applied afterwards)
X_train, X_test, Y_train, Y_test = train_test_split(X_subset, y_subset, test_size=0.3, random_state=42)
# Scale features and targets to [0, 1], fitting the scalers on the training data only
scaler_x = MinMaxScaler()
X_train = scaler_x.fit_transform(X_train)
X_test = scaler_x.transform(X_test)
scaler_y = MinMaxScaler()
Y_train = scaler_y.fit_transform(Y_train.reshape(-1, 1)).reshape(-1)
Y_test = scaler_y.transform(Y_test.reshape(-1, 1)).reshape(-1)
```
%% Cell type:markdown id: tags:
# Model specifications
In the following, we define our environment.
We need to specify the number of qubits we want to use (which is in general the number of features or an integer multiple of the number of features).
Additionally, we need to define the maximum depth of the feature map; this parameter also serves as the maximum depth of the MCTS.
As the QML model, we use a Quantum Support Vector Regression (QSVR) algorithm.
The environment, as well as the configuration files, allows for a variety of changes in the hyperparameters of the algorithm.
Here, we use a pre-trained model; feel free to add additional training steps by editing the num_steps parameter (but be aware that a high number of training steps comes with high additional computational cost).
%% Cell type:code id: tags:
``` python
num_steps = 0 # Set to 0 to only test the model,
# larger values will continue the training of the model
result_dict = {"best_fm_str": list(), "cv_score": list()}
storage = StorageManager(result_dict=result_dict, filename="chousing_cvscore_lac_new_scaling.json")
# Set-up the game environment
num_qubits = 8
max_num_gates = 10
auto_qnn_env = ScoreEnv(
    qml_model='qsvr',
    scoring='neg_mean_squared_error',
    num_qubits=num_qubits,
    num_features=8,
    max_depth=max_num_gates,
    training_data=X_train,
    training_labels=Y_train,
    storage=storage,
    random_start=-1.0,
    pure_test=(num_steps == 0),
)
# Set the most important parameters for the MuZero environment
logdir = 'chousing_cvscore_lac_new_scaling/results'
config = MuZeroConfig(logdir=logdir)
config.observation_shape = (1, 1, max_num_gates)
config.action_space = list(range(len(auto_qnn_env.actions)))
config.max_moves = max_num_gates
config.num_simulations = max_num_gates
config.training_steps = 750000 + num_steps
config.checkpoint_interval = 100
config.batch_size = 64
config.lr_init = 0.5
config.lr_decay_rate = 0.9
config.lr_decay_steps = 5000
config.root_dirichlet_alpha = 0.3
config.root_exploration_fraction = 0.3
config.pb_c_init = 100000 # exploration hyperparameter, larger value means more exploration
```
%% Cell type:markdown id: tags:
In the next cell we load the pre-trained model.
%% Cell type:code id: tags:
``` python
# Initialize the MuZero model and load the pre-trained model from disk
game = create_game(env=auto_qnn_env)
muzero = MuZero(game_name=game, config=config)
muzero.load_model(checkpoint_path='chousing_cvscore_lac_new_scaling/results/results/model.checkpoint',
                  replay_buffer_path='chousing_cvscore_lac_new_scaling/results/results/replay_buffer.pkl')
```
%% Output
2024-03-27 15:53:24,139 INFO worker.py:1724 -- Started a local Ray instance.
Using checkpoint from chousing_cvscore_lac_new_scaling\results\results\model.checkpoint
Initializing replay buffer with chousing_cvscore_lac_new_scaling\results\results\replay_buffer.pkl
%% Cell type:markdown id: tags:
After loading, the model can be trained further.
%% Cell type:code id: tags:
``` python
# Train the model if num_steps > 0
muzero.train()
```
%% Output
Training...
Run tensorboard --logdir ./results and go to http://localhost:6006/ to see in real time the training performance.
Last test reward: 143.00. Training step: 750000/750000. Played games: 18. Loss: 81.32
Shutting down workers...
Persisting replay buffer games to disk at C:\Users\DKR\Documents\Git Repositories\automatic-feature-map-generation\chousing_cvscore_lac_new_scaling\results\results\replay_buffer.pkl
(ReplayBuffer pid=29100) Replay buffer initialized with 165 samples (18 games).
(ReplayBuffer pid=29100)
(Trainer pid=31184) You are not training on GPU.
(Trainer pid=31184) Loading optimizer...
%% Cell type:markdown id: tags:
When the trained model converges (this can be seen by analyzing the separate TensorBoard outputs), one can call the test method instead of the training routine; the weights of the MuZero networks are then kept fixed and won't be trained any further. This reduces the computational cost for the upcoming generation of feature maps.
The num_tests parameter controls the number of test runs. In general, the more runs the better (since you will obtain more feature map configurations to compare), but since computational resources for this notebook are restricted, we suggest keeping the number of test runs low.
The generation of the feature maps is visualized in the following animation to give the user some intuition about the generation process.
%% Cell type:code id: tags:
``` python
# Run test and create an animation of the created quantum feature maps
test = muzero.test(render=True, num_tests=3)
```
%% Output
Testing 1/3
Testing 2/3
Testing 3/3
%% Cell type:code id: tags:
``` python
get_animation()
```
%% Output
<IPython.core.display.HTML object>
%% Cell type:markdown id: tags:
Now we can load the results of the MuZero search algorithm (training & testing) to get all generated feature maps that led to a reward.
We then extract the feature map that led to the best solution (in this case, the best cross-validation score on the training data).
%% Cell type:code id: tags:
``` python
# Load results from training and testing and obtain the best found quantum feature map
load_result = game.import_storage("chousing_cvscore_lac_new_scaling.json")
best_fm = ''.join(load_result["best_fm_str"])
```
%% Cell type:markdown id: tags:
# Run QML model with tailored feature map
In the final step, we use the best generated feature map to perform a final QSVR on the test data!
We compare the result to that of a QSVR based on a popular ad-hoc feature map, and additionally to a classical SVR.
For all models, we perform a hyperparameter optimization using a grid search approach.
We use our in-house software package sQUlearn [3] for the QML methods. It allows for easy quantum computing backend inclusion with its Executor class, and the LayeredEncodingCircuit class comes in quite handy when building custom circuit architectures / quantum feature maps; see the sketch below.
The framework also allows for an easy high-level implementation of QML models, ranging from quantum kernel methods to quantum neural networks.
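As a small illustration of the LayeredEncodingCircuit gate-string convention (the string follows an example from the sQUlearn documentation; treat the details of the grammar as an assumption and consult the documentation):
``` python
from squlearn.encoding_circuit import LayeredEncodingCircuit

# Build a small layered circuit from a gate string: "Ry(p)" is a trainable
# rotation, the "3[...]" block is repeated three times, and "crz(p)" is a
# controlled rotation acting as an entangling layer
toy_map = LayeredEncodingCircuit.from_string(
    "Ry(p)-3[Rx(p,x;=y*np.arccos(x),{y,x})-crz(p)]-Ry(p)",
    num_qubits=4,
    num_features=1,
)
print(toy_map.num_qubits, toy_map.num_features)
```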
NOTE: The inference of the QML method takes up to 10 minutes for each model, due to the fairly high number of data points for a quantum kernel method.
%% Cell type:code id: tags:
``` python
from squlearn.encoding_circuit import LayeredEncodingCircuit, HubregtsenEncodingCircuit
from squlearn.kernel.matrix import ProjectedQuantumKernel
from squlearn.util import Executor
from sklearn.svm import SVR
# Switch matplotlib back to inline mode
%matplotlib inline
# Create the generated feature map
generated_feature_map = LayeredEncodingCircuit.from_string(
    encoding_circuit_str=best_fm, num_qubits=num_qubits, num_features=8
)
# Initialize the ad-hoc feature map
ad_hoc_feature_map = HubregtsenEncodingCircuit(num_qubits=num_qubits, num_features=8)
# Create the quantum kernels
quantum_kernel = ProjectedQuantumKernel(generated_feature_map, executor=Executor(), gamma=0.5)
quantum_kernel_ad_hoc = ProjectedQuantumKernel(ad_hoc_feature_map, executor=Executor(), gamma=0.5)
# Evaluate the training and test kernel matrices
trainings_matrix = quantum_kernel.evaluate(X_train, X_train)
test_matrix = quantum_kernel.evaluate(X_test, X_train)
trainings_matrix_ad_hoc = quantum_kernel_ad_hoc.evaluate(X_train, X_train)
test_matrix_ad_hoc = quantum_kernel_ad_hoc.evaluate(X_test, X_train)
```
%% Cell type:markdown id: tags:
In the next step, we perform the grid search for all three models (QSVR with the generated feature map, QSVR with the ad-hoc feature map, and classical SVR) and plot the results on the test set to compare the performance of the models.
%% Cell type:code id: tags:
``` python
from sklearn.model_selection import GridSearchCV
param_grid = {
    'C': [0.01, 0.1, 1, 10, 50, 100, 200, 300, 400, 500],
    'epsilon': [1e-6, 1e-4, 0.001, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 10]
}
# SVR with precomputed kernel of the generated feature map
svr_generated = SVR(kernel='precomputed')
grid_search = GridSearchCV(svr_generated, param_grid, cv=5, n_jobs=-1, scoring='neg_mean_squared_error')
grid_search.fit(trainings_matrix, Y_train)
print('best parameters for SVR with the generated feature map kernel:', grid_search.best_params_)
svr_generated = grid_search.best_estimator_
# SVR with precomputed kernel of the ad-hoc feature map
svr_adhoc = SVR(kernel='precomputed')
grid_search_ad_hoc = GridSearchCV(svr_adhoc, param_grid, cv=5, n_jobs=-1, scoring='neg_mean_squared_error')
grid_search_ad_hoc.fit(trainings_matrix_ad_hoc, Y_train)
print('best parameters for SVR with ad-hoc kernel:', grid_search_ad_hoc.best_params_)
svr_adhoc = grid_search_ad_hoc.best_estimator_
# SVR with the classical RBF-kernel
svr_classical = SVR(kernel='rbf')
grid_search_classical = GridSearchCV(svr_classical, param_grid, cv=5, n_jobs=-1, scoring='neg_mean_squared_error')
grid_search_classical.fit(X_train, Y_train)
print('best parameters for classical SVR:', grid_search_classical.best_params_)
svr_classical = grid_search_classical.best_estimator_
# Plot the results besides each other
fig, axs = plt.subplots(1, 3, figsize=(15, 3))
# Create a list of the models, so we can iterate through them below
models = [
    svr_classical,
    svr_adhoc,
    svr_generated,
]
# Iterate through the models and draw a scatter plot of predicted vs. true values
for ax, model in zip(axs, models):
    if model is svr_classical:
        ax.scatter(Y_test, model.predict(X_test), label=f"R2={model.score(X_test, Y_test):.2f}")
    if model is svr_adhoc:
        ax.scatter(Y_test, model.predict(test_matrix_ad_hoc), label=f"R2={model.score(test_matrix_ad_hoc, Y_test):.2f}")
    if model is svr_generated:
        ax.scatter(Y_test, model.predict(test_matrix), label=f"R2={model.score(test_matrix, Y_test):.2f}")
    ax.plot([np.min(Y_test), np.max(Y_test)], [np.min(Y_test), np.max(Y_test)], color='red')  # ideal prediction line
    ax.legend(loc="upper left")
    ax.set_xlabel("True")
    if ax is axs[0]:
        ax.set_ylabel("Predicted")
axs[0].set_title("Classical SVR")
axs[1].set_title("QSVR based on ad-hoc feature map")
axs[2].set_title("QSVR based on generated feature map")
```
%% Output
best parameters for SVR with the generated feature map kernel: {'C': 0.01, 'epsilon': 1e-06}
best parameters for SVR with ad-hoc kernel: {'C': 10, 'epsilon': 0.0001}
best parameters for classical SVR: {'C': 0.1, 'epsilon': 1e-06}
Text(0.5, 1.0, 'QSVR based on generated feature map')
%% Cell type:markdown id: tags:
## Discussion
In this notebook, we provided a demonstration of the automatic quantum feature map generation framework. We used the state-of-the-art MuZero search algorithm, which proves to be very successful in building problem-tailored feature map designs.
The result above shows that the QML model based on a generated feature map leads to superior performance compared to a QML model based on an ad-hoc approach from the contemporary literature [4].
%% Cell type:markdown id: tags:
## References
[1] Schrittwieser, J., Antonoglou, I., Hubert, T. et al. Mastering Atari, Go, chess and shogi by planning with a learned model. Nature 588, 604–609 (2020). https://doi.org/10.1038/s41586-020-03051-4
[2] Huang, HY., Broughton, M., Mohseni, M. et al. Power of data in quantum machine learning. Nat Commun 12, 2631 (2021). https://doi.org/10.1038/s41467-021-22539-9
[3] Kreplin, D.A., Willmann, M., Schnabel, J., Rapp, F., Roth, M. sQUlearn – A Python Library for Quantum Machine Learning (2023). https://doi.org/10.48550/arXiv.2311.08990
[4] Hubregtsen, T., Wierichs, D., Gil-Fuster, E., Derks, P.-J. H. S., Faehrmann, P. K., Meyer, J. J. Training quantum embedding kernels on near-term quantum computers. Phys. Rev. A 106, 042431 (2022). https://doi.org/10.1103/PhysRevA.106.042431