First, we have results from the recent offline validation. Second, we processed rosbags with our online implementation ros_adetector. Although we used the same model and the same bags, the two methods produced different results (which they shouldn't).
I discussed some points with @hanikevi that could be verified:
check the runtime of ros_adetector and rosbag play (play the bag slowly),
check whether overlapping windows were used in training (YES as of now; helpful because slow rosbags then see the same data input),
check the loss_fn in ros_adetector (probably not the cause),
verify that the correct model is loaded,
verify the network input (dimensions might be adjusted automatically) and publish it,
check the queue size (in main it's fine: queue_size = 1) --> especially update ros_datahandler
play the rosbag at rate 0.05 and plot a longer window
compare net input to rosbag data
8 of 8 checklist items completed
Kevin Haninger marked the checklist item "check whether overlapping windows were used in training (YES as of now; helpful because slow rosbags then see the same data input)," as completed
Checking for queue_size in ros_datahandler: in commit de607ab8 (latest on main here), ros_datahandler/ros_datahandler/datahandler.py::L70 defines the subscriber factory with queue_size=1.
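As a side note, the effect of queue_size=1 can be sketched without ROS: a bounded queue of length one keeps only the newest message, so any samples arriving while the callback is busy are silently dropped. A toy stand-in (plain Python, no rospy):

```python
from collections import deque

# Toy model of a subscriber queue with queue_size=1: only the most recent
# message survives if the callback cannot keep up with the publisher.
queue = deque(maxlen=1)

# The publisher delivers five messages before the callback runs once.
for msg in ["f_0", "f_1", "f_2", "f_3", "f_4"]:
    queue.append(msg)

# The callback then sees only the latest sample; f_0..f_3 were dropped.
latest = queue.popleft()
print(latest)  # -> f_4
```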
Kevin Haninger marked the checklist item "check the queue size (in main it's fine: queue_size = 1) --> especially update ros_datahandler" as completed
In ros_adetector we calculate the loss with jnp.mean(jnp.abs(x - y)), while in the other plot I called ValModel.visualize_loss, which calculates the loss with the leaf_loss fn:
So even if the form is a bit different, the loss values should not differ by that much. In the ros_adetector version we get loss values around 3-4 times higher than those in validate_model. The plots of the forces and joints look identical, so the reconstruction is probably what differs.
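To illustrate how two superficially similar MAE formulations can differ by a constant factor (the exact leaf_loss definition isn't shown here, so this is only a hedged sketch, with NumPy standing in for JAX):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 6))             # e.g. a window of 6-dim force/joint data
y = x + 0.1 * rng.normal(size=x.shape)   # imperfect reconstruction

# Variant 1: mean absolute error over every element,
# similar in spirit to jnp.mean(jnp.abs(x - y)) in ros_adetector.
mae_elementwise = np.mean(np.abs(x - y))

# Variant 2: sum the error over the feature axis first, then average,
# as a per-leaf loss might do.
mae_summed = np.mean(np.sum(np.abs(x - y), axis=-1))

# The two values differ by the feature dimension (6 here, up to float rounding),
# even though both are "the MAE" of the same reconstruction.
print(mae_summed / mae_elementwise)
```

A mismatch like this would rescale the loss uniformly, though, so by itself it would not explain differences in the reconstruction shape.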
I played the rosbag at rate 0.3 and the loss changed. The values are not as high anymore, but still greater than the ones I got in the offline validation. I added the plots from PlotJuggler to the Word doc.
Ok good, the loss fn we're comparing is about the same. If play speed changed things, then the network eval is probably slower than the sample rate of the topic used for the callback. Does 0.3 saturate the change, or does it change more when you go to 0.05 or so? I'd also suspect the data formatting. You could probably print the first network input for offline/online and make sure they're the same.
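A minimal sketch of that last suggestion, assuming each pipeline can dump its first network input as an array (the helper and the file names in the usage note are hypothetical):

```python
import numpy as np

def compare_inputs(offline: np.ndarray, online: np.ndarray) -> str:
    """Compare the first network input of the offline and online pipelines."""
    # Transposed or re-ordered dimensions often pass a naive eyeball check,
    # so compare shapes and then values element-wise with a small tolerance.
    if offline.shape != online.shape:
        return "shape mismatch - check window/feature ordering"
    if not np.allclose(offline, online, atol=1e-6):
        return f"values differ, max abs diff: {np.abs(offline - online).max():.3g}"
    return "match"
```

Hypothetical usage: each pipeline saves its first input once, e.g. with `np.save("offline_input.npy", x)`, and we then run `compare_inputs(np.load("offline_input.npy"), np.load("online_input.npy"))`.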
force is the slower topic (220 Hz vs 500 Hz for joint_states), so we should update ros_adetector. That should help, but I'm not sure it's the only bug.
This consistency should be enforced structurally. I see two options:
1. Add a flag to the data_streams init argument of ros_datahandler which lets us mark which stream is the sync topic. We would drop sync_stream from the init args, but this requires further specialization of the data_streams.
2. Add a config in encoder_dynamics which specifies which dataclass to use and also the sync topic.
Both are sensible, and when we use this on other systems we will want to pass, e.g., FrankaData as an argument rather than hard-coding it, so the 2nd option will be needed in any case. Thoughts?
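A rough sketch of what option 2 could look like; all names here are hypothetical, not the actual encoder_dynamics API:

```python
from dataclasses import dataclass, field

# Hypothetical sketch: a config that names both the data class to
# instantiate and which stream drives synchronization.
@dataclass
class StreamConfig:
    topic: str
    rate_hz: float

@dataclass
class DetectorConfig:
    data_class: str = "FrankaData"   # which dataclass to instantiate
    sync_topic: str = "/force"       # the slowest topic drives the sync
    streams: dict[str, StreamConfig] = field(default_factory=dict)

cfg = DetectorConfig(streams={
    "/force": StreamConfig("/force", 220.0),
    "/joint_states": StreamConfig("/joint_states", 500.0),
})
print(cfg.sync_topic)  # -> /force
```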
Sounds good. The dill file would then serialize all of the configuration and allow us to load from a file, right? I would suggest making a separate issue and setting up a data type for what we serialize, so that it's documented and can be extended more easily in the future, but you and Niki can discuss.
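A sketch of what such a documented serialization type might look like; the thread mentions dill, but stdlib pickle is used here as a stand-in, and all field names are illustrative:

```python
import pickle
from dataclasses import dataclass

# Hypothetical record of everything the online detector needs to restore
# exactly the configuration that was used in training.
@dataclass
class SerializedRun:
    model_path: str
    sync_topic: str
    window_size: int

run = SerializedRun("model.eqx", "/force", 32)

# Round-trip through a file, as loading from a dill/pickle file would do.
with open("run_config.pkl", "wb") as f:
    pickle.dump(run, f)
with open("run_config.pkl", "rb") as f:
    restored = pickle.load(f)

print(restored == run)  # -> True
```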
So I changed sync_stream and the args, but there is no difference. I checked the net input and it is the same as the data. But I noticed that we have an offset between the published data and the reconstructed data. Maybe our autoencoder is reconstructing the data with an offset?
That means that with sync_topic updated to force, we still have differences in form and scaling between offline and online evaluation?
Regarding the delay, I had also seen this effect online: when pushing on the robot, the reconstruction rises to match it shortly after. I think the network learns to use the last few forces to predict the next forces, which would act like a slight delay on the signal. I would keep this in mind, but I think it's a separate issue.
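A toy illustration of this delay effect (not our actual model): a "predictor" that reconstructs the force from the last k samples only catches up k steps after a step input, which online looks like a lag:

```python
import numpy as np

k = 5
force = np.concatenate([np.zeros(20), np.ones(20)])  # push on the robot at t = 20

# "Reconstruction" at time t uses only the last k force samples before t,
# like a network predicting the next force from a short history window.
recon = np.array([force[max(0, t - k):t].mean() if t > 0 else 0.0
                  for t in range(len(force))])

# The reconstruction only reaches the new force level k samples after the step:
print(int(np.argmax(recon >= 0.99)))  # -> 25, i.e. 5 samples after t = 20
```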
Yes, we still have slight differences, more in scaling than in form, I'd say.
Concerning your 2nd point: could this behaviour come from a too-simple net architecture? If I understand correctly, my first step would be to modify our model and increase the latent dim to get more parameters and thus more capacity.
That there are differences in scaling is still weird and might hint at some difference between the offline and online data processing. Are you investigating further, or just moving forward with scaling the threshold for online eval?
For the 2nd aspect, I'm not sure this is a problem. It can be a property of the network architecture and training loss. With a low-dimensional signal (i.e. not pictures) and a big latent space, the network could learn to just pass the signal through the latent space and therefore get 'perfect' reconstruction. The original understanding of VAEs is that they have a bottleneck which forces compression of the high-dim input. But I would suggest making any architecture decisions based on the validation results.
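The pass-through point can be demonstrated with a toy linear autoencoder: once the latent dimension is at least the input dimension, a full-rank encoder has an exact left inverse, so 'perfect' reconstruction requires no compression at all (NumPy sketch, not our actual model):

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_latent = 6, 8            # latent space larger than the input

# A linear "autoencoder": any full-column-rank encoder has a left inverse,
# so the decoder can undo it exactly - no bottleneck, no compression.
W_enc = rng.normal(size=(d_latent, d_in))
W_dec = np.linalg.pinv(W_enc)    # decoder = pseudo-inverse of the encoder

x = rng.normal(size=(100, d_in))
x_hat = x @ W_enc.T @ W_dec.T    # encode, then decode

print(np.abs(x - x_hat).max() < 1e-9)  # -> True: 'perfect' reconstruction
```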
Lisa-Marie Fenner marked the checklist item "verify the network input (dimensions might be adjusted automatically) and publish it," as completed
Lisa-Marie Fenner marked the checklist item "check the runtime of ros_adetector and rosbag play (play the bag slowly)," as completed
Lisa and I discussed this in the lab a bit today. One idea is to re-publish the network inputs in the callback function, in the same way the network outputs are published. This way we can see whether there are any formatting errors or delays in the ROS system.
So, as Kevin suggested, I published the network inputs again (F_comp), and at first sight there was no major difference. But if we zoom in, we get this picture:
I reduced the play rate to 0.5 and we get
Now with play rate 0.1
We already saw that with a slow play rate the reconstruction is almost identical to the offline one and the loss is lower. So it seems we cannot process all data points in data.get_data() if we play the rosbag at the normal rate? It seems the model responds to this by generating an offset.
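A back-of-the-envelope timing model of this hypothesis: with queue_size = 1, the fraction of samples the callback can process is bounded by the message period divided by the evaluation time (the 7 ms eval time below is purely illustrative):

```python
def processed_fraction(msg_period_ms: float, eval_time_ms: float) -> float:
    """Fraction of incoming messages a busy callback actually handles."""
    # If eval is faster than the message period, nothing is dropped;
    # otherwise only one message per eval survives the size-1 queue.
    return min(1.0, msg_period_ms / eval_time_ms)

# joint_states at 500 Hz (2 ms period) with a hypothetical 7 ms network eval:
print(processed_fraction(2.0, 7.0))   # only ~29% of samples processed

# playing the bag at rate 0.1 stretches the period to 20 ms:
print(processed_fraction(20.0, 7.0))  # -> 1.0, every sample processed
```

This would match the observation that rate 0.1 nearly reproduces the offline result.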
Hey everyone,
since all checklist items have been completed, this issue will be closed. The latest results indicate successful reconstructions, with no significant loss observed in anomalous cases. Further work on this will take place in #59 (closed).