
How To Load A Layer From Checkpoint

I have this config: network = {'source_embed_raw': {'class': 'linear', ...}}. I want to load the params for the layer source_embed_raw from some existing checkpoint. In that checkpoint, the parameters are stored under a different name (something like output/rec/target_embed_raw/...).

Solution 1:

This is currently not possible with preload_from_files in this way. I currently see these options:

  1. We could extend the logic of preload_from_files (and CustomCheckpointLoader) to allow for something like that (some generic variable/layer name mapping).

  2. Or you could rename your layer from source_embed_raw to e.g. old_model__target_embed_raw and then use preload_from_files with the prefix option. If you do not want to rename it, you could still add a layer like old_model__target_embed_raw and then use parameter sharing in source_embed_raw.

    If the parameter in the checkpoint is actually called something like output/rec/target_embed_raw/..., you could create a SubnetworkLayer named old_model__output, in that another SubnetworkLayer with the name rec, and in that a layer named target_embed_raw.
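    A rough sketch of option 2 as a RETURNN config fragment. The checkpoint path, the n_out value, and the exact nesting are placeholder assumptions; a real subnetwork would also need proper "from"/"output" wiring, which is omitted here.

    ```python
    # Sketch (not a complete config): nested subnetworks mirror the old
    # variable scopes, so after stripping the "old_model__" prefix the
    # variable names match what is stored in the checkpoint.
    network = {
        "old_model__output": {"class": "subnetwork", "subnetwork": {
            "rec": {"class": "subnetwork", "subnetwork": {
                "target_embed_raw": {"class": "linear", "activation": None,
                                     "n_out": 512},  # n_out is an example value
            }},
        }},
        # ... the rest of the network; the actual source_embed_raw layer
        # could then share these params via parameter sharing.
    }
    preload_from_files = {
        "old_model": {
            "filename": "/path/to/old-model/checkpoint",  # placeholder path
            "prefix": "old_model__",  # layers with this prefix load from this file
        },
    }
    ```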

  3. You could write a script to simply load the existing checkpoint, and store it as a new checkpoint but with renamed variable names (this is also totally independent of RETURNN).
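    A sketch of such a standalone rename script, assuming a TF-style checkpoint. The paths and the name mapping in the usage comment are hypothetical examples; the pure name-mapping helper is separated out so it can be tested without TensorFlow.

    ```python
    def rename(name, mapping):
        """Map an old variable name to its new name (pure helper)."""
        for old_prefix, new_prefix in mapping.items():
            if name.startswith(old_prefix):
                return new_prefix + name[len(old_prefix):]
        return name

    def rename_checkpoint(ckpt_in, ckpt_out, mapping):
        # Requires TensorFlow; imported lazily so the helper above
        # stays usable without it.
        import tensorflow as tf
        reader = tf.train.load_checkpoint(ckpt_in)
        with tf.Graph().as_default(), tf.compat.v1.Session() as sess:
            for old_name in reader.get_variable_to_shape_map():
                value = reader.get_tensor(old_name)
                # Recreate each variable under its new name.
                tf.Variable(value, name=rename(old_name, mapping))
            sess.run(tf.compat.v1.global_variables_initializer())
            tf.compat.v1.train.Saver().save(sess, ckpt_out)

    # Hypothetical usage:
    # rename_checkpoint("/path/to/old-model/checkpoint",
    #                   "/path/to/new-model/checkpoint",
    #                   {"output/rec/target_embed_raw": "source_embed_raw"})
    ```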

  4. LinearLayer (and most other layers) allows you to specify exactly how the parameters are initialized (forward_weights_init and bias_init). The parameter initialization is quite flexible. E.g. there is something like load_txt_file_initializer which can be used. Currently there is no such function to directly load it from an existing checkpoint, but we could add that. Or you could simply implement the logic inside your config (it should only be around 5 lines of code).
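    A sketch of that in-config logic (roughly the "5 lines" mentioned). The helper name, paths, and variable name are hypothetical, and it assumes forward_weights_init accepts a concrete numpy array; if your RETURNN version does not, wrap the value in an initializer instead.

    ```python
    def init_from_ckpt(ckpt_path, var_name):
        """Read one tensor out of an existing TF checkpoint as a numpy array."""
        import tensorflow as tf  # imported lazily; requires TensorFlow
        reader = tf.train.load_checkpoint(ckpt_path)
        return reader.get_tensor(var_name)

    # In the network definition (paths and names are placeholders):
    # network = {
    #     "source_embed_raw": {
    #         "class": "linear", "activation": None, "n_out": 512,
    #         "forward_weights_init": init_from_ckpt(
    #             "/path/to/old-model/checkpoint",
    #             "output/rec/target_embed_raw/W"),
    #     },
    # }
    ```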

  5. Instead of using preload_from_files, you could also use SubnetworkLayer with the load_on_init option, and then a similar layer-naming logic as in option 2.
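    A rough sketch of option 5 as a config fragment, assuming load_on_init takes a checkpoint filename; the path, n_out, and nesting are placeholder assumptions, and the subnetwork wiring is again omitted.

    ```python
    # Sketch (not a complete config): the subnetwork mirrors the old
    # structure, and load_on_init loads its params from the checkpoint
    # when the layer is created.
    network = {
        "old_model__output": {
            "class": "subnetwork",
            "load_on_init": "/path/to/old-model/checkpoint",  # placeholder path
            "subnetwork": {
                "rec": {"class": "subnetwork", "subnetwork": {
                    "target_embed_raw": {"class": "linear", "activation": None,
                                         "n_out": 512},  # example value
                }},
            },
        },
    }
    ```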
