Skip to content

Conversation

@foreverallama
Copy link
Contributor

Part of a series of commits to support write operations for MatlabOpaque objects. For now I just rearranged the methods to make it easy for me to update the code.

I'll first add to MAT_subsys.jl before updating MAT_HDF and MAT_v5.

@foreverallama
Copy link
Contributor Author

CC @matthijscox

Added some methods to write subsystem data... WIP. Basically just translating Python code into Julia here, so would need a review.

I had to make some small adjustments to incorporate the write logic. I updated some metadata fields from Vector{UInt32} to Vector{Vector{UInt32}}. The main reason is that when serializing nested objects, full metadata for the outermost object is saved before saving metadata for any inner nested objects. The length of the metadata fields are variable, so instead of using a position-based update logic (where you might need to recalculate positions every time), I went with a mutable vector insertion instead. This results in the Vector{Vector{UInt32}} type. Since during load, we're only expecting one vector, I simply updated the loading logic to use metadata_field[1][idx] instead of metadata_field[idx].

I don't think performance would take a hit when splatting these finally as we don't expect too many objects in a MAT-file (after splat the total FileWrapper metadata length could be around thousands at best for really complex class definitions). There might be a better way to code the whole logic but until then this should work.

I also added a new object cache for saving that would key by id(MatlabOpaque), and renamed the existing one to load_object_cache.

A side note, I was able to figure out what the _c2 metadata field contains. It's something called a class alias. Accordingly I renamed the field mcos_class_alias_metadata. Essentially, this allows you to create an object instance using an alias. The _c2 metadata field contains the latest alias name, which is also saved in subsystem.mcos_names. However, the object itself is saved by the oldest alias which would be stored in the MatlabOpaque.class attribute. I don't think its super important to support this feature here, but we could consider adding a new attribute alias to MatlabOpaque if required.

@matthijscox
Copy link
Member

Yeah this whole class system is beyond me :)
I don't see any obvious issues with the code on first glance.

I do notice there are zero tests for the new code. I guess you cannot create a minimal write test yet?
I also don't know if we want to test any of the subsystem internals.

@foreverallama
Copy link
Contributor Author

Code's not complete yet,. Still working on it, trying to squeeze in a little when I'm free. No point in testing internals as the write tests would take care of it anyways

@foreverallama
Copy link
Contributor Author

Most of the code's ready. It's not working yet I'll debug soon and in the meantime add some sample tests

push!(fwrap_data, Any[])
append!(fwrap_data, subsys.prop_vals_saved)

empty_struct = MatlabStructArray([]) #! FIXME
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@matthijscox can you help me create a 1x0 empty struct here? I'm struggling with the syntax a bit.

The fields _c3 and prop_vals_defaults should contain a Nx1 cell array, and each cell contains a 1x0 empty struct with no fields. Here N = no. of unique classes in the file plus one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On my phone right now, but can you try something like MatlabStructArray(String[], (1,0))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading the code, this might not yet be possible. I always estimate the dimensions from the first column, but if there are no columns... Might be we have to explicitly add the dimensions to the MatlabStructArray type, so that it also works without any columns.

Does the empty struct array have any special features inside the HDF5 format?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far I've been able to perform RW operations between MAT.jl and matio with the current code, but unfortunately MATLAB doesn't seem to be able to load files saved by MAT.jl.

It's going to take a while to pinpoint the issue as there are a few differences in the final HDF5 file structure between MAT.jl and MATLAB. At this point I cannot say if its because of the issue with empty structs. Let the MatlabStructArray type be as is for now, until I can identify the problem

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some experimenting, I believe the issue is because of the struct field names. It looks like if the field name attribute exists, then MATLAB tries to access elements of that field during load, regardless of the struct being empty itself (this is only for object deserialization in subsystem where some kind of assignment operations are happening in the background, not regular struct loading)

I don't think there's any use case for empty structs from a user perspective, so there's no need to update the definition of MatlabStructArray itself. Instead, we can use some other kind of marker that we can detect and specifically handle the empty struct case with no fields. Can you push a fix for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we might need to support something similar for MatlabOpaque type with empty dimensions like 1x0 0x0 or 0x1. Is this possible currently? I've seen these kinds of objects in .fig files IIRC. It's not required for normal use, but MATLAB uses these placeholders in special types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want me to define a special EmptyStructArray type? Or also to create the write function for it? I don't know the HDF5 specs for such a type... is there an easy way to load an example fully empty (no columns, 1x0 matrices) struct array type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

easy way to load an example fully empty (no columns, 1x0 matrices) struct array type

Sorry, yeah that's a good point. I'll check this out first


# Write a struct from arrays of keys and values
function m_write(mfile::MatlabHDF5File, parent::HDF5Parent, name::String, k::Vector{String}, v::Vector)
if length(k) == 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the ability to write empty dicts as 0x0 structs over here. The subsystem data works with 0x0 structs as well in MATLAB.

This eliminates the need for any new struct markers for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright. Are 0x0 structs also read as empty dicts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, all of 1x0, 0x1 and 0x0 are read as empty dicts already. But I hardcoded 0x0 for empty dict during write.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this might go against the point raised in #214 where an empty dict is being written as a 1x1 struct with no fields. Instead I guess we could either:

  • Write empty dicts as 1x1 structs with no fields. For 1x0, 0x0 dims we could maybe use a special key __dims__ that contains a tuple with the dimensions. I think its very unlikely to require reading 1x0, 0x0 structs (without fields), we just need write functionality
  • Create a new marker for empty structs with no fields
  • Update MatlabStructArray to handle empty structs

What do you suggest?

@foreverallama
Copy link
Contributor Author

It appears to be working now. I tried a minimal test below and it can be read in all of MAT.jl, matio and MATLAB.

sample_obj = MatlabOpaque(Dict(
    "a" => reshape([10.0], 1, 1),
    "b" => reshape([1.0, 2.0, 3.0], 3, 1)
), "TestClass")

sample_dict = Dict(
    "var" => sample_obj,
)

MAT.matwrite("test.mat", sample_dict)

I'll now focus on some simple tests covering user-defined objects and handle class objects. I still need to add support for object arrays. If I've missed anything else let me know

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants