Non-contiguous reads/writes for dense arrays

#1

I’m currently using HDF5 to store large amounts of data as dense 2D arrays, and am interested in TileDB.

One HDF5 feature that is important for my use case is the ability to read non-contiguous slices into a single rectangular array (and similarly to write them). E.g., select rows 1-1000 and columns 1-5 and 106-110, and get back a 1000x10 array. For my use case the rows (the time dimension) are always contiguous, but the columns are not. In HDF5 this is a matter of adding multiple hyperslabs to a single selection.
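To make the desired semantics concrete, here is a minimal in-memory sketch in plain Python. It does not use the HDF5 or TileDB APIs; `read_blocks` is a hypothetical helper and the toy array sizes are assumptions chosen only for illustration.

```python
# Hypothetical helper, NOT part of HDF5 or TileDB: gathers a contiguous row
# range and the union of several column ranges from an in-memory 2D list.
def read_blocks(data, row_range, col_ranges):
    """Return the rows in row_range restricted to the columns in col_ranges.

    All ranges are 1-based and inclusive, matching the example in the post.
    """
    r0, r1 = row_range
    # Flatten the column ranges into one ordered list of column indices.
    cols = [c for lo, hi in col_ranges for c in range(lo, hi + 1)]
    return [[data[r - 1][c - 1] for c in cols] for r in range(r0, r1 + 1)]

# Toy dense array (assumed size): 1000 rows x 200 columns.
# Each cell stores row * 1000 + col so its origin is easy to check.
n_rows, n_cols = 1000, 200
data = [[r * 1000 + c for c in range(1, n_cols + 1)] for r in range(1, n_rows + 1)]

# Rows 1-1000, columns 1-5 and 106-110, returned as one rectangular block.
block = read_blocks(data, (1, 1000), [(1, 5), (106, 110)])
shape = (len(block), len(block[0]))
```

The point is that the caller describes the selection with one row range and two column ranges, and gets back a single 1000x10 block rather than two separate results to stitch together.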

I see Query.set_coordinates for unordered reads/writes, so I could list the individual coordinates of every point, but I assume this would be inefficient at scale.
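As a rough illustration of the concern, the sketch below only counts how many explicit coordinate pairs the example selection would need compared with the number of ranges that describe it. It does not call TileDB, and it says nothing about actual TileDB performance, which remains my assumption.

```python
from itertools import product

# The example selection from above: rows 1-1000, columns 1-5 and 106-110
# (1-based, inclusive).
rows = range(1, 1001)
cols = list(range(1, 6)) + list(range(106, 111))

# Listing every point individually means one (row, col) pair per cell.
coords = list(product(rows, cols))
n_coords = len(coords)

# The same selection described as ranges: one row range per column block.
n_ranges = 2
```

Here `n_coords` is 10,000 coordinate pairs for what is otherwise just two rectangular blocks, and the count grows with every row added to the time range.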

Is there a way to efficiently achieve this kind of operation in TileDB?

Thanks,
Justin


#2

Thanks for reaching out. We are currently working on that issue. We have a pending PR for the sparse case that will be merged soon. We will work on the dense case immediately after. This issue is being tracked here.
