
h5create

Create HDF5 dataset

Description


h5create(filename,ds,sz) creates a dataset ds whose name includes its full location in the HDF5 file filename, and with a size specified by sz.


h5create(filename,ds,sz,Name=Value) specifies one or more optional name-value arguments.

For example, ChunkSize=[5 5] specifies 5-by-5 chunks of the dataset that can be stored individually in the HDF5 file.
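A minimal sketch of this syntax (the file and dataset names here are illustrative):

h5create("example.h5","/chunkedData",[100 100],ChunkSize=[5 5])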

Examples


Create a fixed-size 100-by-200-by-300 dataset 'myDataset' whose full path is specified as '/g1/g2/myDataset'.

h5create('myfile.h5','/g1/g2/myDataset',[100 200 300])

Write data to 'myDataset'. Since the dimensions of 'myDataset' are fixed, the amount of data to be written to it must match its size.

mydata = ones(100,200,300);
h5write('myfile.h5','/g1/g2/myDataset',mydata)
h5disp('myfile.h5')
HDF5 myfile.h5 
Group '/' 
    Group '/g1' 
        Group '/g1/g2' 
            Dataset 'myDataset' 
                Size:  100x200x300
                MaxSize:  100x200x300
                Datatype:   H5T_IEEE_F64LE (double)
                ChunkSize:  []
                Filters:  none
                FillValue:  0.000000

Create a single-precision 1000-by-2000 dataset and apply the highest level of compression. Chunk storage must be used when applying HDF5 compression.

h5create('myfile.h5','/myDataset2',[1000 2000],'Datatype','single', ...
          'ChunkSize',[50 80],'Deflate',9)

Display the contents of the entire HDF5 file.

h5disp('myfile.h5')
HDF5 myfile.h5 
Group '/' 
    Dataset 'myDataset2' 
        Size:  1000x2000
        MaxSize:  1000x2000
        Datatype:   H5T_IEEE_F32LE (single)
        ChunkSize:  50x80
        Filters:  deflate(9)
        FillValue:  0.000000

Create a two-dimensional dataset '/myDataset3' that is unlimited along the second dimension. ChunkSize must be specified to set any dimension of the dataset to Inf.

h5create('myfile.h5','/myDataset3',[200 Inf],'ChunkSize',[20 20])

Write data to '/myDataset3'. You can write data of any size along the second dimension to '/myDataset3', since its second dimension is unlimited.

mydata = rand(200,500);
h5write('myfile.h5','/myDataset3',mydata,[1 1],[200 500])

Display the entire contents of the HDF5 file.

h5disp('myfile.h5')
HDF5 myfile.h5 
Group '/' 
    Dataset 'myDataset3' 
        Size:  200x500
        MaxSize:  200xInf
        Datatype:   H5T_IEEE_F64LE (double)
        ChunkSize:  20x20
        Filters:  none
        FillValue:  0.000000

Input Arguments


filename

File name, specified as a character vector or string scalar containing the name of an HDF5 file.

Depending on the location you are writing to, filename can take on one of these forms.

Current folder

To write to the current folder, specify the name of the file in filename.

Example: 'myFile.h5'

Other folders

To write to a folder different from the current folder, specify the full or relative path name in filename.

Example: 'C:\myFolder\myFile.h5'

Example: 'myFolder\myFile.h5'

Remote Location

To write to a remote location, filename must contain the full path of the file specified as a uniform resource locator (URL) of the form:

scheme_name://path_to_file/my_file.ext

Based on your remote location, scheme_name can be one of the values in this table.

Remote Location                      scheme_name
Amazon S3™                           s3
Windows Azure® Blob Storage          wasb, wasbs

For more information, see Work with Remote Data.

Example: 's3://bucketname/path_to_file/myFile.h5'

  • If filename does not already exist, h5create creates it.

  • If you specify an existing HDF5 file name and a new dataset name, then h5create adds the new dataset to the existing file.

ds

Dataset name, specified as a character vector or string scalar containing the full path name of the dataset to be created. If you specify intermediate groups in the dataset name and they do not already exist, then h5create creates them.
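For instance, in this sketch (the file, group, and dataset names are illustrative), the intermediate groups /level1 and /level1/level2 are created automatically along with the dataset:

% Creates the groups /level1 and /level1/level2 if they do not already exist
h5create('groups.h5','/level1/level2/data',[10 10])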

sz

Dataset size, specified as a row vector. To specify an unlimited dimension, set the corresponding element of sz to Inf. In this case, you must also specify ChunkSize.
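For example, this sketch (names are illustrative) creates a dataset that can grow without limit along its second dimension:

% The second dimension is unlimited, so ChunkSize is required
h5create('log.h5','/timeseries',[3 Inf],'ChunkSize',[3 1024])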

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: h5create("myFile.h5","/dataset1",[1000 2000],ChunkSize=[50 80],CustomFilterID=307,CustomFilterParameters=6) creates the dataset dataset1 in the HDF5 file myFile.h5 using 50-by-80 chunks, the registered bzip2 filter (identifier 307), and a compression block size of 6.

Datatype

Data type of the dataset, specified as one of the following MATLAB® data types.

  • 'double'

  • 'single'

  • 'uint64'

  • 'int64'

  • 'uint32'

  • 'int32'

  • 'uint16'

  • 'int16'

  • 'uint8'

  • 'int8'

  • 'string'

ChunkSize

Chunk size, specified as a row vector containing the dimensions of the chunk. The length of ChunkSize must equal the length of sz, and each entry of ChunkSize must be less than or equal to the corresponding entry of sz. If any element of sz is Inf, you must specify ChunkSize.
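As a brief sketch (names are illustrative), each chunk dimension here is no larger than the corresponding dataset dimension:

% 25-by-30 chunks tile the 200-by-300 dataset
h5create('grid.h5','/field',[200 300],'ChunkSize',[25 30])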

Deflate

gzip compression level, specified as a numeric value between 0 and 9, where 0 is the lowest compression level and 9 is the highest.

FillValue

Fill value for missing data in numeric datasets, specified as a numeric value.
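For example, this sketch (names are illustrative) makes unwritten elements of the dataset read back as -1:

h5create('filled.h5','/grid',[50 50],'FillValue',-1)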

Fletcher32

32-bit Fletcher checksum filter, specified as a numeric or logical 1 (true) or 0 (false). A Fletcher checksum filter is designed to verify that the transferred data in a file is error-free.

Shuffle

Shuffle filter, specified as a numeric or logical 1 (true) or 0 (false). A shuffle filter is an algorithm designed to improve the compression ratio by rearranging the byte order of data stored in memory.
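A sketch combining these filters (names are illustrative); like compression, the shuffle and Fletcher32 filters require chunked storage:

% Shuffle bytes before gzip compression and checksum each chunk
h5create('filters.h5','/sensor',[500 500],'ChunkSize',[50 50], ...
    'Deflate',5,'Shuffle',true,'Fletcher32',true)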

TextEncoding

Text encoding, specified as one of these values (see the sketch after this list):

  • 'UTF-8' — Represent characters using UTF-8 encoding.

  • 'system' — Represent characters as bytes using the system encoding (not recommended).
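A sketch of a UTF-8 encoded string dataset (the file and dataset names are illustrative):

h5create('text.h5','/labels',[10 1],'Datatype','string','TextEncoding','UTF-8')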

CustomFilterID

Filter identifier for the registered filter plugin assigned by The HDF Group, specified as a positive integer. For a list of registered filters, see the Filters page on The HDF Group website.

If you do not specify this argument, the dataset does not use dynamically loaded filters for compression.

Data Types: double

CustomFilterParameters

Filter parameters for third-party filters, specified as a numeric scalar or row vector. If you specify CustomFilterID without also specifying this argument, the h5create function passes an empty vector to the HDF5 library and the filter uses default parameters.

This name-value argument corresponds to the cd_values argument of the H5Pset_filter function in the HDF5 library.

Data Types: double

Limitations

  • h5create does not support creating files stored remotely in HDFS™.

More About


Chunk Storage in HDF5

Chunk storage refers to a method of storing a dataset by dividing it into smaller pieces of data known as chunks, each of which is stored separately in the HDF5 file. Chunking a dataset can improve performance when operating on a subset of the dataset, since the chunks can be read and written individually.
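For example, in this sketch (the file and dataset names are illustrative), reading a small block of a chunked dataset touches only the chunks that overlap the requested region:

h5create('chunked.h5','/img',[1000 1000],'ChunkSize',[100 100])
h5write('chunked.h5','/img',rand(1000,1000))
% Read a 20-by-20 block starting at row 401, column 201
block = h5read('chunked.h5','/img',[401 201],[20 20]);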

Version History

Introduced in R2011a
