
h5create

Create HDF5 dataset

Description


h5create(filename,ds,sz) creates a dataset ds whose name includes its full location in the HDF5 file filename, and with a size specified by sz.


h5create(filename,ds,sz,Name=Value) specifies one or more optional name-value arguments.

For example, ChunkSize=[5 5] specifies 5-by-5 chunks of the dataset that can be stored individually in the HDF5 file.
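A minimal sketch of this syntax (the file and dataset names here are illustrative):

h5create("example.h5","/chunkedData",[100 100],ChunkSize=[5 5])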

Examples


Create a fixed-size 100-by-200-by-300 dataset 'myDataset' whose full path is specified as '/g1/g2/myDataset'.

h5create('myfile.h5','/g1/g2/myDataset',[100 200 300])

Write data to 'myDataset'. Since the dimensions of 'myDataset' are fixed, the amount of data to be written to it must match its size.

mydata = ones(100,200,300);
h5write('myfile.h5','/g1/g2/myDataset',mydata)
h5disp('myfile.h5')
HDF5 myfile.h5 
Group '/' 
    Group '/g1' 
        Group '/g1/g2' 
            Dataset 'myDataset' 
                Size:  100x200x300
                MaxSize:  100x200x300
                Datatype:   H5T_IEEE_F64LE (double)
                ChunkSize:  []
                Filters:  none
                FillValue:  0.000000

Create a single-precision 1000-by-2000 dataset and apply the highest level of compression. Chunk storage must be used when applying HDF5 compression.

h5create('myfile.h5','/myDataset2',[1000 2000],'Datatype','single', ...
          'ChunkSize',[50 80],'Deflate',9)

Display the contents of the entire HDF5 file.

h5disp('myfile.h5')
HDF5 myfile.h5 
Group '/' 
    Dataset 'myDataset2' 
        Size:  1000x2000
        MaxSize:  1000x2000
        Datatype:   H5T_IEEE_F32LE (single)
        ChunkSize:  50x80
        Filters:  deflate(9)
        FillValue:  0.000000

Create a two-dimensional dataset '/myDataset3' that is unlimited along the second dimension. ChunkSize must be specified to set any dimension of the dataset to Inf.

h5create('myfile.h5','/myDataset3',[200 Inf],'ChunkSize',[20 20])

Write data to '/myDataset3'. You can write data of any size along the second dimension to '/myDataset3', since its second dimension is unlimited.

mydata = rand(200,500);
h5write('myfile.h5','/myDataset3',mydata,[1 1],[200 500])

Display the entire contents of the HDF5 file.

h5disp('myfile.h5')
HDF5 myfile.h5 
Group '/' 
    Dataset 'myDataset3' 
        Size:  200x500
        MaxSize:  200xInf
        Datatype:   H5T_IEEE_F64LE (double)
        ChunkSize:  20x20
        Filters:  none
        FillValue:  0.000000

Input Arguments


filename

File name, specified as a character vector or string scalar containing the name of an HDF5 file.

Depending on the location you are writing to, filename can take on one of these forms.

Current folder

To write to the current folder, specify the name of the file in filename.

Example: 'myFile.h5'

Other folders

To write to a folder different from the current folder, specify the full or relative path name in filename.

Example: 'C:\myFolder\myFile.h5'

Example: 'myFolder\myFile.h5'

Remote Location

To write to a remote location, filename must contain the full path of the file specified as a uniform resource locator (URL) of the form:

scheme_name://path_to_file/my_file.ext

Based on your remote location, scheme_name can be one of the values in this table.

Remote Location                      scheme_name
Amazon S3™                           s3
Windows Azure® Blob Storage          wasb, wasbs

For more information, see Work with Remote Data.

Example: 's3://bucketname/path_to_file/myFile.h5'

  • If filename does not already exist, h5create creates it.

  • If you specify an existing HDF5 file name and a new dataset name, then h5create adds the new dataset to the existing file.

ds

Dataset name, specified as a character vector or string scalar containing the full path name of the dataset to be created. If you specify intermediate groups in the dataset name and they do not already exist, then h5create creates them.
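For instance, in this sketch (the file, group, and dataset names are illustrative), the intermediate groups /level1 and /level1/level2 are created automatically along with the dataset:

% Creates the groups /level1 and /level1/level2 if they do not already exist
h5create('groups.h5','/level1/level2/data',[10 10])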

sz

Dataset size, specified as a row vector. To specify an unlimited dimension, set the corresponding element of sz to Inf. In this case, you must also specify ChunkSize.
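For example, this sketch (names are illustrative) creates a dataset that can grow without limit along its second dimension:

% The second dimension is unlimited, so ChunkSize is required
h5create('log.h5','/timeseries',[3 Inf],'ChunkSize',[3 1024])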

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: h5create("myFile.h5","/dataset1",[1000 2000],ChunkSize=[50 80],CustomFilterID=307,CustomFilterParameters=6) creates the dataset dataset1 in the HDF5 file myFile.h5 using 50-by-80 chunks, the registered bzip2 filter (identifier 307), and a compression block size of 6.

Datatype

Data type of the dataset, specified as one of the following MATLAB® data types.

  • 'double'

  • 'single'

  • 'uint64'

  • 'int64'

  • 'uint32'

  • 'int32'

  • 'uint16'

  • 'int16'

  • 'uint8'

  • 'int8'

  • 'string'

ChunkSize

Chunk size, specified as a row vector containing the dimensions of the chunk. The length of ChunkSize must equal the length of sz, and each entry of ChunkSize must be less than or equal to the corresponding entry of sz. If any element of sz is Inf, you must specify ChunkSize.
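As a brief sketch (names are illustrative), each chunk dimension here is no larger than the corresponding dataset dimension:

% 25-by-30 chunks tile the 200-by-300 dataset
h5create('grid.h5','/field',[200 300],'ChunkSize',[25 30])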

Deflate

gzip compression level, specified as a numeric value between 0 and 9, where 0 is the lowest compression level and 9 is the highest.

FillValue

Fill value for missing data in numeric datasets, specified as a numeric value.
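For example, this sketch (names are illustrative) makes unwritten elements of the dataset read back as -1:

h5create('filled.h5','/grid',[50 50],'FillValue',-1)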

Fletcher32

32-bit Fletcher checksum filter, specified as a numeric or logical 1 (true) or 0 (false). A Fletcher checksum filter is designed to verify that the transferred data in a file is error-free.

Shuffle

Shuffle filter, specified as a numeric or logical 1 (true) or 0 (false). A shuffle filter is an algorithm designed to improve the compression ratio by rearranging the byte order of data stored in memory.
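A sketch combining these filters (names are illustrative); like compression, the shuffle and Fletcher32 filters require chunked storage:

% Shuffle bytes before gzip compression and checksum each chunk
h5create('filters.h5','/sensor',[500 500],'ChunkSize',[50 50], ...
    'Deflate',5,'Shuffle',true,'Fletcher32',true)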

TextEncoding

Text encoding, specified as one of these values (see the sketch after this list):

  • 'UTF-8' — Represent characters using UTF-8 encoding.

  • 'system' — Represent characters as bytes using the system encoding (not recommended).
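A sketch of a UTF-8 encoded string dataset (the file and dataset names are illustrative):

h5create('text.h5','/labels',[10 1],'Datatype','string','TextEncoding','UTF-8')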

CustomFilterID

Filter identifier for the registered filter plugin assigned by The HDF Group, specified as a positive integer. For a list of registered filters, see the Filters page on The HDF Group website.

If you do not specify this argument, the dataset does not use dynamically loaded filters for compression.

Data Types: double

CustomFilterParameters

Filter parameters for third-party filters, specified as a numeric scalar or row vector. If you specify CustomFilterID without also specifying this argument, the h5create function passes an empty vector to the HDF5 library and the filter uses default parameters.

This name-value argument corresponds to the cd_values argument of the H5Pset_filter function in the HDF5 library.

Data Types: double

Limitations

  • h5create does not support creating files stored remotely in HDFS™.

More About


Chunk Storage in HDF5

Chunk storage refers to a method of storing a dataset by dividing it into smaller pieces of data known as chunks, each of which is stored separately in the HDF5 file. Chunking a dataset can improve performance when operating on a subset of the dataset, since the chunks can be read and written individually.
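For example, in this sketch (the file and dataset names are illustrative), reading a small block of a chunked dataset touches only the chunks that overlap the requested region:

h5create('chunked.h5','/img',[1000 1000],'ChunkSize',[100 100])
h5write('chunked.h5','/img',rand(1000,1000))
% Read a 20-by-20 block starting at row 401, column 201
block = h5read('chunked.h5','/img',[401 201],[20 20]);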

Version History

Introduced in R2011a
