Array serialization format identifier for NDArray fields in sample schemas. Known values correspond to token definitions in this Lexicon. Each format has versioned specifications maintained by alt.science at canonical URLs.
Constraints: maxLength: 50 bytes
Known values: ndarrayBytes, sparseBytes, structuredBytes, arrowTensor, safetensors
Raw schema:
{
  "type": "string",
  "maxLength": 50,
  "description": "Array serialization format identifier for NDArray fields in sample schemas. Known values correspond to token definitions in this Lexicon. Each format has versioned specifications maintained by alt.science at canonical URLs.",
  "knownValues": [
    "ndarrayBytes",
    "sparseBytes",
    "structuredBytes",
    "arrowTensor",
    "safetensors"
  ]
}
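As a rough illustration of the constraints above, the sketch below checks a candidate value against the 50-byte limit and the known-values list. It assumes that `knownValues` is an open set (unlisted values are still valid, which is how Lexicon string fields usually treat it) and that the limit is counted in UTF-8 bytes. The helper name `check_array_format` is hypothetical, and this page does not settle whether records store the short name or the full token reference, so the list is used exactly as the schema gives it.

```python
# Known values copied from the schema above.
KNOWN_ARRAY_FORMATS = (
    "ndarrayBytes",
    "sparseBytes",
    "structuredBytes",
    "arrowTensor",
    "safetensors",
)
MAX_LENGTH_BYTES = 50


def check_array_format(value):
    """Return (valid, known) for a candidate arrayFormat value.

    Hypothetical helper: 'valid' means the value is a string within the
    byte limit; 'known' means it also appears in the known-values list.
    """
    if not isinstance(value, str):
        return (False, False)
    # Assumption: maxLength counts UTF-8 bytes, not characters.
    valid = len(value.encode("utf-8")) <= MAX_LENGTH_BYTES
    known = valid and value in KNOWN_ARRAY_FORMATS
    return (valid, known)
```

A consumer written this way can accept a format it does not recognise and decide separately whether it is able to decode it.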
Arrow tensor format. Stores multi-dimensional arrays using Arrow's tensor IPC format. Versions maintained at https://json-schema.alt.science/atdata-arrow-tensor/{version}/
Reference: science.alt.dataset.arrayFormat#arrowTensor
Tokens have no data representation. Use the reference string as a value.
Raw schema:
{
  "type": "token",
  "description": "Arrow tensor format. Stores multi-dimensional arrays using Arrow's tensor IPC format. Versions maintained at https://json-schema.alt.science/atdata-arrow-tensor/{version}/"
}
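Every token in this Lexicon follows the same pattern: a reference string of the form `science.alt.dataset.arrayFormat#<name>` and a specification URL containing a `{version}` placeholder. The sketch below builds both. The function names are hypothetical, and the version string passed in the usage lines is a placeholder, since this page does not list any published versions.

```python
LEXICON_NSID = "science.alt.dataset.arrayFormat"

# URL templates copied from the token descriptions in this Lexicon.
SPEC_URL_TEMPLATES = {
    "arrowTensor": "https://json-schema.alt.science/atdata-arrow-tensor/{version}/",
    "ndarrayBytes": "https://json-schema.alt.science/atdata-ndarray-bytes/{version}/",
    "safetensors": "https://json-schema.alt.science/atdata-safetensors/{version}/",
    "sparseBytes": "https://json-schema.alt.science/atdata-sparse-bytes/{version}/",
    "structuredBytes": "https://json-schema.alt.science/atdata-structured-bytes/{version}/",
}


def token_reference(name):
    """Return the reference string for a token defined in this Lexicon."""
    if name not in SPEC_URL_TEMPLATES:
        raise KeyError(f"no token named {name!r} in {LEXICON_NSID}")
    return f"{LEXICON_NSID}#{name}"


def spec_url(name, version):
    """Fill the {version} placeholder in a token's specification URL."""
    return SPEC_URL_TEMPLATES[name].format(version=version)


print(token_reference("arrowTensor"))
# "1.0.0" is a placeholder, not a version known to exist.
print(spec_url("arrowTensor", "1.0.0"))
```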
NumPy .npy binary format for NDArray serialization. Stores arrays with dtype and shape in a binary header. Versions maintained at https://json-schema.alt.science/atdata-ndarray-bytes/{version}/
Reference: science.alt.dataset.arrayFormat#ndarrayBytes
Tokens have no data representation. Use the reference string as a value.
Raw schema:
{
  "type": "token",
  "description": "Numpy .npy binary format for NDArray serialization. Stores arrays with dtype and shape in binary header. Versions maintained at https://json-schema.alt.science/atdata-ndarray-bytes/{version}/"
}
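To make the "dtype and shape in the header" point concrete, here is a standard-library sketch of the general .npy layout: a magic string, a two-byte version, a little-endian header length, and a Python-literal dictionary with `descr`, `fortran_order`, and `shape`. It illustrates the NumPy file format in general; the versioned ndarrayBytes specification may add constraints not shown here. The helper names are hypothetical, and in practice `numpy.save` and `numpy.load` handle all of this.

```python
import ast
import struct

NPY_MAGIC = b"\x93NUMPY"


def build_npy_v1(descr, shape, payload):
    """Assemble a version 1.0 .npy blob from a dtype string, a shape, and raw bytes."""
    header = repr({"descr": descr, "fortran_order": False, "shape": shape})
    # Pad with spaces and end with a newline so the data starts on a
    # 64-byte boundary (magic + version + length field = 10 bytes).
    pad = -(10 + len(header) + 1) % 64
    header_bytes = (header + " " * pad + "\n").encode("latin1")
    return (NPY_MAGIC + bytes([1, 0])
            + struct.pack("<H", len(header_bytes)) + header_bytes + payload)


def read_npy_header(blob):
    """Return (header_dict, data_offset) for a .npy blob."""
    if blob[:6] != NPY_MAGIC:
        raise ValueError("not a .npy blob")
    major = blob[6]
    if major == 1:
        (header_len,) = struct.unpack_from("<H", blob, 8)  # 2-byte length
        start = 10
    elif major in (2, 3):
        (header_len,) = struct.unpack_from("<I", blob, 8)  # 4-byte length
        start = 12
    else:
        raise ValueError(f"unsupported .npy major version {major}")
    encoding = "utf-8" if major == 3 else "latin1"
    text = blob[start:start + header_len].decode(encoding).strip()
    return ast.literal_eval(text), start + header_len


payload = struct.pack("<3d", 1.0, 2.0, 3.0)  # three little-endian float64 values
blob = build_npy_v1("<f8", (3,), payload)
header, offset = read_npy_header(blob)
print(header, struct.unpack_from("<3d", blob, offset))
```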
Safetensors format (Hugging Face). Stores ML tensors with safe, memory-mapped access. Versions maintained at https://json-schema.alt.science/atdata-safetensors/{version}/
Reference: science.alt.dataset.arrayFormat#safetensors
Tokens have no data representation. Use the reference string as a value.
Raw schema:
{
  "type": "token",
  "description": "Safetensors format (HuggingFace). Stores ML tensors with safe, memory-mapped access. Versions maintained at https://json-schema.alt.science/atdata-safetensors/{version}/"
}
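The general safetensors layout is an 8-byte little-endian header length, a JSON header mapping each tensor name to its `dtype`, `shape`, and `data_offsets`, and then one contiguous byte buffer. The standard-library sketch below shows why that layout needs no code execution to read and why a tensor can be located by offset alone. It is a simplified illustration with hypothetical helper names: it omits the validation and header padding that real implementations perform, and the versioned specification linked above may add further rules.

```python
import json
import struct


def build_safetensors(tensors):
    """Build a blob from {name: (dtype, shape, raw_bytes)}."""
    header = {}
    buffer = b""
    for name, (dtype, shape, raw) in tensors.items():
        # Offsets are relative to the start of the byte buffer, not the file.
        header[name] = {
            "dtype": dtype,
            "shape": list(shape),
            "data_offsets": [len(buffer), len(buffer) + len(raw)],
        }
        buffer += raw
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + buffer


def read_safetensors(blob):
    """Return {name: (dtype, shape, raw_bytes)} from a blob."""
    (header_len,) = struct.unpack_from("<Q", blob, 0)
    header = json.loads(blob[8:8 + header_len].decode("utf-8"))
    data = blob[8 + header_len:]
    tensors = {}
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        begin, end = info["data_offsets"]
        tensors[name] = (info["dtype"], tuple(info["shape"]), data[begin:end])
    return tensors


raw = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)  # a 2x2 float32 tensor
blob = build_safetensors({"weights": ("F32", (2, 2), raw)})
print(read_safetensors(blob)["weights"][:2])
```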
SciPy sparse matrix format (CSR/CSC/COO). Stores sparse matrices with indices and data arrays. Versions maintained at https://json-schema.alt.science/atdata-sparse-bytes/{version}/
Reference: science.alt.dataset.arrayFormat#sparseBytes
Tokens have no data representation. Use the reference string as a value.
Raw schema:
{
  "type": "token",
  "description": "Scipy sparse matrix format (CSR/CSC/COO). Stores sparse matrices with indices and data arrays. Versions maintained at https://json-schema.alt.science/atdata-sparse-bytes/{version}/"
}
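As an illustration of "indices and data arrays", the sketch below converts a small dense matrix to CSR form and back using plain lists. The three arrays carry the same names as the attributes on a SciPy CSR matrix (`data`, `indices`, `indptr`). How sparseBytes lays these arrays out as bytes is defined by the versioned specification and is not shown here; the function names are hypothetical.

```python
def dense_to_csr(rows):
    """Convert a list of rows to CSR arrays (data, indices, indptr)."""
    data, indices, indptr = [], [], [0]
    for row in rows:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)   # the non-zero value itself
                indices.append(col)  # its column index
        indptr.append(len(data))     # where the next row starts in data
    return data, indices, indptr


def csr_to_dense(data, indices, indptr, n_cols):
    """Rebuild the dense rows from CSR arrays."""
    rows = []
    for r in range(len(indptr) - 1):
        row = [0] * n_cols
        for k in range(indptr[r], indptr[r + 1]):
            row[indices[k]] = data[k]
        rows.append(row)
    return rows


dense = [
    [0, 0, 3],
    [4, 0, 0],
    [0, 0, 0],
    [0, 5, 6],
]
data, indices, indptr = dense_to_csr(dense)
print(data)     # [3, 4, 5, 6]
print(indices)  # [2, 0, 1, 2]
print(indptr)   # [0, 1, 2, 2, 4]
```

The empty third row costs only one repeated entry in `indptr`, which is the storage saving the format exists for.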
NumPy structured array format. Stores arrays with named, typed fields (compound dtypes). Versions maintained at https://json-schema.alt.science/atdata-structured-bytes/{version}/
Reference: science.alt.dataset.arrayFormat#structuredBytes
Tokens have no data representation. Use the reference string as a value.
Raw schema:
{
  "type": "token",
  "description": "Numpy structured array format. Stores arrays with named, typed fields (compound dtypes). Versions maintained at https://json-schema.alt.science/atdata-structured-bytes/{version}/"
}
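A compound dtype is essentially a fixed-size record with named, typed fields. The standard-library sketch below packs such records with `struct` to show the idea. The field list is hypothetical, and the layout corresponds to a packed NumPy dtype of `[("id", "<i4"), ("x", "<f8"), ("y", "<f8")]`; the actual structuredBytes encoding, including how the dtype description travels with the data, is defined by the versioned specification.

```python
import struct

# Hypothetical record layout: (field name, struct type code).
FIELDS = (("id", "i"), ("x", "d"), ("y", "d"))

# "<" selects little-endian with no alignment padding: 4 + 8 + 8 = 20 bytes.
RECORD = struct.Struct("<" + "".join(code for _, code in FIELDS))


def pack_records(records):
    """Pack a list of dicts into contiguous fixed-size records."""
    return b"".join(
        RECORD.pack(*(rec[name] for name, _ in FIELDS)) for rec in records
    )


def unpack_records(raw):
    """Unpack contiguous records back into dicts keyed by field name."""
    names = [name for name, _ in FIELDS]
    return [dict(zip(names, values)) for values in RECORD.iter_unpack(raw)]


records = [
    {"id": 1, "x": 0.5, "y": -2.0},
    {"id": 2, "x": 1.25, "y": 3.0},
]
raw = pack_records(records)
print(RECORD.size, len(raw))
print(unpack_records(raw))
```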