pub struct Repository<ObjectID: FsVerityHashValue> {
repository: OwnedFd,
objects: OnceCell<OwnedFd>,
write_semaphore: OnceCell<Arc<Semaphore>>,
insecure: bool,
_data: PhantomData<ObjectID>,
}
A content-addressable repository for composefs objects.
Stores content-addressed objects, splitstreams, and images with fsverity verification. Objects are stored by their fsverity digest, streams by SHA256 content hash, and both support named references for persistence across garbage collection.
Fields§
repository: OwnedFd
objects: OnceCell<OwnedFd>
write_semaphore: OnceCell<Arc<Semaphore>>
insecure: bool
_data: PhantomData<ObjectID>
Implementations§
impl<ObjectID: FsVerityHashValue> Repository<ObjectID>
pub fn objects_dir(&self) -> ErrnoResult<&OwnedFd>
Return the objects directory.
pub fn write_semaphore(&self) -> Arc<Semaphore>
Return a shared semaphore for limiting concurrent object writes.
This semaphore is lazily initialized with available_parallelism() permits,
and shared across all operations on this repository. Use this to limit
concurrent I/O when processing multiple files or layers in parallel.
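The lazy initialization described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: `Repo` and this `Semaphore` are hypothetical stand-ins (the real `Semaphore` type is not shown on this page), and `std::sync::OnceLock` is used here in place of the `OnceCell` named in the struct definition.

```rust
use std::num::NonZeroUsize;
use std::sync::{Arc, Condvar, Mutex, OnceLock};
use std::thread;

/// Minimal counting semaphore (hypothetical stand-in for the real type).
pub struct Semaphore {
    permits: Mutex<usize>,
    cond: Condvar,
}

impl Semaphore {
    pub fn new(permits: usize) -> Self {
        Self { permits: Mutex::new(permits), cond: Condvar::new() }
    }

    /// Block until a permit is free, then take it.
    pub fn acquire(&self) {
        let mut n = self.permits.lock().unwrap();
        while *n == 0 {
            n = self.cond.wait(n).unwrap();
        }
        *n -= 1;
    }

    /// Return a permit and wake one waiter.
    pub fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.cond.notify_one();
    }

    /// Number of permits currently free.
    pub fn available(&self) -> usize {
        *self.permits.lock().unwrap()
    }
}

/// Hypothetical holder mirroring the lazily initialized field.
#[derive(Default)]
pub struct Repo {
    write_semaphore: OnceLock<Arc<Semaphore>>,
}

impl Repo {
    /// The first call creates the semaphore; later calls clone the same Arc.
    pub fn write_semaphore(&self) -> Arc<Semaphore> {
        self.write_semaphore
            .get_or_init(|| {
                // Fall back to a single permit if the parallelism is unknown.
                let n = thread::available_parallelism()
                    .map(NonZeroUsize::get)
                    .unwrap_or(1);
                Arc::new(Semaphore::new(n))
            })
            .clone()
    }
}
```

Because every caller receives a clone of the same `Arc`, the permit count bounds concurrent writes across the whole repository rather than per operation.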
pub fn open_path(dirfd: impl AsFd, path: impl AsRef<Path>) -> Result<Self>
Open a repository at the target directory and path.
pub fn open_system() -> Result<Self>
Open the default system-global composefs repository.
fn ensure_dir(&self, dir: impl AsRef<Path>) -> ErrnoResult<()>
pub async fn ensure_object_async( self: &Arc<Self>, data: Vec<u8>, ) -> Result<ObjectID>
Asynchronously ensures an object exists in the repository.
Same as ensure_object but runs the operation on a blocking thread pool
to avoid blocking async tasks. Returns the fsverity digest of the object.
For performance reasons, this function does not call fsync() or similar. After you’re
done with everything, call Repository::sync_async().
pub fn create_object_tmpfile(&self) -> Result<OwnedFd>
Create an O_TMPFILE in the objects directory for streaming writes.
Returns the file descriptor for writing. The caller should write data to this fd,
then call spawn_finalize_object_tmpfile() to compute the verity digest,
enable fs-verity, and link the file into the objects directory.
pub fn spawn_finalize_object_tmpfile( self: &Arc<Self>, tmpfile_fd: OwnedFd, size: u64, ) -> JoinHandle<Result<ObjectID>>
Spawn a background task that finalizes a tmpfile as an object.
The task computes the fs-verity digest by reading the file, enables verity, and links the file into the objects directory.
Returns a handle that resolves to the ObjectID (fs-verity digest).
§Arguments
- tmpfile_fd - The O_TMPFILE file descriptor with data already written
- size - The exact size in bytes of the data written to the tmpfile
pub fn finalize_object_tmpfile(&self, file: File, size: u64) -> Result<ObjectID>
Finalize a tmpfile as an object.
This method should be called from a blocking context (e.g., spawn_blocking)
as it performs synchronous I/O operations.
This method:
- Re-opens the file as read-only
- Enables fs-verity on the file (kernel computes digest)
- Reads the digest from the kernel
- Checks if object already exists (deduplication)
- Links the file into the objects directory
By letting the kernel compute the digest during verity enable, we avoid reading the file an extra time in userspace.
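The deduplicate-then-link part of these steps can be sketched in plain Rust. This is an illustration under loud assumptions: `finalize` and `stand_in_digest` are hypothetical names, the digest is a non-cryptographic stand-in computed in userspace (the real method lets the kernel compute the fs-verity digest), and a regular temporary file plus `hard_link` takes the place of an O_TMPFILE descriptor.

```rust
use std::collections::hash_map::DefaultHasher;
use std::fs;
use std::hash::Hasher;
use std::io;
use std::path::Path;

/// Stand-in digest: NOT fs-verity and not cryptographic. It only gives
/// identical content an identical name for the purpose of this sketch.
fn stand_in_digest(path: &Path) -> io::Result<String> {
    let data = fs::read(path)?;
    let mut hasher = DefaultHasher::new();
    hasher.write(&data);
    Ok(format!("{:016x}", hasher.finish()))
}

/// Link `tmp` into `objects_dir` under its digest and remove the temporary
/// name. Returns the digest and whether a new object was stored.
pub fn finalize(objects_dir: &Path, tmp: &Path) -> io::Result<(String, bool)> {
    let digest = stand_in_digest(tmp)?;
    let dest = objects_dir.join(&digest);
    let newly_stored = match fs::hard_link(tmp, &dest) {
        Ok(()) => true,
        // An object with this digest already exists: deduplicated.
        Err(e) if e.kind() == io::ErrorKind::AlreadyExists => false,
        Err(e) => return Err(e),
    };
    fs::remove_file(tmp)?;
    Ok((digest, newly_stored))
}
```

Treating "already exists" as success is what makes the operation idempotent: two writers racing to store the same content both end up with the same object ID.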
fn compute_verity_digest(reader: &mut impl BufRead) -> Result<ObjectID>
Compute fs-verity digest in userspace by reading from a buffered source. Used as fallback when kernel verity is not available (insecure mode).
fn store_object_with_id(&self, data: &[u8], id: &ObjectID) -> Result<()>
Store an object with a pre-computed fs-verity ID.
This is an internal helper that stores data assuming the caller has already computed the correct fs-verity digest. The digest is verified after storage.
pub fn ensure_object(&self, data: &[u8]) -> Result<ObjectID>
Given a blob of data, store it in the repository.
For performance reasons, this function does not call fsync() or similar. After you’re
done with everything, call Repository::sync().
fn open_with_verity( &self, filename: &str, expected_verity: &ObjectID, ) -> Result<OwnedFd>
pub fn set_insecure(&mut self, insecure: bool) -> &mut Self
By default, fsverity is required to be enabled on the target
filesystem. Setting this to true disables verification of digests,
so that an instance of Self can be used on a filesystem
without fsverity support.
pub fn create_stream( self: &Arc<Self>, content_type: u64, ) -> SplitStreamWriter<ObjectID>
Creates a SplitStreamWriter for writing a split stream. You should write the data to the returned object and then pass it to .write_stream() to store the result.
fn format_object_path(id: &ObjectID) -> String
fn format_stream_path(content_identifier: &str) -> String
pub fn has_stream(&self, content_identifier: &str) -> Result<Option<ObjectID>>
Check if the provided splitstream is present in the repository; if so, return its fsverity digest.
pub fn write_stream( &self, writer: SplitStreamWriter<ObjectID>, content_identifier: &str, reference: Option<&str>, ) -> Result<ObjectID>
Write the given splitstream to the repository with the provided content identifier and optional reference name.
This call contains an internal barrier that guarantees that, in event of a crash, either:
- the named stream (by content_identifier) will not be available; or
- the stream and all of its linked data will be available
In other words: it will not be possible to boot a system that contains a stream named
content_identifier but is missing linked streams or objects from that stream.
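The ordering behind this guarantee (store, flush, then publish the name) can be sketched as follows. This is a simplified illustration, not the crate's code: `publish_stream` and the `objects`/`streams` layout are hypothetical, and per-file `fsync` via `sync_all` stands in for the barrier the repository uses internally.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::os::unix::fs::symlink;
use std::path::Path;

/// Write `data` to `root/objects/<object_name>`, flush it to stable storage,
/// and only then publish `root/streams/<stream_name>` pointing at it.
pub fn publish_stream(
    root: &Path,
    object_name: &str,
    stream_name: &str,
    data: &[u8],
) -> io::Result<()> {
    let objects = root.join("objects");
    let streams = root.join("streams");
    fs::create_dir_all(&objects)?;
    fs::create_dir_all(&streams)?;

    // 1. Store the data under its object name.
    let mut file = File::create(objects.join(object_name))?;
    file.write_all(data)?;

    // 2. Barrier: flush the file contents, then the directory entry.
    file.sync_all()?;
    File::open(&objects)?.sync_all()?;

    // 3. Publish the name last, so a crash before this point leaves no
    //    stream name that points at missing data.
    symlink(
        Path::new("../objects").join(object_name),
        streams.join(stream_name),
    )?;
    File::open(&streams)?.sync_all()?;
    Ok(())
}
```

A crash between steps 2 and 3 leaves an unreferenced object behind, which is harmless: unreferenced objects are exactly what garbage collection removes.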
pub async fn register_stream( self: &Arc<Self>, object_id: &ObjectID, content_identifier: &str, reference: Option<&str>, ) -> Result<()>
Register an already-stored object as a named stream.
This is useful when using SplitStreamBuilder which stores the splitstream
directly via finish(). After calling finish(), call this method to
sync all data to disk and create the stream symlink.
This method ensures atomicity: the stream symlink is only created after all objects have been synced to disk.
pub async fn write_stream_async( self: &Arc<Self>, writer: SplitStreamWriter<ObjectID>, content_identifier: &str, reference: Option<&str>, ) -> Result<ObjectID>
Async version of write_stream for use with parallel object storage.
This method awaits any pending parallel object storage tasks before
finalizing the stream. Use this when you’ve called write_external_parallel()
on the writer.
pub fn has_named_stream(&self, name: &str) -> Result<bool>
Check if a splitstream with a given name exists in the “refs” in the repository.
pub fn name_stream(&self, content_identifier: &str, name: &str) -> Result<()>
Assign the given name to a stream. The stream must already exist. After this operation it will be possible to refer to the stream by its new name ‘refs/{name}’.
pub fn ensure_stream( self: &Arc<Self>, content_identifier: &str, content_type: u64, callback: impl FnOnce(&mut SplitStreamWriter<ObjectID>) -> Result<()>, reference: Option<&str>, ) -> Result<ObjectID>
Ensures that the stream with a given content identifier digest exists in the repository.
This tries to find the stream by the content identifier. If the stream is already in the
repository, the object ID (fs-verity digest) is read from the symlink. If the stream is
not already in the repository, a SplitStreamWriter is created and passed to callback.
On return, the object ID of the stream will be calculated and it will be written to disk
(if it wasn’t already created by someone else in the meantime).
In both cases, if reference is provided, it is used to provide a fixed name for the
object. Any object that doesn’t have a fixed reference to it is subject to garbage
collection. It is an error if this reference already exists.
On success, the object ID of the new object is returned. It is expected that this object ID will be used when referring to the stream from other linked streams.
pub fn open_stream( &self, content_identifier: &str, verity: Option<&ObjectID>, expected_content_type: Option<u64>, ) -> Result<SplitStreamReader<ObjectID>>
Open a splitstream with the given name.
pub fn open_object(&self, id: &ObjectID) -> Result<OwnedFd>
Given an object identifier (a digest), return a read-only file descriptor
for its contents. The fsverity digest is verified (if the repository is not in insecure mode).
pub fn read_object(&self, id: &ObjectID) -> Result<Vec<u8>>
Read the contents of an object into a Vec.
pub fn merge_splitstream( &self, content_identifier: &str, verity: Option<&ObjectID>, expected_content_type: Option<u64>, output: &mut impl Write, ) -> Result<()>
Merges a splitstream into a single continuous stream.
Opens the named splitstream, resolves all object references, and writes the complete merged content to the provided writer. Optionally verifies the splitstream’s fsverity digest matches the expected value.
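The merge step can be pictured with a deliberately simplified model. Everything here is hypothetical: the `Chunk` enum, the `merge` function, and the in-memory `HashMap` object store are illustrative only and do not reflect the actual on-disk splitstream format or the repository's lookup path.

```rust
use std::collections::HashMap;
use std::io::{self, Write};

/// Hypothetical, simplified chunk model: data is either stored inline in
/// the stream or referenced by object ID.
pub enum Chunk {
    Inline(Vec<u8>),
    External(String),
}

/// Write inline chunks as-is and replace each reference with the content
/// of the referenced object, producing one continuous stream.
pub fn merge(
    chunks: &[Chunk],
    objects: &HashMap<String, Vec<u8>>,
    output: &mut impl Write,
) -> io::Result<()> {
    for chunk in chunks {
        match chunk {
            Chunk::Inline(bytes) => output.write_all(bytes)?,
            Chunk::External(id) => {
                // A reference to a missing object is an error, not a gap.
                let data = objects.get(id).ok_or_else(|| {
                    io::Error::new(io::ErrorKind::NotFound, format!("missing object {id}"))
                })?;
                output.write_all(data)?;
            }
        }
    }
    Ok(())
}
```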
pub fn write_image(&self, name: Option<&str>, data: &[u8]) -> Result<ObjectID>
Write data into the repository as an image with the given name.
The fsverity digest is returned.
§Integrity
This function is not safe for untrusted users.
pub fn import_image<R: Read>( &self, name: &str, image: &mut R, ) -> Result<ObjectID>
Import the data from the provided reader into the repository as an image.
The fsverity digest is returned.
§Integrity
This function is not safe for untrusted users.
fn open_image(&self, name: &str) -> Result<(OwnedFd, bool)>
Returns the fd of the image and whether or not verity should be enabled when mounting it.
pub fn mount(&self, name: &str) -> Result<OwnedFd>
Create a detached mount of an image. This file descriptor can then
be attached via e.g. move_mount.
pub fn mount_at(&self, name: &str, mountpoint: impl AsRef<Path>) -> Result<()>
Mount the image with the provided digest at the target path.
pub fn symlink( &self, name: impl AsRef<Path>, target: impl AsRef<Path>, ) -> ErrnoResult<()>
Creates a relative symlink within the repository.
Computes the correct relative path from the symlink location to the target, creating any necessary intermediate directories. Atomically replaces any existing symlink at the specified name.
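The two ideas in this description, computing a relative target and replacing a link atomically, can be sketched with the standard library. This is one possible approach rather than the crate's implementation: `relative_target` and `symlink_in` are hypothetical names, both paths are assumed to be normalized and relative to the same root (no `.` or `..` components), and the `.tmp` suffix for the temporary link is an arbitrary choice.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::{Component, Path, PathBuf};

/// Relative path from the directory containing `name` to `target`.
/// Both inputs are relative to the same root.
pub fn relative_target(name: &Path, target: &Path) -> PathBuf {
    let from: Vec<Component> = name
        .parent()
        .unwrap_or(Path::new(""))
        .components()
        .collect();
    let to: Vec<Component> = target.components().collect();
    // Length of the shared leading directory prefix.
    let common = from.iter().zip(&to).take_while(|(a, b)| a == b).count();
    let mut rel = PathBuf::new();
    // Climb out of the directories that are not shared...
    for _ in common..from.len() {
        rel.push("..");
    }
    // ...then descend into the rest of the target.
    for part in &to[common..] {
        rel.push(part);
    }
    rel
}

/// Create `root/name` as a relative symlink to `root/target`, replacing
/// any existing link at that name.
pub fn symlink_in(root: &Path, name: &Path, target: &Path) -> io::Result<()> {
    let link = root.join(name);
    if let Some(dir) = link.parent() {
        fs::create_dir_all(dir)?;
    }
    // Create the link under a temporary name, then rename it over the
    // final name: rename replaces an existing symlink atomically.
    let mut tmp_name = link.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp = PathBuf::from(tmp_name);
    let _ = fs::remove_file(&tmp); // clear a stale temporary from an earlier run
    symlink(relative_target(name, target), &tmp)?;
    fs::rename(&tmp, &link)
}
```

Relative targets keep the links valid when the repository directory is moved or mounted at a different path.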
fn read_symlink_hashvalue(dirfd: &OwnedFd, name: &CStr) -> Result<ObjectID>
fn walk_symlinkdir(fd: OwnedFd, objects: &mut HashSet<ObjectID>) -> Result<()>
fn openat(&self, name: &str, flags: OFlags) -> ErrnoResult<OwnedFd>
Open the provided path in the repository.
fn gc_category(&self, category: &str) -> Result<HashSet<ObjectID>>
pub fn objects_for_image(&self, name: &str) -> Result<HashSet<ObjectID>>
Given an image, return the set of all objects referenced by it.
pub fn sync(&self) -> Result<()>
Makes sure all content is written to the repository.
This is currently just syncfs() on the repository’s root directory because we don’t have any better options at present. This blocks until the data is written out.
pub async fn sync_async(self: &Arc<Self>) -> Result<()>
Makes sure all content is written to the repository.
This is currently just syncfs() on the repository’s root directory because we don’t have any better options at present. This won’t return until the data is written out.