The AI sensor processing SDK for low latency streaming workflows
v2.1.0-dgpu
and v2.1.0-igpu
pip install holoscan==2.1.0
2.1.0.1-1
See supported platforms for compatibility.
A report with execution time statistics for individual operators can now be enabled. This report contains information such as the median, 90th percentile, and maximum times for operator execution. Setting the environment variable HOLOSCAN_ENABLE_GXF_JOB_STATISTICS=true enables this report (it is disabled by default, as statistics collection may introduce a minor performance overhead). For more details, see [the documentation on the feature](https://docs.nvidia.com/holoscan/sdk-user-guide/gxf_job_statistics.html).
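For example, the report can be enabled for a single shell session before launching an application:

```shell
# Enable per-operator execution time statistics for this shell session
# (disabled by default; collection adds a minor performance overhead).
export HOLOSCAN_ENABLE_GXF_JOB_STATISTICS=true
```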
The holoscan.Tensor object's data property in the Python API now returns an integer (pointer address) instead of a NULL PyCapsule object, potentially avoiding confusion about data availability. Users can confirm the presence of data via the __array_interface__ or __cuda_array_interface__ properties. This change allows for direct access to the data pointer, facilitating debugging and performance optimization.
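The integer-pointer convention matches the standard array-interface protocol. The same shape can be seen with NumPy, used here purely as a stand-in for holoscan.Tensor since both expose __array_interface__:

```python
import numpy as np

# NumPy arrays expose the same __array_interface__ protocol that
# holoscan.Tensor implements; the "data" entry is (pointer, read_only).
t = np.arange(6, dtype=np.float32)
ptr, read_only = t.__array_interface__["data"]
assert isinstance(ptr, int)  # a plain integer pointer address
print(hex(ptr))
```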
The string representation of the IOSpec object, generated by IOSpec::to_yaml_node(), includes ConditionType information in the type field. It correctly displays kNone when no condition (ConditionType::kNone in C++ and ConditionType.NONE in Python) is explicitly set:
```yaml
name: receiver
io_type: kInput
typeinfo_name: N8holoscan3gxf6EntityE
connector_type: kDefault
conditions:
  - type: kNone
```
Enhanced the macros (HOLOSCAN_CONDITION_FORWARD_TEMPLATE, HOLOSCAN_RESOURCE_FORWARD_TEMPLATE, HOLOSCAN_OPERATOR_FORWARD_TEMPLATE, etc.) by using the full namespace of the classes, improving their robustness and adaptability across different namespace scopes.
Updated the holoscan.core.py_object_to_arg() method to allow conversion of Python objects to Arg objects using YAML::Node. This resolves type mismatches, such as when the underlying C++ parameter expects an int32_t type but Python uses int64_t.
The Python OperatorSpec/ComponentSpec class exposes the inputs and outputs properties, providing direct access to the input and output IO specs. This enhancement simplifies the process of setting conditions on inputs and outputs.
```python
def setup(self, spec: OperatorSpec):
    spec.input("data")
    # Set the NONE condition on the input port named `data`.
    spec.inputs["data"].condition(ConditionType.NONE)
    print(spec.inputs["data"])
```
Workflows where an operator connects to multiple downstream operators within the same fragment may see a minor performance boost, thanks to an internal refactoring of how connections between operators are made. Previously, a GXF broadcast codelet was automatically inserted into the graph behind the scenes to broadcast the output to multiple receivers. As of this release, a direct 1:N connection is made from the output port, without the framework needing to insert this extra codelet.
fmt::format support for printing the Parameter class has been added (there is no longer a need to call the get() method to print out the contained value). This allows parameter values to be directly printed in HOLOSCAN_LOG_* statements. For example:

```cpp
MetaParameter p = MetaParameter<int>(5);
HOLOSCAN_LOG_INFO("Formatted parameter value: {}", p);

// can also pass the parameter to fmt::format
std::string format_message = fmt::format("{}", p);
```
Most built-in operators now do additional validation of input tensors and will raise more helpful messages if the dimensions, data type, or memory layout of the provided tensors is not as expected. The remaining operators (InferenceOp, InferenceProcessorOp) will be updated in the next release.
BayerDemosaicOp and FormatConverterOp will now automatically perform host-to-device copies if needed for either nvidia::gxf::VideoBuffer or Tensor inputs. Previously, these operators only did the transfer automatically for nvidia::gxf::VideoBuffer, but not for Tensor, and in the case of FormatConverterOp that transfer was only done automatically for pinned host memory. As of this release, both operators will copy only unpinned system memory, leaving pinned host memory as-is.
When creating Python bindings for C++ operators, it is now possible to register custom type conversion functions for user defined C++ types. These handle conversion to and from a corresponding Python type. See the newly expanded section on creating Python bindings for C++ operators for details.
As of this release, all provided Python operators support passing conditions such as CountCondition or PeriodicCondition as positional arguments. In previous releases, there was a limitation that Python operators that wrapped an underlying C++ operator did not support this. As a concrete example, one can now pass a CountCondition to limit the number of frames the visualization operator will run for:
```python
holoviz = HolovizOp(
    self,
    # add a count condition to stop the application after a short duration (i.e., for testing)
    CountCondition(self, count),
    name="holoviz",
    **self.kwargs("holoviz"),
)
```
The AJA NTV2 dependency, and the corresponding AJA Source Operator, have been updated to use the latest official AJA NTV2 17.0.1 release. This new NTV2 version also introduces support for the KONA XM hardware.
The Holoviz operator now supports setting the camera for layers rendered in 3d (geometry layer with 3d primitives and depth map layer). The camera eye, look at and up vectors can be initialized using parameters or dynamically changed at runtime by providing data at the respective input channels. More information can be found in the documentation. There is also a new C++ example holoviz_camera.cpp.
The Holoviz operator now supports different types of camera pose outputs. In addition to the 4x4 row-major projection matrix, a camera extrinsics model of type nvidia::gxf::Pose3D can now also be output. The output type is selected by setting the camera_pose_output_type parameter.
The Holoviz operator now supports Wayland. The run launch command has also been updated to support Wayland.
The inference operator (InferenceOp) now supports a new optional parameter, temporal_map, which can be used to specify a frame interval at which inference will be run. For example, setting a value of 10 for a given model will result in inference only being run on every 10th frame. Intermediate frames will output the result from the most recent frame at which inference was run. The interval value is specified per-model, allowing different inference models to be run at different rates.
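The per-model cadence can be pictured with a toy model in plain Python (an illustration of the interval behavior described above, not the operator's actual implementation):

```python
def runs_inference(frame_index: int, interval: int) -> bool:
    # With temporal_map == interval, inference executes on frames
    # 0, interval, 2*interval, ...; other frames reuse the last result.
    return frame_index % interval == 0

# With an interval of 10, only every 10th frame triggers inference.
inferred = [i for i in range(25) if runs_inference(i, 10)]
print(inferred)  # [0, 10, 20]
```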
The existing asynchronous scheduling condition is now also available from Python (via holoscan.conditions.AsynchronousCondition). For an example of usage, see the new asynchronous ping example.
We introduce the GXFCodeletOp and GXFComponentResource classes, streamlining the import of GXF Codelets and Components into Holoscan applications. These additions simplify the setup process, allowing users to utilize custom GXF components more intuitively and efficiently.
```cpp
auto tx = make_operator<ops::GXFCodeletOp>(
    "tx",
    "nvidia::gxf::test::SendTensor",
    make_condition<CountCondition>(15),
    Arg("pool") = make_resource<GXFComponentResource>(
        "pool",
        "nvidia::gxf::BlockMemoryPool",
        Arg("storage_type") = static_cast<int32_t>(1),
        Arg("block_size") = 1024UL,
        Arg("num_blocks") = 2UL));
```
```python
tx = GXFCodeletOp(
    self,
    "nvidia::gxf::test::SendTensor",
    CountCondition(self, 15),
    name="tx",
    pool=GXFComponentResource(
        self,
        "nvidia::gxf::BlockMemoryPool",
        name="pool",
        storage_type=1,
        block_size=1024,
        num_blocks=2,
    ),
)
```
Please check out the examples in the examples/import_gxf_components directory for more information on how to use these new classes.
When calling op_output.emit from the compute method of a Python operator, it is now possible to provide a third argument that overrides the default choice of the type of object emitted. This is sometimes needed to emit a certain C++ type from a native Python operator when connecting it to a different Python operator that wraps an underlying C++ operator. For example, one could emit a Python string as a C++ std::string instead of the Python string object via op_output.emit(py_str, "port_name", "std::string"). See additional examples and a table of the C++ types registered by default here.
The documentation strings of the built-in operators that support use of a BlockMemoryPool allocator now have detailed descriptions of the necessary block size and number of blocks that will be needed. This information appears under the header "==Device Memory Requirements==".
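As a hypothetical illustration of such sizing arithmetic (the frame dimensions and block count below are made up; consult each operator's "==Device Memory Requirements==" section for its actual formula):

```python
# Hypothetical sizing for a BlockMemoryPool holding one 1080p RGBA8
# frame per block; real requirements are listed per operator under
# "==Device Memory Requirements==".
width, height, channels, bytes_per_channel = 1920, 1080, 4, 1
block_size = width * height * channels * bytes_per_channel
num_blocks = 2  # e.g., allow two frames in flight

print(block_size)  # 8294400 bytes per block
```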
A new data export API (DataExporter and CsvDataExporter) is now available. This API can be used to export Holoscan application output to CSV files for Holoscan Federated Analytics applications. DataExporter is a base class to support exporting Holoscan application output in different formats. CsvDataExporter is a class derived from DataExporter to support exporting Holoscan application output to CSV files.
The container now sets NVIDIA_DRIVER_CAPABILITIES=all by default, removing the need to set it with docker run -e .... That value can still be overridden with a manual -e NVIDIA_DRIVER_CAPABILITIES=....
The run script now checks for X11 and Wayland and passes the corresponding options to the docker command.
Issue | Description |
---|---|
4616519 | Resolved an issue where standalone fragments without UCX connections were not executed. The fix ensures the internal connection map is initialized for each fragment regardless of UCX connections, enhancing reliability and execution consistency. |
4616525 | Addressed a bug where the stop_on_deadlock parameter of the scheduler was not being correctly set to false via the HOLOSCAN_STOP_ON_DEADLOCK environment variable. This fix ensures that the boolean value is accurately set to false when the environment variable is assigned false-like values. |
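The "false-like values" handling fixed in issue 4616525 can be pictured with a small helper (a hypothetical sketch for illustration, not the SDK's actual parser):

```python
def env_to_bool(value: str) -> bool:
    # Treat common "false-like" strings as False and everything else
    # as True; hypothetical sketch, not the SDK's implementation.
    return value.strip().lower() not in {"", "0", "false", "off", "no"}

print(env_to_bool("false"))  # False
print(env_to_bool("1"))      # True
```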
This section supplies details about issues discovered during development and QA but not resolved in this release.
Issue | Description |
---|---|
4062979 | When operators connected in a Directed Acyclic Graph (DAG) are executed by a multithreaded scheduler, their execution order in the graph is not guaranteed to be adhered to. |
4267272 | AJA drivers cannot be built with RDMA on IGX SW 1.0 DP iGPU due to missing nv-p2p.h . Expected to be addressed in IGX SW 1.0 GA. |
4384768 | No RDMA support on JetPack 6.0 DP and IGX SW 1.0 DP iGPU due to missing nv-p2p kernel module. Expected to be addressed in JP 6.0 GA and IGX SW 1.0 GA respectively. |
4190019 | Holoviz segfaults on multi-gpu setup when specifying device using the --gpus flag with docker run . Current workaround is to use CUDA_VISIBLE_DEVICES in the container instead. |
4210082 | v4l_camera example seg faults at exit. |
4339399 | High CPU usage observed with video_replayer_distributed application. While the high CPU usage associated with the GXF UCX extension has been fixed since v1.0, distributed applications using the MultiThreadScheduler (with the check_recession_period_ms parameter set to 0 by default) may still experience high CPU usage. Setting the HOLOSCAN_CHECK_RECESSION_PERIOD_MS environment variable to a value greater than 0 (e.g. 1.5 ) can help reduce CPU usage. However, this may result in increased latency for the application until the MultiThreadScheduler switches to an event-based multithreaded scheduler. |
4318442 | UCX cuda_ipc protocol doesn't work in Docker containers on x86_64. As a workaround, we are currently disabling the UCX cuda_ipc protocol on all platforms via the UCX_TLS environment variable. |
4325468 | The V4L2VideoCapture operator only supports YUYV and AB24 source pixel formats, and only outputs the RGBA GXF video format. Other source pixel formats compatible with V4L2 can be manually defined by the user, but they're assumed to be equivalent to RGBA8888. |
4325585 | Applications using MultiThreadScheduler may exit early due to timeouts. This occurs when the stop_on_deadlock_timeout parameter is improperly set to a value equal to or less than check_recession_period_ms , particularly if check_recession_period_ms is greater than zero. |
4301203 | HDMI IN fails in v4l2_camera on IGX Orin Devkit for some resolution or formats. Try the latest firmware as a partial fix. Driver-level fixes expected in IGX SW 1.0 GA. |
4384348 | UCX termination (either ctrl+c , press 'Esc' or clicking close button) is not smooth and can show multiple error messages. |
4481171 | Running the driver for a distributed applications on IGX Orin devkits fails when connected to other systems through eth1. A workaround is to use eth0 port to connect to other systems for distributed workloads. |
4458192 | In scenarios where distributed applications have both the driver and workers running on the same host, either within a Docker container or directly on the host, there's a possibility of encountering "Address already in use" errors. A potential solution is to assign a different port number to the HOLOSCAN_HEALTH_CHECK_PORT environment variable (default: 8777 ), for example, by using export HOLOSCAN_HEALTH_CHECK_PORT=8780 . |
Wayland: holoscan::viz::Init() with existing GLFW window fails. | |
4680791 | iGPU: The H264 application in the dev container takes more than 1 hour to generate the engine file. |
4680894 | The AJA driver fails to build with IGX 1.0 GA and JetPack 6.0. |
4667183 | Holoscan CLI: extraction fails with a permission error. |
4668978 | AJA: HoloHub applications with RDMA disabled crash in both the container and the Debian package. |
4678092 | HoloHub: Failed to build_and_run the volume_rendering_rx application on IGX. |
4678337 | The v4l2 sample crashes with HDMI input. |
v2.0.0-dgpu
and v2.0.0-igpu
pip install holoscan==2.0.0
2.0.0.2-1
See supported platforms for compatibility.
make_condition, make_fragment, make_network_context, make_operator, make_resource, and make_scheduler now accept a non-const string or character array for the name parameter.
A new event-based multithread scheduler (EventBasedScheduler) is available. It is an alternative to the existing, polling-based MultiThreadScheduler and can be used as a drop-in replacement. The only difference in parameters is that it does not take a check_recession_period_ms parameter, as there is no such polling interval for this scheduler. It should give similar performance to the MultiThreadScheduler with a very short polling interval, but without the high CPU usage seen for the multi-thread scheduler in that case (due to constant polling for work by one thread).
If an exception is raised in any of the Operator methods start, stop, or compute, that exception will first trigger the underlying GXF scheduler to terminate the application graph, and then the exception will be raised by the Holoscan SDK. This resolves an issue with inconsistent behavior between Python and C++ apps in how exceptions were handled, and fixes a crash in C++ apps when an operator raised an exception from the start or stop methods.
Exceptions now propagate up through the run
method, allowing users to catch and manage exceptions within their
application.
Previously, the Holoscan runtime would catch and log exceptions, with the application continuing
to run (in Python) or exit (in C++) without a clear indication of the exception's origin.
Users can catch and manage exceptions by enclosing the run method in a try block.
The same applies to the holoscan::Fragment::run_async and holoscan.Application.run_async methods for C++ and Python; they return std::future and concurrent.futures.Future, respectively.
The revised documentation advises using future.get() in C++ and future.result() in Python to wait until the application has completed execution and to address any exceptions that occurred.
The V4L2 video capture operator now supports setting exposure and gain values for cameras that support it.
A --gpus argument can now be passed to override the default values.
The VideoStreamRecorderOp and VideoStreamReplayerOp now work without requiring the libgxf_stream_playback.so extension. Now that the extension is unused, it has been removed from the SDK and should no longer be listed under the extensions section of application YAML files using these operators.
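The exception-propagation change described earlier (errors escaping run) can be exercised with an ordinary try block. This is a minimal sketch using a stand-in class rather than the actual holoscan API:

```python
# Minimal sketch of the try/except pattern around run(); DummyApp
# stands in for a holoscan Application subclass, and OperatorError
# for an exception raised inside an operator's start/stop/compute.
class OperatorError(RuntimeError):
    pass

class DummyApp:
    def run(self):
        # As of v2.0, an exception raised in an operator now
        # propagates out of run() instead of only being logged.
        raise OperatorError("failure inside compute")

try:
    DummyApp().run()
except OperatorError as exc:
    print(f"caught: {exc}")
```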
As of version 2.0, we have removed certain Python bindings to align with the unified logger interface:
- holoscan.logger.enable_backtrace()
- holoscan.logger.disable_backtrace()
- holoscan.logger.dump_backtrace()
- holoscan.logger.should_backtrace()
- holoscan.logger.flush()
- holoscan.logger.flush_level()
- holoscan.logger.flush_on()
These functions relate to the C++ logging backend (e.g., the HOLOSCAN_LOG_INFO macro) and are not designed for Python's logging framework. Python API users are advised to utilize the standard logging module for their logging needs. The following bindings remain available to configure the logging of the SDK's underlying C++ code:
- holoscan.logger.LogLevel
- holoscan.logger.log_level()
- holoscan.logger.set_log_level()
- holoscan.logger.set_log_pattern()
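A native Python operator can therefore log through the standard library as usual (a minimal sketch; the logger name is arbitrary):

```python
import logging

# Configure Python-side logging with the standard library; the
# holoscan.logger bindings above only control the SDK's C++ logging.
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("my_app")  # arbitrary logger name
logger.setLevel(logging.INFO)
logger.info("frame processed")
```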
Several GXF headers have moved from gxf/std to gxf/core:
- parameter_parser.hpp
- parameter_parser_std.hpp
- parameter_registrar.hpp
- parameter_storage.hpp
- parameter_wrapper.hpp
- resource_manager.hpp
- resource_registrar.hpp
- type_registry.hpp
Some C++ code for tensor interoperability has been upstreamed from Holoscan SDK into GXF. The public holoscan::Tensor class will remain, but there have been a small number of backward-incompatible changes in related C++ classes and methods in this release. Most of these were used internally and are unlikely to affect existing applications.
- holoscan::gxf::GXFTensor and holoscan::gxf::GXFMemoryBuffer have been removed. The DLPack functionality that was formerly in holoscan::gxf::GXFTensor is now available upstream in GXF's nvidia::gxf::Tensor.
- holoscan::gxf::DLManagedTensorCtx has been renamed to holoscan::gxf::DLManagedTensorContext and is now just an alias for nvidia::gxf::DLManagedTensorContext. It also has two additional fields (dl_shape and dl_strides) to hold shape/stride information used by DLPack.
- holoscan::gxf::DLManagedMemoryBuffer is now an alias for nvidia::gxf::DLManagedMemoryBuffer.
The GXF UCX extension, used in distributed applications, now sends data asynchronously by default, which can lead to issues such as insufficient memory on the transmitter side when a memory pool is used. Specifically, the concern is only for operators that have a memory pool and connect to an operator in a separate fragment of the distributed application. As a workaround, users can increase the num_blocks parameter to a higher value in the BlockMemoryPool or use the UnboundedAllocator to avoid the problem. This issue will be addressed in a future release by providing a more robust solution to handle the asynchronous data transmission feature of the UCX extension, eliminating the need for manual intervention (see Known Issue 4601414).
For fragments using a BlockMemoryPool, the num_blocks parameter can be increased to a higher value to avoid the issue. For example, the following code snippet shows the existing BlockMemoryPool resource being created with a higher number of blocks:
```cpp
recorder_format_converter = make_operator<ops::FormatConverterOp>(
    "recorder_format_converter",
    from_config("recorder_format_converter"),
    Arg("pool") =
        // make_resource<BlockMemoryPool>("pool", 1, source_block_size, source_num_blocks));
        make_resource<BlockMemoryPool>("pool", 1, source_block_size, source_num_blocks * 2));
```
```python
source_pool_kwargs = dict(
    storage_type=MemoryStorageType.DEVICE,
    block_size=source_block_size,
    # num_blocks=source_num_blocks,
    num_blocks=source_num_blocks * 2,
)
recorder_format_converter = FormatConverterOp(
    self,
    name="recorder_format_converter",
    pool=BlockMemoryPool(self, name="pool", **source_pool_kwargs),
    **self.kwargs("recorder_format_converter"),
)
```
Since the underlying UCXTransmitter attempts to send the emitted data regardless of the status of the downstream Operator input port's message queue, simply doubling num_blocks may not suffice in cases where the receiver operator's processing time is slower than that of the sender operator.
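A back-of-the-envelope model (purely illustrative, not SDK code; the periods are made-up numbers) shows why a fixed multiplier cannot cover a persistently slower receiver:

```python
def blocks_in_flight(t_ms: int, send_period_ms: int, recv_period_ms: int) -> int:
    # Messages produced minus messages consumed after t_ms milliseconds;
    # purely illustrative, not SDK code.
    return t_ms // send_period_ms - t_ms // recv_period_ms

# Sender emits every 10 ms, receiver consumes every 15 ms: the backlog
# keeps growing, so doubling num_blocks only delays pool exhaustion.
print([blocks_in_flight(t, 10, 15) for t in (300, 600, 1200)])  # [10, 20, 40]
```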
If you encounter the issue, consider using the UnboundedAllocator instead of the BlockMemoryPool to avoid the problem. The UnboundedAllocator does not have a fixed number of blocks and can allocate memory as needed, though it can cause some overhead due to the lack of a fixed memory pool size and may lead to memory exhaustion if the memory is not released in a timely manner.
The following code snippet shows how to use the UnboundedAllocator:
```cpp
...
Arg("pool") = make_resource<UnboundedAllocator>("pool");
```

```python
from holoscan.resources import UnboundedAllocator
...
pool=UnboundedAllocator(self, name="pool"),
...
```
Issue | Description |
---|---|
4381269 | Fixed a bug that caused memory exhaustion when compiling the SDK in the VSCode Dev Container (using 'Tasks: Run Build Task') due to the missing CMAKE_BUILD_PARALLEL_LEVEL environment variable. Users can specify the number of jobs with the --parallel option (e.g., ./run vscode --parallel 16 ). |
4569102 | Fixed an issue where the log level was not updated from the environment variable when multiple Application classes were created during the session. Now, the log level setting in Application class allows for a reset from the environment variable if overridden. |
4578099 | Fixed a segfault in FormatConverterOp if used with a BlockMemoryPool with insufficient capacity to create the output tensor. |
4571581 | Fixed an issue where the documentation for the built-in operators was either missing or incorrectly rendered. |
4591763 | Application crashes if an exception is thrown from Operator::start or Operator::stop |
4595680 | Fixed an issue that caused the Inference operator to fail when multiple instances were composed in a single application graph. |
This section supplies details about issues discovered during development and QA but not resolved in this release.
Issue | Description |
---|---|
4062979 | When operators connected in a Directed Acyclic Graph (DAG) are executed by a multithreaded scheduler, their execution order in the graph is not guaranteed to be adhered to. |
4267272 | AJA drivers cannot be built with RDMA on IGX SW 1.0 DP iGPU due to missing nv-p2p.h . Expected to be addressed in IGX SW 1.0 GA. |
4384768 | No RDMA support on JetPack 6.0 DP and IGX SW 1.0 DP iGPU due to missing nv-p2p kernel module. Expected to be addressed in JP 6.0 GA and IGX SW 1.0 GA respectively. |
4190019 | Holoviz segfaults on multi-gpu setup when specifying device using the --gpus flag with docker run . Current workaround is to use CUDA_VISIBLE_DEVICES in the container instead. |
4210082 | v4l_camera example seg faults at exit. |
4339399 | High CPU usage observed with video_replayer_distributed application. While the high CPU usage associated with the GXF UCX extension has been fixed since v1.0, distributed applications using the MultiThreadScheduler (with the check_recession_period_ms parameter set to 0 by default) may still experience high CPU usage. Setting the HOLOSCAN_CHECK_RECESSION_PERIOD_MS environment variable to a value greater than 0 (e.g. 1.5 ) can help reduce CPU usage. However, this may result in increased latency for the application until the MultiThreadScheduler switches to an event-based multithreaded scheduler. |
4318442 | UCX cuda_ipc protocol doesn't work in Docker containers on x86_64. As a workaround, we are currently disabling the UCX cuda_ipc protocol on all platforms via the UCX_TLS environment variable. |
4325468 | The V4L2VideoCapture operator only supports YUYV and AB24 source pixel formats, and only outputs the RGBA GXF video format. Other source pixel formats compatible with V4L2 can be manually defined by the user, but they're assumed to be equivalent to RGBA8888. |
4325585 | Applications using MultiThreadScheduler may exit early due to timeouts. This occurs when the stop_on_deadlock_timeout parameter is improperly set to a value equal to or less than check_recession_period_ms , particularly if check_recession_period_ms is greater than zero. |
4301203 | HDMI IN fails in v4l2_camera on IGX Orin Devkit for some resolution or formats. Try the latest firmware as a partial fix. Driver-level fixes expected in IGX SW 1.0 GA. |
4384348 | UCX termination (either ctrl+c , press 'Esc' or clicking close button) is not smooth and can show multiple error messages. |
4481171 | Running the driver for a distributed applications on IGX Orin devkits fails when connected to other systems through eth1. A workaround is to use eth0 port to connect to other systems for distributed workloads. |
4458192 | In scenarios where distributed applications have both the driver and workers running on the same host, either within a Docker container or directly on the host, there's a possibility of encountering "Address already in use" errors. A potential solution is to assign a different port number to the HOLOSCAN_HEALTH_CHECK_PORT environment variable (default: 8777 ), for example, by using export HOLOSCAN_HEALTH_CHECK_PORT=8780 . |
4601414 | The UCX extension's asynchronous data transmission feature causes a regression in the distributed application, such as insufficient memory on the transmitter side. As a workaround, users can increase the num_blocks parameter in the BlockMemoryPool or use the UnboundedAllocator instead of the BlockMemoryPool to avoid the issue. |
v1.0.3-dgpu
and v1.0.3-igpu
holoscan==1.0.3
1.0.3.2-1 (from the cuda repository)
IOSpec::condition.
Parameters of type complex<float> or complex<double> are now supported. These parameters can either be parsed from a YAML config (e.g. using a string like "1.0 + 2.0j") or passed as a holoscan::Arg to the operator constructor.
complex<float> or complex<double> can now be used.
The description methods and corresponding Python API __repr__ methods have been improved.
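The YAML string form for complex-valued parameters mentioned above maps directly onto Python's built-in complex type (shown here with plain Python, independent of the SDK):

```python
# Parse a YAML-style complex string like the one used for operator
# parameters; Python's complex() rejects spaces, so strip them first.
value = complex("1.0 + 2.0j".replace(" ", ""))
print(value)  # (1+2j)
assert value == 1.0 + 2.0j
```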
The IOSpec class now has a description method and a corresponding Python __repr__ method.
Fixed a bug where the Arg class __repr__ could raise UnicodeDecodeError for uint8_t or int8_t argument types.
The NetworkContext and Scheduler description methods now print more comprehensive information.
Python classes now have a __repr__ that makes use of the underlying C++ description methods.
The HOLOSCAN_UCX_PORTS
environment variable allows users to define preferred port numbers for the SDK's inter-node communication in a distributed application, especially in environments where specific ports need to be predetermined, such as Kubernetes.
A Condition or Resource can be added to a Python operator after construction via its add_arg method.
The HOLOSCAN_HEALTH_CHECK_PORT environment variable allows users to define a port number for the SDK's health check endpoint in a distributed application.
The keys in an Application's or Fragment's YAML configuration file can now be determined via a new config_keys() method in the C++ API or the config_keys method from Python.
The compute, initialize, start, and stop methods of the Holoscan Operator were not compatible with Python tracing/profiling in earlier releases; this release resolves that incompatibility.
A smaller image can now be built with ./run build_run_image at the top of the repository, creating an image that is ~8.6 GB vs. the ~13 GB build container from ./run build_image. (doc)
The run script in the git repository had a couple of updates and improvements, including renamed build commands (e.g., build -> build-aarch64-dgpu); see ./run help and ./run <cmd> --help for details.
GLIBC_2.35 or above.
Issue | Description |
---|---|
4185976 | Cycles in a graph are not supported. As a consequence, the endoscopy tool tracking example using input from an AJA video card with the overlay configuration enabled is not functional. This is planned to be addressed in the next version of the SDK. |
4196152 | Getting "Unable to find component from the name ''" error message when using InferenceOp with Data Flow Tracking enabled. |
4211747 | Communication of GPU tensors between fragments in a distributed application can only use device 0 |
4212743 | Holoscan CLI packager copies unrelated files and folders located in the same folder as the model file into the App Package. |
4232453 | A segfault occurs if a native Python operator __init__ assigns a new attribute that overrides an existing base class attribute or method. A segfault will also occur if any exception is raised during Operator.__init__ or Application.__init__ before the parent class __init__ was called. |
4206197 | Distributed apps hang if multiple input/output ports are connected between two operators in different fragments. |
3599303 | Linux kernel is not built with security hardening flags. Future releases will include a Linux kernel built with security hardening flags. |
4187787 | TensorRT backend in the Inference operator prints Unknown embedded device detected. Using 52000MiB as the allocation cap for memory on embedded devices on IGX Orin (iGPU). Addressed in TensorRT 8.6+. |
4194109 | AppDriver is executing fragments' compose() method which can be avoided. |
4260969 | App add_flow causes issue if called more than once between a pair of operators. |
4265393 | Release 1.0-ea1 and 1.0-ea2 fail to run distributed applications with workers on two or more nodes. |
4272363 | A segfault may occur if an operator's output port containing GXF Tensor data is linked to multiple operators within the MultiThreadScheduler. |
4290043 | Bug in Python implicit broadcast of non-TensorMap types when at least one target operator is in a different fragment. |
4293729 | Python application using MultiThreadScheduler (including distributed application) may fail with GIL related error if SDK was compiled in debug mode. |
4101714 | Vulkan applications fail (vk::UnknownError ) in containers on iGPU due to missing iGPU device node being mounted in the container. Workaround documented in run instructions. |
3881725 | VK_ERROR_INITIALIZATION_FAILED with segmentation fault while running High-Speed Endoscopy gxf/cpp app on Clara AGX developer kits. Fix available in CUDA drivers 520. Workaround implemented since v0.4 to retry automatically. |
4293741 | Python application with more than two operators (mixed use of pure Python operator and operator wrapping C++ operator), using MultiThreadScheduler (including distributed app) and sending Python tensor can deadlock at runtime. |
4313690 | Failure to initialize BayerDemosaicOp in applications using the C++ API |
4187826 | Torch backend in the Inference operator is not supported on Tegra's integrated GPU. |
4336947 | The dev_id parameter of the CudaStreamPool resource is ignored. |
4344061 | Native Python operator overrides of the start, stop or initialize methods don't handle exceptions properly |
4344408 | The distributed application displays an error message if port 8777 is already in use. |
4363945 | Checking if a key exists in an application's config file results in an error being logged. |
Fixed bad cast exception when defining optional ports enablement (buffer input, output, camera pose) for the Holoviz operator from a YAML configuration file. | |
Fixed invalid stride alignment of video buffer inputs in the Holoviz operator. | |
4367627 | The distributed application does not handle IPv6 addresses and hostnames properly. |
4368977 | The DownstreamMessageAffordableCondition has not been added to the optional output ports of the Holoviz operator (AJASourceOp and HolovizOp). This omission leads to a GXF_EXCEEDING_PREALLOCATED_SIZE error when the data in the output port's queue is not consumed quickly enough. |
Fixed FormatConverterOp input stride handling, was previously ignored. | |
4381269 | Compiling the SDK with the VSCode Dev Container (using 'Tasks: Run Build Task') may lead to memory exhaustion due to the absence of the CMAKE_BUILD_PARALLEL_LEVEL environment variable. |
4398018 | Intermittent 'Deserialize entity header failed' error with the distributed app when running all fragments locally on the same node. |
4371324 | In the distributed application, a bug causes crashes ('Serialization failed') due to null tensor pointers during mixed broadcasts. This results from not using mutexes when sending GXF Tensors to remote endpoints with UCX. |
4414990 | Crash in Fragment::port_info with the distributed app if any of the fragments did not have any operators added to it during compose. |
4449149 | Unable to debug, trace, or profile the 'compute' method of Python API operators using VSCode debugger/profile/coverage. |
3878494 | Inference fails after a TensorRT engine file is first created using BlockMemoryPool . Fix available in TensorRT 8.4.1. Use UnboundedAllocator as a workaround. |
4171337 | AJA with RDMA is not working on integrated GPU (IGX or AGX Orin) due to conflicts between the nvidia-p2p and nvidia driver symbols (nvidia_p2p_dma_map_pages ) |
4233845 | The UCX Transmitter might select the wrong local IP address when creating a UCX client. This can cause the distributed application to fail if the selected IP address cannot be reached from the other computer. The Holoscan SDK automatically sets the HOLOSCAN_UCX_SOURCE_ADDRESS environment variable based on the --worker-address CLI argument if the worker address is a local IP address. In addition, the UCX_CM_USE_ALL_DEVICES environment variable is set to n by default to disable consideration of all devices for data transfer. |
Note: This release is intended for use with the listed platforms only. NVIDIA does not provide support for this release on products other than those listed below.
Platform | OS |
---|---|
NVIDIA IGX Orin | IGX SW 1.0 DP (L4T r36.1) or Meta Tegra Holoscan 1.0.0 (L4T r36.2) |
NVIDIA Jetson AGX Orin and Orin Nano | NVIDIA JetPack 6.0 DP (L4T r36.2) |
NVIDIA Clara AGX* *Only supporting the NGC container | NVIDIA HoloPack 1.2 (L4T r34.1.2) or Meta Tegra Holoscan 0.6.0 (L4T r35.3.1) |
x86_64 platforms with Ampere GPU or above | Ubuntu 22.04 |
This section supplies details about issues discovered during development and QA but not resolved in this release.
Issue | Description |
---|---|
4062979 | When operators connected in a Directed Acyclic Graph (DAG) are executed by a multithreaded scheduler, their execution order is not guaranteed to follow the order defined by the graph. |
4267272 | AJA drivers cannot be built with RDMA on IGX SW 1.0 DP iGPU due to missing nv-p2p.h . Expected to be addressed in IGX SW 1.0 GA. |
4384768 | No RDMA support on JetPack 6.0 DP and IGX SW 1.0 DP iGPU due to missing nv-p2p kernel module. Expected to be addressed in JP 6.0 GA and IGX SW 1.0 GA respectively. |
4190019 | Holoviz segfaults on multi-gpu setup when specifying device using the --gpus flag with docker run . Current workaround is to use CUDA_VISIBLE_DEVICES in the container instead. |
4210082 | v4l_camera example seg faults at exit. |
4339399 | High CPU usage observed with video_replayer_distributed application. While the high CPU usage associated with the GXF UCX extension has been fixed since v1.0, distributed applications using the MultiThreadScheduler (with the check_recession_period_ms parameter set to 0 by default) may still experience high CPU usage. Setting the HOLOSCAN_CHECK_RECESSION_PERIOD_MS environment variable to a value greater than 0 (e.g. 1.5 ) can help reduce CPU usage. However, this may result in increased latency for the application until the MultiThreadScheduler switches to an event-based multithreaded scheduler. |
4318442 | UCX cuda_ipc protocol doesn't work in Docker containers on x86_64. As a workaround, we are currently disabling the UCX cuda_ipc protocol on all platforms via the UCX_TLS environment variable. |
4325468 | The V4L2VideoCapture operator only supports YUYV and AB24 source pixel formats, and only outputs the RGBA GXF video format. Other source pixel formats compatible with V4L2 can be manually defined by the user, but they're assumed to be equivalent to RGBA8888. |
4325585 | Applications using MultiThreadScheduler may exit early due to timeouts. This occurs when the stop_on_deadlock_timeout parameter is improperly set to a value equal to or less than check_recession_period_ms , particularly if check_recession_period_ms is greater than zero. |
4301203 | HDMI IN fails in v4l2_camera on the IGX Orin Devkit for some resolutions or formats. Try the latest firmware as a partial fix. Driver-level fixes are expected in IGX SW 1.0 GA. |
4384348 | UCX termination (either ctrl+c , pressing 'Esc', or clicking the close button) is not smooth and can show multiple error messages. |
4481171 | Running the driver for a distributed application on IGX Orin devkits fails when connected to other systems through eth1. A workaround is to use the eth0 port to connect to other systems for distributed workloads. |
4458192 | In scenarios where distributed applications have both the driver and workers running on the same host, either within a Docker container or directly on the host, there's a possibility of encountering "Address already in use" errors. A potential solution is to assign a different port number to the HOLOSCAN_HEALTH_CHECK_PORT environment variable (default: 8777 ), for example, by using export HOLOSCAN_HEALTH_CHECK_PORT=8780 . |
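Several of the workarounds in the table above rely on environment variables that must be set before the application starts. A minimal sketch (the values shown are illustrative, taken from the suggestions in the table; the variable names are the ones documented above):

```python
import os

# Issue 4339399: a recession period above 0 reduces MultiThreadScheduler CPU usage
# (at the cost of some added latency).
os.environ["HOLOSCAN_CHECK_RECESSION_PERIOD_MS"] = "1.5"

# Issue 4458192: move the health-check port off the default 8777 to avoid
# "Address already in use" when driver and workers share a host.
os.environ["HOLOSCAN_HEALTH_CHECK_PORT"] = "8780"
```

These could equally be set with `export` in the shell before launching the application; setting them in Python only works if done before the Holoscan application is composed and run.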
v0.6.0-dgpu
and v0.6.0-igpu
holoscan==0.6.0
v0.6.0-amd64
and v0.6.0-arm64
The HOLOSCAN_LOG_FORMAT environment variable has been added to allow users to modify the logger message format at runtime, alongside the existing log level (HOLOSCAN_LOG_LEVEL) and YAML config path (HOLOSCAN_CONFIG_PATH) variables.
holoscan::InputContext::receive has been modified to return holoscan::expected<DataT, holoscan::RuntimeError> instead of std::shared_ptr<DataT>, returning either a valid value or an error (with the type and explanation of the error). Note that IO objects are no longer all assumed to be wrapped in a std::shared_ptr.
Messages of type gxf::Entity passed between GXF-based operators and Holoscan native operators have been changed to holoscan::TensorMap in C++ and dict objects in Python.
Headers: holoscan/operators/tensor_rt/tensor_rt_inference.hpp removed; holoscan/operators/multiai_inference/multiai_inference.hpp renamed to holoscan/operators/inference/inference.hpp; holoscan/operators/multiai_postprocessor/multiai_postprocessor.hpp renamed to holoscan/operators/inference_processor/inference_processor.hpp.
Classes: holoscan::ops::TensorRtInferenceOp removed; holoscan::ops::MultiAIInferenceOp renamed to holoscan::ops::InferenceOp; holoscan::ops::MultiAIPostprocessorOp renamed to holoscan::ops::InferenceProcessorOp.
Namespaces: holoscan::ops::tensor_rt removed; holoscan::ops::multiai_inference renamed to holoscan::ops::inference; holoscan::ops::multiai_postprocessor renamed to holoscan::ops::inference_processor.
holoscan::load_env_log_level has been removed. The HOLOSCAN_LOG_LEVEL environment variable is now loaded automatically.
ops::VideoDecoderOp has been replaced with the classes ops::VideoDecoderRequestOp, ops::VideoDecoderResponseOp and ops::VideoDecoderContext.
ops::VideoEncoderOp has been replaced with the classes ops::VideoEncoderRequestOp, ops::VideoEncoderResponseOp and ops::VideoEncoderContext.
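The new holoscan::InputContext::receive contract (a value or a typed error, rather than a pointer that must be checked for null) can be illustrated with a small self-contained Python sketch. The Expected class and receive function below are hypothetical stand-ins for the C++ types, not part of the SDK API:

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Expected(Generic[T]):
    """Holds either a valid value or an error, mirroring holoscan::expected."""
    value: Optional[T] = None
    error: Optional[str] = None

    def has_value(self) -> bool:
        return self.error is None


def receive(message: Optional[dict]) -> "Expected[dict]":
    """Return the message when present, or a typed error otherwise."""
    if message is None:
        return Expected(error="RuntimeError: no message available on port")
    return Expected(value=message)


result = receive({"tensor": [1, 2, 3]})
if result.has_value():
    data = result.value  # use the received TensorMap-like dict directly
```

The point of the pattern is that the caller must inspect the result before using it, so a missing message surfaces as an explicit, explained error rather than a null dereference.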
Issue | Description |
---|---|
3762996 | nvidia-peermem.ko fails to load using insmod on Holoscan devkits in dGPU mode. |
4048062 | Warning or error when deleting TensorRT runtime ahead of deserialized engines for some versions of TensorRT |
4036186 | H264 encoder/decoder are not supported on iGPU |
Note: This release is intended for use with the listed platforms only. NVIDIA does not provide support for this release on products other than those listed below.
Platform | OS |
---|---|
NVIDIA IGX Orin Developer Kit | NVIDIA HoloPack 2.0 (L4T r35.3.1) or Meta Tegra Holoscan 0.6.0 (L4T r35.3.1) |
NVIDIA Jetson AGX Orin Developer Kit | NVIDIA JetPack r35.1.1 |
NVIDIA Clara AGX Developer Kit | NVIDIA HoloPack 1.2 (L4T r34.1.2) or Meta Tegra Holoscan 0.6.0 (L4T r35.3.1) |
x86_64 platforms with Ampere GPU or above (tested with RTX6000 and A6000) | Ubuntu 20.04 |
This section supplies details about issues discovered during development and QA but not resolved in this release.
Issue | Description |
---|---|
3878494 | Inference fails after a TensorRT engine file is first created using BlockMemoryPool . Fix available in TensorRT 8.4.1. Use UnboundedAllocator as a workaround. |
3599303 | Linux kernel is not built with security hardening flags. Future releases will include a Linux kernel built with security hardening flags. |
3881725 | VK_ERROR_INITIALIZATION_FAILED with segmentation fault while running High-Speed Endoscopy gxf/cpp app on Clara AGX developer kits. Fix available in CUDA drivers 520. Workaround implemented since v0.4 to retry automatically. |
4047688 | H264 applications are missing dependencies (nvidia-l4t-multimedia-utils ) to run in the arm64 dGPU container |
4062979 | When Operators connected in a Directed Acyclic Graph (DAG) are executed in a multithreaded scheduler, it is not ensured that their execution order in the graph is adhered. |
4068454 | Crash on systems with NVIDIA and non-NVIDIA GPUs. Workaround documented in Troubleshooting section of the GitHub README. |
4101714 | Vulkan applications fail (vk::UnknownError ) in containers on iGPU due to missing iGPU device node being mounted in the container. Workaround documented in run instructions. |
4171337 | AJA with RDMA is not working on integrated GPU (IGX or AGX Orin) due to conflicts between the nvidia-p2p and nvidia driver symbols (nvidia_p2p_dma_map_pages ). Fixed in JetPack 5.1.2, expected in HoloPack 2.1 |
4185260 | H264 application process hangs after X11 video exit. |
4185976 | Cycles in a graph are not supported. As a consequence, the endoscopy tool tracking example using input from an AJA video card in the enabled-overlay configuration is not functional. This is planned to be addressed in the next version of the SDK. |
4187826 | Torch backend in the Inference operator is not supported on Tegra's integrated GPU. |
4187787 | TensorRT backend in the Inference operator prints Unknown embedded device detected. Using 52000MiB as the allocation cap for memory on embedded devices on IGX Orin (iGPU). Addressed in TensorRT 8.6+. |
4190019 | Holoviz segfaults on multi-gpu setup when specifying device using the --gpus flag with docker run . Current workaround is to use CUDA_VISIBLE_DEVICES in the container instead. |
4196152 | Getting "Unable to find component from the name ''" error message when using InferenceOp with Data Flow Tracking enabled. |
4199282 | H264 applications may fail on x86_64 due to ldd picking up system v4l2 libraries ahead of the embedded nvv4l2 libraries. |
4206197 | Distributed apps hang if multiple input/output ports are connected between two operators in different fragments. |
4210082 | v4l_camera example seg faults at exit. |
4212743 | Holoscan CLI packager copies unrelated files and folders located in the same folder as the model file into the App Package. |
4211815 | Cannot build AJA drivers on x86_64 with NVIDIA drivers 535. This works with previous NVIDIA driver versions. |
4211747 | Communication of GPU tensors between fragments in a distributed application can only use device (GPU) 0. |
4214189 | High CPU usage with video_replayer_distributed app. |
4232453 | A segfault occurs if a native Python operator __init__ assigns a new attribute that overrides an existing base class attribute or method. A segfault will also occur if any exception is raised during Operator.__init__ or Application.__init__ before the parent class __init__ was called. |
4260969 | App add_flow causes issue if called more than once between a pair of operators. |
4272363 | A segfault may occur if an operator's output port containing GXF Tensor data is linked to multiple operators within the MultiThreadScheduler. |
4293729 | Python application using MultiThreadScheduler (including distributed application) may fail with GIL related error if SDK was compiled in debug mode. |
v0.5.1-dgpu
and v0.5.1-igpu
holoscan==0.5.1
v0.5.1-amd64
and v0.5.1-arm64
This release of the Holoscan SDK provides the following additions:
The Holoscan SDK 0.5.1 adds support for the NVIDIA IGX Orin Developer Kit in both iGPU and dGPU modes. That support is enabled by the release of the Holopack 2.0 Developer Preview, now available through the latest version of the SDK Manager. Python wheels and Debian packages for the arm64/aarch64 architecture support both iGPU and dGPU. Starting with 0.5.1, the Holoscan container on NGC offers two separate tags, one for iGPU and one for dGPU.
Python users might be interested in using the Holoscan SDK without importing every operator, since doing so requires all of the operators' dependencies to be available in the Python environment (example: TensorRT). With 0.5.1, a user can create a Holoscan application by importing only the modules they require, for example, importing holoscan.core
only and not holoscan.operators
.
Note: in 0.6.0, the operators will be broken down into separate modules to offer further import granularity and lower dependency requirements.
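The granular-import pattern described above can be sketched as follows. The snippet is guarded with a try/except so it also runs in environments where the SDK is not installed; the Application import is the documented holoscan.core entry point:

```python
# Import only holoscan.core; operator modules (and their heavy dependencies,
# such as TensorRT) are not pulled in.
try:
    from holoscan.core import Application  # noqa: F401
    HAVE_HOLOSCAN = True
except ImportError:
    # The SDK is not installed in this environment.
    HAVE_HOLOSCAN = False
```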
Facilitates conversions of standard video formats to GXF entities in the Holoscan containers (example)
The L4T Compute Assist container - used to run compute workloads on the iGPU of a developer kit configured for dGPU - now includes the deviceQuery
executable to facilitate validating its configuration, along with updated troubleshooting steps.
Issue | Description |
---|---|
- | Fixed instructions in documentation for datasets download and debian package installation |
Note: This release is intended for use with the listed platforms only. NVIDIA does not provide support for this release on products other than those listed below.
Platform | OS |
---|---|
NVIDIA Clara AGX Developer Kit | - NVIDIA Holopack 1.2 (L4T r34.1.2) - Meta Tegra Holoscan 0.5.0 (L4T r35.2.1) |
NVIDIA IGX Orin [ES] Developer Kit | - NVIDIA Holopack 1.2 (L4T r34.1.2) - Meta Tegra Holoscan 0.5.0 (L4T r35.2.1) |
NVIDIA IGX Orin Developer Kit | - NVIDIA Holopack 2.0 (L4T r35.4.0) - Meta Tegra Holoscan 0.5.1 (L4T r35.3.1) |
x86_64 platforms with Ampere GPU or above (tested with RTX6000 and A6000) | Ubuntu 20.04 |
This section supplies details about issues discovered during development and QA but not resolved in this release.
Issue | Description |
---|---|
3878494 | Inference fails after a TensorRT engine file is first created using BlockMemoryPool . Fix available in TensorRT 8.4.1. Use UnboundedAllocator as a workaround. |
3762996 | nvidia-peermem.ko fails to load using insmod on Holoscan devkits in dGPU mode. Install nvidia-peer-memory following the RDMA instructions in the Holoscan SDK User Guide. |
3655489 | Installing dGPU drivers can remove nvgpuswitch.py script from the executable search path. Explicitly including /opt/nvidia/l4t-gputools/bin in the PATH environment variable ensures this script can be found for execution. |
3599303 | Linux kernel is not built with security hardening flags. Future releases will include a Linux kernel built with security hardening flags. |
3633688 | RDMA on the NVIDIA IGX Orin [ES] Developer Kit (holoscan-devkit) is not functional. PCIe switch firmware update fixed the issue. RDMA for the Clara AGX Developer Kit is functional and unaffected by this issue. |
3881725 | VK_ERROR_INITIALIZATION_FAILED with segmentation fault while running High-Speed Endoscopy gxf/cpp app. Fix available in CUDA drivers 520. Workaround implemented in v0.4 to retry automatically. |
4048062 | Warning or error when deleting TensorRT runtime ahead of deserialized engines for some versions of TensorRT |
4036186 | H264 encoder/decoder are not supported on iGPU |
4047688 | H264 applications are not able to run in the arm64 dGPU container |
4101714 | --privileged permission required to run rendering applications from the Holoscan iGPU container on IGX Orin Developer Kit with Holopack 2.0 DP |
4116861 | H264 video encoding fails on IGX Orin Developer Kit with Holopack 2.0 DP |
v0.5.0
holoscan==0.5.0
v0.5.0-amd64
and v0.5.0-arm64
This release of the Holoscan SDK, along with additions to HoloHub, provides the following main features:
Operators to support H264 bitstream accelerated encoder and decoder were added to HoloHub, as illustrated by two new applications: h264_video_decode and h264_endoscopy_tool_tracking.
The L4T Compute Assist container is now available on NGC to perform computation on the integrated GPU (iGPU) of Holoscan Developer Kits configured to use their discrete GPU (dGPU), allowing workloads to run on both GPUs in parallel.
Infrastructure and documentation were added to wrap Holoscan operators as GXF codelets so they can be used by other frameworks which use GXF extensions.
The Holoscan SDK now officially supports physical I/O on x86_64 platforms. The High-Speed Endoscopy application on HoloHub has been tested with Rivermax/GPUDirect RDMA support and offers performance similar to that previously reported with the Holoscan Developer Kits.
The Holoscan SDK visualization module (referred to as Holoviz) adds depth-map rendering capabilities to support displaying inference results with depth information.
The Holoscan SDK now provides a new suite of examples with associated step-by-step documentation to better introduce users to the SDK, taking them from a Hello World example to an application that deploys an ultrasound segmentation inference. Additional examples are also available to demonstrate how to integrate sensors and third-party frameworks into their workflow.
Issue | Description |
---|---|
3834424 | Ultrasound segmentation application is not functional on NVIDIA IGX Orin [ES] Developer Kit with iGPU configuration in deployment stack |
3842899 | High-Speed Endoscopy application is not supported in deployment stack |
3897810 | Applications not working on x86_64 systems with multiple GPUs |
3936290 | Cannot run exclusive display from docker container |
Note: This release is intended for use with the listed platforms only. NVIDIA does not provide support for this release on products other than those listed below.
Platform | OS |
---|---|
NVIDIA Clara AGX Developer Kit | - NVIDIA Holopack 1.2 (L4T r34.1.2) - Meta Tegra Holoscan 0.5.0 (L4T r35.2.1) |
NVIDIA IGX Orin [ES] Developer Kit | - NVIDIA Holopack 1.2 (L4T r34.1.2) - Meta Tegra Holoscan 0.5.0 (L4T r35.2.1) |
NVIDIA IGX Orin Developer Kit | Meta Tegra Holoscan 0.5.0 (L4T r35.2.1) |
x86_64 platforms with Ampere GPU or above (tested with RTX6000 and A6000) | Ubuntu 20.04 |
This section supplies details about issues discovered during development and QA but not resolved in this release.
Issue | Description |
---|---|
3878494 | Inference fails after a TensorRT engine file is first created using BlockMemoryPool . Fix available in TensorRT 8.4.1. Use UnboundedAllocator as a workaround. |
3762996 | nvidia-peermem.ko fails to load using insmod on Holoscan devkits in dGPU mode. Install nvidia-peer-memory following the RDMA instructions in the Holoscan SDK User Guide. |
3655489 | Installing dGPU drivers can remove nvgpuswitch.py script from the executable search path. Explicitly including /opt/nvidia/l4t-gputools/bin in the PATH environment variable ensures this script can be found for execution. |
3599303 | Linux kernel is not built with security hardening flags. Future releases will include a Linux kernel built with security hardening flags. |
3633688 | RDMA on the NVIDIA IGX Orin [ES] Developer Kit (holoscan-devkit) is not functional. PCIe switch firmware update fixed the issue. RDMA for the Clara AGX Developer Kit is functional and unaffected by this issue. |
3881725 | VK_ERROR_INITIALIZATION_FAILED with segmentation fault while running High-Speed Endoscopy gxf/cpp app. Fix available in CUDA drivers 520. Workaround implemented in v0.4 to retry automatically. |
4048062 | Warning or error when deleting TensorRT runtime ahead of deserialized engines for some versions of TensorRT |
4036186 | H264 encoder/decoder are not supported on iGPU |
4047688 | H264 applications are missing dependencies (nvidia-l4t-multimedia-utils ) to run in the arm64 dGPU container |
apt update
Please see the list of known issues below for more information.
Note: This release is intended for use with the listed platforms only. NVIDIA does not provide support for this release on products other than those listed below.
Description | Supported Version |
---|---|
Supported NVIDIA® Tegra® Linux Driver Package (L4T) | NVIDIA® Holopack 1.2 -- R34.1.2 |
Supported Jetson Platforms | Holoscan Developer Kits |
Supported x86_64 Platforms | Ubuntu 20.04 with Ampere GPU or above (tested with RTX6000 and A6000) |
Supported Software for Clara AGX Developer Kit with NVIDIA® RTX6000 and IGX Orin Developer Kit with NVIDIA® A6000 | NVIDIA® Driver 510.73.08, CUDA 11.6.1, TensorRT 8.2.3, GXF 2.5, AJA NTV2 SDK 16.2 |
This section supplies details about issues discovered during development and QA but not resolved in this release.
Issue | Description |
---|---|
3878494 | Inference fails after a TensorRT engine file is first created using BlockMemoryPool |
3762996 | nvidia-peermem.ko fails to load using insmod on Holoscan devkits in dGPU mode. Install nvidia-peer-memory following the RDMA instructions in the Holoscan SDK User Guide. |
3655489 | Installing dGPU drivers can remove nvgpuswitch.py script from the executable search path. Explicitly including /opt/nvidia/l4t-gputools/bin in the PATH environment variable ensures this script can be found for execution. |
3599303 | Linux kernel is not built with security hardening flags. Future releases will include a Linux kernel built with security hardening flags. |
3633688 | RDMA on the NVIDIA IGX Orin Developer Kit (holoscan-devkit) is not functional. PCIe switch firmware update fixed the issue. RDMA for the Clara AGX Developer Kit is functional and unaffected by this issue. |
3834424 | Ultrasound segmentation application is not functional on NVIDIA IGX Orin Developer Kit (holoscan-devkit) with iGPU configuration in deployment stack |
3842899 | High-Speed Endoscopy application is not supported in deployment stack. |
3881725 | VK_ERROR_INITIALIZATION_FAILED with segmentation fault while running the High-Speed Endoscopy gxf/cpp app (workaround implemented in v0.4; fix available in 520 drivers) |
3897810 | Applications not working on x86_64 systems with multiple GPUs |
3936290 | Cannot run exclusive display from docker container |
This new release of Holoscan SDK brings the following main features:
Python API A new Python API which allows for rapid prototyping and deployment of AI workflows has been added. The Python API is accessible through the provided container or source build, as well as PyPI.
Inference module A new Holoscan Inference module (HoloInfer) has been added. The HoloInfer module facilitates designing and executing inference and processing applications through its APIs.
Multi-AI Inference Extension A Multi-AI Inference codelet is provided, which takes multiple AI inference-related parameters, can ingest multiple GXF messages, and transmits multiple tensors as output via a single GXF Transmitter.
Native Operators using C++/Python APIs The capability to write native operators in C++ and Python has been added and several sample Holoscan native operators are provided.
FPGA Alpha-Blending The FPGA alpha-blending using the AJA capture card allows very low latency passthrough of the input video to the output, while blending the AI inference overlay directly onto the input signal on the FPGA of the AJA PCIe card.
Please see the list of known issues below for more information.
Note: This release is intended for use with the listed platforms only. NVIDIA does not provide support for this release on products other than those listed below.
Description | Supported Version |
---|---|
Supported NVIDIA® Tegra® Linux Driver Package (L4T) | NVIDIA® Holopack 1.1 -- R34.1.2 |
Supported Jetson Platforms | Holoscan Developer Kits |
Supported x86 Platforms | Ubuntu 20.04 with Turing/Ampere GPU |
Supported Software for Clara AGX Developer Kit with NVIDIA® RTX6000 and IGX Orin Developer Kit with NVIDIA® A6000 | NVIDIA® Driver 510.73.08, CUDA 11.6.1, TensorRT 8.2.3, GXF 2.5, AJA NTV2 SDK 16.2 |
This section supplies details about issues discovered during development and QA but not resolved in this release.
Issue | Description |
---|---|
3878494 | Inference fails after a TensorRT engine file is first created using BlockMemoryPool |
3762996 | nvidia-peermem.ko fails to load using insmod on Holoscan devkits in dGPU mode. Install nvidia-peer-memory following the RDMA instructions in the Holoscan SDK User Guide. |
3655489 | Installing dGPU drivers can remove nvgpuswitch.py script from the executable search path. Explicitly including /opt/nvidia/l4t-gputools/bin in the PATH environment variable ensures this script can be found for execution. |
3599303 | Linux kernel is not built with security hardening flags. Future releases will include a Linux kernel built with security hardening flags. |
3633688 | RDMA on the NVIDIA IGX Orin Developer Kit (holoscan-devkit) is not functional. PCIe switch firmware update fixed the issue. RDMA for the Clara AGX Developer Kit is functional and unaffected by this issue. |
3834424 | Ultrasound segmentation application is not functional on NVIDIA IGX Orin Developer Kit (holoscan-devkit) with iGPU configuration in deployment stack |
3842899 | High-Speed Endoscopy application is not supported in deployment stack. |
3881725 | VK_ERROR_INITIALIZATION_FAILED with segmentation fault while running the High-Speed Endoscopy gxf/cpp app (workaround implemented in v0.4; fix available in 520 drivers) |
3897810 | Applications not working on x86_64 systems with multiple GPUs |
3936290 | Cannot run exclusive display from docker container |
Release notes in progress