Try/catch wrap a call to deallocate in silent usm_host_allocator class to be used by std::vector for array metadata transfers #1791
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞
Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_256 ran successfully.
The log of test run with […]. Testing with SYCL OS build internally, the crash is occurring in […]. I will try to reproduce the issue though, as it keeps consistently recurring at the same test.
Also remove --no-sycl-interface-test option, since DPCTLSyclInterface library is no longer so-versioned.
Wrote dpctl::tensor::offset_utils::usm_host_allocator<T> to allocate USM-host memory as storage for std::vector. Replaced uses of sycl::usm_allocator<T, sycl::alloc::kind::host>. The new class derives from it, but overrides the deallocate method to wrap the call to base::deallocate in try/catch. The exception, if caught, is printed but otherwise ignored, consistent with how this is done in the USMDeleter class used in dpctl.memory. This is to work around sporadic crashes due to an unhandled exception thrown by the OpenCL CPU driver, which appears to be benign. The issue was reported to the CPU driver team, with a native reproducer (compiler LLVM jira ticket 58387).
ccbd886 to 709b6bd
Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_265 ran successfully.
Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_266 ran successfully.
These were from the previous year. Updated them to what DPC++ is using (https://github.com/intel/llvm/blob/sycl/devops/dependencies.json#L27-L38). It might be nice to automate updating these through a cron-executed workflow.
Array API standard conformance tests for dpctl=0.18.0dev0=py310ha798474_276 ran successfully.
ndgrigorian
left a comment
Nightly build passes now, so I think it would be good to get this in, looks good to me!
Introduced class template <typename T> dpctl::tensor::offset_utils::usm_host_allocator deriving from sycl::usm_allocator<T, sycl::alloc::kind::host> that wraps the call to base::deallocate in try/catch to prevent crashes due to seemingly benign exceptions thrown by call to sycl::free by CPU device runtime which are under investigation.
In case such an exception is caught, a message is printed to std::cerr, but the exception is otherwise ignored.
Run pytest with -s for testing with nightly sycl bundle to be able to see such message printed.
Also remove use of --no-sycl-interface-test option, since DPCTLSyclInterface library is no longer so-versioned.
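For readers without a SYCL toolchain at hand, the pattern described above can be sketched in standard C++ only. This is an illustration under stated assumptions, not the code from this PR: throwing_base_allocator is a hypothetical stand-in for sycl::usm_allocator<T, sycl::alloc::kind::host> whose deallocate always throws, silent_host_allocator is a hypothetical name for the derived class, and caught_count and run_demo exist only so the effect can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <iostream>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the SYCL base allocator (assumption, not SYCL).
// deallocate() releases the memory and then throws, imitating a runtime
// error reported by the driver after the free has been attempted.
template <typename T>
class throwing_base_allocator {
public:
    using value_type = T;

    throwing_base_allocator() = default;
    template <typename U>
    throwing_base_allocator(const throwing_base_allocator<U> &) noexcept {}

    T *allocate(std::size_t n) { return std::allocator<T>{}.allocate(n); }

    void deallocate(T *p, std::size_t n) {
        std::allocator<T>{}.deallocate(p, n);
        throw std::runtime_error("simulated failure while freeing memory");
    }
};

// Hypothetical derived allocator: hides the base deallocate() with a version
// that catches the exception, prints it to std::cerr, and carries on.
template <typename T>
class silent_host_allocator : public throwing_base_allocator<T> {
    using base_t = throwing_base_allocator<T>;

public:
    using value_type = T;

    silent_host_allocator() = default;
    template <typename U>
    silent_host_allocator(const silent_host_allocator<U> &) noexcept {}

    void deallocate(T *p, std::size_t n) {
        try {
            base_t::deallocate(p, n);
        } catch (const std::exception &e) {
            std::cerr << "Exception caught in deallocate: " << e.what()
                      << std::endl;
            ++caught_count(); // illustration only: lets callers observe it
        }
    }

    // Number of exceptions swallowed so far (illustration only).
    static int &caught_count() {
        static int count = 0;
        return count;
    }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const silent_host_allocator<T> &,
                const silent_host_allocator<U> &) noexcept { return true; }
template <typename T, typename U>
bool operator!=(const silent_host_allocator<T> &,
                const silent_host_allocator<U> &) noexcept { return false; }

// Builds and destroys a vector; the destructor calls deallocate() once.
// Returns the total number of swallowed exceptions.
int run_demo() {
    {
        std::vector<int, silent_host_allocator<int>> metadata(4, 7);
        assert(metadata.size() == 4);
    } // vector destroyed here; the exception from the base is swallowed
    return silent_host_allocator<int>::caught_count();
}
```

Without the wrapper, the exception would escape from the std::vector destructor, which is implicitly noexcept, and the program would terminate. The real base class additionally needs a SYCL queue or context to be constructed, which this sketch leaves out.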