Skip to content

Commit 5f5e451

Browse files
Feat: Tons of cust changes and start of OptiX (hardware rt) work
* wip * bootstrap enough optix to get ex02 working * Add example 03 Generate an animated pattern in the raygen and display it in a window using glfw * add logging callback * remove ustr * Manually create OptixShaderBindingTable field-by-field instead of transmute * Switch Module and Pipeline methods to their structs Instead of having them on DeviceContext * Switch Module, Pipeline, ProgramGroup methods to their structs Instead of having them on DeviceContext * Refactor: remove dead imports * derive DeviceCopy * typo * Better error message * Move destroy to Drop impl * typo * rename OptixContext to DeviceContext * Make launch params variable name optional * Remove Clone from Module and ProgramGroup * Make log callback safe User catch_unwind to guard against panic in C. Remove note about lifetime of closure since it's 'static anyway. Have set_log_callback return a Result instead of panicking on error. * add wip glam support * dont panic in drop * Rework DevicePointer on top of CUdeviceptr This switches out *T for CUdeviceptr in DevicePointer. This has the knock-on effect of removing a lot of "pretend we're a CPU pointer" stuff from downstream types like DeviceSlice. * wip * bootstrap enough optix to get ex02 working * Add example 03 Generate an animated pattern in the raygen and display it in a window using glfw * add logging callback * remove ustr * Manually create OptixShaderBindingTable field-by-field instead of transmute * Switch Module and Pipeline methods to their structs Instead of having them on DeviceContext * Switch Module, Pipeline, ProgramGroup methods to their structs Instead of having them on DeviceContext * Refactor: remove dead imports * derive DeviceCopy * typo * Better error message * Move destroy to Drop impl * typo * rename OptixContext to DeviceContext * Make launch params variable name optional * Remove Clone from Module and ProgramGroup * Make log callback safe User catch_unwind to guard against panic in C. Remove note about lifetime of closure since it's 'static anyway. Have set_log_callback return a Result instead of panicking on error. * add wip glam support * dont panic in drop * wip accel support * Add accel wip Enough acceleration structure stuff to get example 04 running, and rebasing on top of deviceptr branch * Rework acceleration structure stuff Provide simple internally allocating API for Accel, but also allow creating one from raw parts to let user handle memory allocation. Original API kept as free functions and marked unsafe. Implement all build input types. Add mint support to cust and optix. * add lifetime bound on Instance to referenced Accel * Have DeviceCopy impl for lifetime markers use null type * Add unsaafe from_handle ctor * Add update for DynamicAccel * Hash build inputs to ensure update is sound * Add relocation info * Add remaning DeviceContext methods * Correct docstrings * Add doc comments * Add a prelude * Own the geometry flags array * Add prelude * own the geometry flags array and add support for pre_transform * Fill out context and add some module docs * Add some module docs * Update to latest library changes * Add more docs * Remove mut requirement for getting pointer * Add a simple memcpy_htod wrapper * Add back pointer offset methods * Big structure reorg and documentation push - Reorganized the module structure to something less fragmented. Modules are longer but more cohesive. - Integrated the optix programming guide inline into the module documentation. You need to build the docs with: RUSTDOCFLAGS="--html-in-header katex-header.html" cargo doc --no-deps To see the equations (this is done automatically on docs.rs) * Wrap SBT properly * Rename transform types * Simplify AccelBuildOptions creation Just take a build flags and move everything else to builders * Hide programming guide in details tags * Adapt to latest changes * Fix toolchain version * Fix name of DeviceContext * first optix rust test * Set ALLOW_COMPACTION in build options * Use find_cuda_helper to get cuda path * Handle differering enum representation on windows and linux * Add DeviceVariable * Add DeviceMemory trait Abstracts over different device storage representations * Add mem_get_info * Add external memory * Add a few more types to prelude * Add more types * Rework on top of new DeviceVariable * first optix rust test * tweak build * update to latest optix changes * Split DeviceCopy into cust_core * update to latest optix changes * trying to get print working * tweak test kernel * stop llvm optimizing out LaunchParams * Chore: update cargo.toml dep versions * Feat: second pass for fixing conflicts * Feat: delete as_ptr and as_mut_ptr on DeviceSlice * Revert "Feat: delete as_ptr and as_mut_ptr on DeviceSlice" This reverts commit e858fdc. * Feat: experiment with deleting as_ptr and as_mut_ptr * Fix issues and warnings * Chore: run formatting * Chore: exclude examples from building in CI * Feat: update changelog with changes, misc changes before merge Co-authored-by: rdambrosio <[email protected]>
1 parent fd87b73 commit 5f5e451

File tree

107 files changed

+13248
-1278
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

107 files changed

+13248
-1278
lines changed

.github/workflows/rust.yml

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -59,7 +59,7 @@ jobs:
5959
run: cargo fmt --all -- --check
6060

6161
- name: Build
62-
run: cargo build --workspace --exclude "optix" --exclude "optix_sys" --exclude "path_tracer" --exclude "denoiser" --exclude "add"
62+
run: cargo build --workspace --exclude "optix" --exclude "path_tracer" --exclude "denoiser" --exclude "add" --exclude "ex*"
6363

6464
# Don't currently test because many tests rely on the system having a CUDA GPU
6565
# - name: Test
@@ -69,9 +69,9 @@ jobs:
6969
if: contains(matrix.os, 'ubuntu')
7070
env:
7171
RUSTFLAGS: -Dwarnings
72-
run: cargo clippy --workspace --exclude "optix" --exclude "optix_sys" --exclude "path_tracer" --exclude "denoiser" --exclude "add"
72+
run: cargo clippy --workspace --exclude "optix" --exclude "path_tracer" --exclude "denoiser" --exclude "add" --exclude "ex*"
7373

7474
- name: Check documentation
7575
env:
7676
RUSTDOCFLAGS: -Dwarnings
77-
run: cargo doc --workspace --all-features --document-private-items --no-deps --exclude "optix" --exclude "optix_sys" --exclude "path_tracer" --exclude "denoiser" --exclude "add"
77+
run: cargo doc --workspace --all-features --document-private-items --no-deps --exclude "optix" --exclude "path_tracer" --exclude "denoiser" --exclude "add" --exclude "ex*"

Cargo.toml

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,8 @@
11
[workspace]
22
members = [
33
"crates/*",
4-
4+
"crates/optix/examples/ex*",
5+
"crates/optix/examples/ex*/device",
56
"xtask",
67

78
"examples/optix/*",
@@ -10,5 +11,9 @@ members = [
1011

1112
]
1213

14+
exclude = [
15+
"crates/optix/examples/common"
16+
]
17+
1318
[profile.dev.package.rustc_codegen_nvvm]
1419
opt-level = 3

crates/blastoff/Cargo.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@ repository = "https://github.com/Rust-GPU/Rust-CUDA"
88
[dependencies]
99
bitflags = "1.3.2"
1010
cublas_sys = { version = "0.1", path = "../cublas_sys" }
11-
cust = { version = "0.2", path = "../cust", features = ["num-complex"] }
11+
cust = { version = "0.2", path = "../cust", features = ["impl_num_complex"] }
1212
num-complex = "0.4.0"
1313

1414
[package.metadata.docs.rs]

crates/blastoff/src/level1.rs

Lines changed: 24 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -46,9 +46,9 @@ impl CublasContext {
4646
Ok(T::amin(
4747
ctx.raw,
4848
n as i32,
49-
x.as_device_ptr().as_raw(),
49+
x.as_device_ptr().as_ptr(),
5050
stride.unwrap_or(1) as i32,
51-
result.as_device_ptr().as_raw_mut(),
51+
result.as_device_ptr().as_mut_ptr(),
5252
)
5353
.to_result()?)
5454
})
@@ -108,9 +108,9 @@ impl CublasContext {
108108
Ok(T::amax(
109109
ctx.raw,
110110
n as i32,
111-
x.as_device_ptr().as_raw(),
111+
x.as_device_ptr().as_ptr(),
112112
stride.unwrap_or(1) as i32,
113-
result.as_device_ptr().as_raw_mut(),
113+
result.as_device_ptr().as_mut_ptr(),
114114
)
115115
.to_result()?)
116116
})
@@ -172,10 +172,10 @@ impl CublasContext {
172172
Ok(T::axpy(
173173
ctx.raw,
174174
n as i32,
175-
alpha.as_device_ptr().as_raw(),
176-
x.as_device_ptr().as_raw(),
175+
alpha.as_device_ptr().as_ptr(),
176+
x.as_device_ptr().as_ptr(),
177177
x_stride.unwrap_or(1) as i32,
178-
y.as_device_ptr().as_raw_mut(),
178+
y.as_device_ptr().as_mut_ptr(),
179179
y_stride.unwrap_or(1) as i32,
180180
)
181181
.to_result()?)
@@ -245,9 +245,9 @@ impl CublasContext {
245245
Ok(T::copy(
246246
ctx.raw,
247247
n as i32,
248-
x.as_device_ptr().as_raw(),
248+
x.as_device_ptr().as_ptr(),
249249
x_stride.unwrap_or(1) as i32,
250-
y.as_device_ptr().as_raw_mut(),
250+
y.as_device_ptr().as_mut_ptr(),
251251
y_stride.unwrap_or(1) as i32,
252252
)
253253
.to_result()?)
@@ -314,11 +314,11 @@ impl CublasContext {
314314
Ok(T::dot(
315315
ctx.raw,
316316
n as i32,
317-
x.as_device_ptr().as_raw(),
317+
x.as_device_ptr().as_ptr(),
318318
x_stride.unwrap_or(1) as i32,
319-
y.as_device_ptr().as_raw(),
319+
y.as_device_ptr().as_ptr(),
320320
y_stride.unwrap_or(1) as i32,
321-
result.as_device_ptr().as_raw_mut(),
321+
result.as_device_ptr().as_mut_ptr(),
322322
)
323323
.to_result()?)
324324
})
@@ -390,11 +390,11 @@ impl CublasContext {
390390
Ok(T::dotu(
391391
ctx.raw,
392392
n as i32,
393-
x.as_device_ptr().as_raw(),
393+
x.as_device_ptr().as_ptr(),
394394
x_stride.unwrap_or(1) as i32,
395-
y.as_device_ptr().as_raw(),
395+
y.as_device_ptr().as_ptr(),
396396
y_stride.unwrap_or(1) as i32,
397-
result.as_device_ptr().as_raw_mut(),
397+
result.as_device_ptr().as_mut_ptr(),
398398
)
399399
.to_result()?)
400400
})
@@ -438,11 +438,11 @@ impl CublasContext {
438438
Ok(T::dotc(
439439
ctx.raw,
440440
n as i32,
441-
x.as_device_ptr().as_raw(),
441+
x.as_device_ptr().as_ptr(),
442442
x_stride.unwrap_or(1) as i32,
443-
y.as_device_ptr().as_raw(),
443+
y.as_device_ptr().as_ptr(),
444444
y_stride.unwrap_or(1) as i32,
445-
result.as_device_ptr().as_raw_mut(),
445+
result.as_device_ptr().as_mut_ptr(),
446446
)
447447
.to_result()?)
448448
})
@@ -483,9 +483,9 @@ impl CublasContext {
483483
Ok(T::nrm2(
484484
ctx.raw,
485485
n as i32,
486-
x.as_device_ptr().as_raw(),
486+
x.as_device_ptr().as_ptr(),
487487
x_stride.unwrap_or(1) as i32,
488-
result.as_device_ptr().as_raw_mut(),
488+
result.as_device_ptr().as_mut_ptr(),
489489
)
490490
.to_result()?)
491491
})
@@ -559,12 +559,12 @@ impl CublasContext {
559559
Ok(T::rot(
560560
ctx.raw,
561561
n as i32,
562-
x.as_device_ptr().as_raw_mut(),
562+
x.as_device_ptr().as_mut_ptr(),
563563
x_stride.unwrap_or(1) as i32,
564-
y.as_device_ptr().as_raw_mut(),
564+
y.as_device_ptr().as_mut_ptr(),
565565
y_stride.unwrap_or(1) as i32,
566-
c.as_device_ptr().as_raw(),
567-
s.as_device_ptr().as_raw(),
566+
c.as_device_ptr().as_ptr(),
567+
s.as_device_ptr().as_ptr(),
568568
)
569569
.to_result()?)
570570
})

crates/cudnn/src/context.rs

Lines changed: 35 additions & 35 deletions
Original file line numberDiff line numberDiff line change
@@ -397,9 +397,9 @@ impl CudnnContext {
397397
let x_data = x.data().as_device_ptr().as_raw();
398398

399399
let y_desc = y.descriptor();
400-
let y_data = y.data().as_device_ptr().as_raw_mut();
400+
let y_data = y.data().as_device_ptr().as_ptr();
401401

402-
let reserve_space_ptr = reserve_space.as_device_ptr().as_raw_mut();
402+
let reserve_space_ptr = reserve_space.as_device_ptr().as_ptr();
403403

404404
unsafe {
405405
sys::cudnnDropoutForward(
@@ -454,9 +454,9 @@ impl CudnnContext {
454454
let dy_data = dy.data().as_device_ptr().as_raw();
455455

456456
let dx_desc = dx.descriptor();
457-
let dx_data = dx.data().as_device_ptr().as_raw_mut();
457+
let dx_data = dx.data().as_device_ptr().as_ptr();
458458

459-
let reserve_space_ptr = reserve_space.as_device_ptr().as_raw_mut();
459+
let reserve_space_ptr = reserve_space.as_device_ptr().as_ptr();
460460

461461
unsafe {
462462
sys::cudnnDropoutBackward(
@@ -528,7 +528,7 @@ impl CudnnContext {
528528
raw,
529529
self.raw,
530530
dropout,
531-
states.as_device_ptr().as_raw_mut() as *mut std::ffi::c_void,
531+
states.as_device_ptr().as_ptr() as *mut std::ffi::c_void,
532532
states.len(),
533533
seed,
534534
)
@@ -1185,14 +1185,14 @@ impl CudnnContext {
11851185
let w_data = w.data().as_device_ptr().as_raw();
11861186
let w_desc = w.descriptor();
11871187

1188-
let y_data = y.data().as_device_ptr().as_raw_mut();
1188+
let y_data = y.data().as_device_ptr().as_ptr();
11891189
let y_desc = y.descriptor();
11901190

11911191
// If the _ size is 0 then the algorithm can work in-place and cuDNN expects a null
11921192
// pointer.
11931193
let (work_space_ptr, work_space_size): (*mut u8, usize) = {
11941194
work_space.map_or((std::ptr::null_mut(), 0), |work_space| {
1195-
(work_space.as_device_ptr().as_raw_mut(), work_space.len())
1195+
(work_space.as_device_ptr().as_mut_ptr(), work_space.len())
11961196
})
11971197
};
11981198

@@ -1287,12 +1287,12 @@ impl CudnnContext {
12871287
let dy_data = dy.data().as_device_ptr().as_raw();
12881288
let dy_desc = dy.descriptor();
12891289

1290-
let dx_data = dx.data().as_device_ptr().as_raw_mut();
1290+
let dx_data = dx.data().as_device_ptr().as_ptr();
12911291
let dx_desc = dx.descriptor();
12921292

12931293
let (work_space_ptr, work_space_size): (*mut u8, usize) = {
12941294
work_space.map_or((std::ptr::null_mut(), 0), |work_space| {
1295-
(work_space.as_device_ptr().as_raw_mut(), work_space.len())
1295+
(work_space.as_device_ptr().as_mut_ptr(), work_space.len())
12961296
})
12971297
};
12981298

@@ -1388,12 +1388,12 @@ impl CudnnContext {
13881388
let dy_data = dy.data().as_device_ptr().as_raw();
13891389
let dy_desc = dy.descriptor();
13901390

1391-
let dw_data = dw.data().as_device_ptr().as_raw_mut();
1391+
let dw_data = dw.data().as_device_ptr().as_ptr();
13921392
let dw_desc = dw.descriptor();
13931393

13941394
let (work_space_ptr, work_space_size): (*mut u8, usize) = {
13951395
work_space.map_or((std::ptr::null_mut(), 0), |work_space| {
1396-
(work_space.as_device_ptr().as_raw_mut(), work_space.len())
1396+
(work_space.as_device_ptr().as_mut_ptr(), work_space.len())
13971397
})
13981398
};
13991399

@@ -1615,28 +1615,28 @@ impl CudnnContext {
16151615
L: RnnDataLayout,
16161616
NCHW: SupportedType<T1>,
16171617
{
1618-
let device_sequence_lengths_ptr = device_seq_lengths.as_device_ptr().as_raw();
1618+
let device_sequence_lengths_ptr = device_seq_lengths.as_device_ptr().as_ptr();
16191619

16201620
let x_ptr = x.as_device_ptr().as_raw();
1621-
let y_ptr = y.as_device_ptr().as_raw_mut();
1621+
let y_ptr = y.as_device_ptr().as_ptr();
16221622

1623-
let hx_ptr = hx.map_or(std::ptr::null(), |buff| buff.as_device_ptr().as_raw());
1623+
let hx_ptr = hx.map_or(std::ptr::null(), |buff| buff.as_device_ptr().as_ptr());
16241624
let hy_ptr = hy.map_or(std::ptr::null_mut(), |buff| {
1625-
buff.as_device_ptr().as_raw_mut()
1625+
buff.as_device_ptr().as_mut_ptr()
16261626
});
16271627

16281628
let c_desc = c_desc.map_or(std::ptr::null_mut(), |desc| desc.raw);
16291629

1630-
let cx_ptr = cx.map_or(std::ptr::null(), |buff| buff.as_device_ptr().as_raw());
1630+
let cx_ptr = cx.map_or(std::ptr::null(), |buff| buff.as_device_ptr().as_ptr());
16311631
let cy_ptr = cy.map_or(std::ptr::null_mut(), |buff| {
1632-
buff.as_device_ptr().as_raw_mut()
1632+
buff.as_device_ptr().as_mut_ptr()
16331633
});
16341634

1635-
let weight_space_ptr = weight_space.as_device_ptr().as_raw_mut();
1636-
let work_space_ptr = work_space.as_device_ptr().as_raw_mut();
1635+
let weight_space_ptr = weight_space.as_device_ptr().as_ptr();
1636+
let work_space_ptr = work_space.as_device_ptr().as_ptr();
16371637
let (reserve_space_ptr, reserve_space_size) = reserve_space
16381638
.map_or((std::ptr::null_mut(), 0), |buff| {
1639-
(buff.as_device_ptr().as_raw_mut(), buff.len())
1639+
(buff.as_device_ptr().as_mut_ptr(), buff.len())
16401640
});
16411641

16421642
unsafe {
@@ -1814,32 +1814,32 @@ impl CudnnContext {
18141814
L: RnnDataLayout,
18151815
NCHW: SupportedType<T1>,
18161816
{
1817-
let device_sequence_lengths_ptr = device_seq_lengths.as_device_ptr().as_raw();
1817+
let device_sequence_lengths_ptr = device_seq_lengths.as_device_ptr().as_ptr();
18181818

18191819
let y_ptr = y.as_device_ptr().as_raw();
18201820
let dy_ptr = dy.as_device_ptr().as_raw();
18211821

1822-
let dx_ptr = dx.as_device_ptr().as_raw_mut();
1822+
let dx_ptr = dx.as_device_ptr().as_ptr();
18231823

18241824
let h_desc = h_desc.map_or(std::ptr::null_mut(), |desc| desc.raw);
18251825

1826-
let hx_ptr = hx.map_or(std::ptr::null(), |buff| buff.as_device_ptr().as_raw());
1827-
let dhy_ptr = dhy.map_or(std::ptr::null(), |buff| buff.as_device_ptr().as_raw());
1826+
let hx_ptr = hx.map_or(std::ptr::null(), |buff| buff.as_device_ptr().as_ptr());
1827+
let dhy_ptr = dhy.map_or(std::ptr::null(), |buff| buff.as_device_ptr().as_ptr());
18281828
let dhx_ptr = dhx.map_or(std::ptr::null_mut(), |buff| {
1829-
buff.as_device_ptr().as_raw_mut()
1829+
buff.as_device_ptr().as_mut_ptr()
18301830
});
18311831

18321832
let c_desc = c_desc.map_or(std::ptr::null_mut(), |desc| desc.raw);
18331833

1834-
let cx_ptr = cx.map_or(std::ptr::null(), |buff| buff.as_device_ptr().as_raw());
1835-
let dcy_ptr = dcy.map_or(std::ptr::null(), |buff| buff.as_device_ptr().as_raw());
1834+
let cx_ptr = cx.map_or(std::ptr::null(), |buff| buff.as_device_ptr().as_ptr());
1835+
let dcy_ptr = dcy.map_or(std::ptr::null(), |buff| buff.as_device_ptr().as_mut_ptr());
18361836
let dcx_ptr = dcx.map_or(std::ptr::null_mut(), |buff| {
1837-
buff.as_device_ptr().as_raw_mut()
1837+
buff.as_device_ptr().as_mut_ptr()
18381838
});
18391839

1840-
let weight_space_ptr = weight_space.as_device_ptr().as_raw_mut();
1841-
let work_space_ptr = work_space.as_device_ptr().as_raw_mut();
1842-
let reserve_space_ptr = reserve_space.as_device_ptr().as_raw_mut();
1840+
let weight_space_ptr = weight_space.as_device_ptr().as_ptr();
1841+
let work_space_ptr = work_space.as_device_ptr().as_ptr();
1842+
let reserve_space_ptr = reserve_space.as_device_ptr().as_ptr();
18431843

18441844
unsafe {
18451845
sys::cudnnRNNBackwardData_v8(
@@ -1947,15 +1947,15 @@ impl CudnnContext {
19471947
L: RnnDataLayout,
19481948
NCHW: SupportedType<T1>,
19491949
{
1950-
let device_sequence_lengths_ptr = device_seq_lengths.as_device_ptr().as_raw();
1950+
let device_sequence_lengths_ptr = device_seq_lengths.as_device_ptr().as_mut_ptr();
19511951

19521952
let x_ptr = x.as_device_ptr().as_raw();
19531953
let hx_ptr = x.as_device_ptr().as_raw();
19541954
let y_ptr = y.as_device_ptr().as_raw();
19551955

1956-
let dweight_space_ptr = dweight_space.as_device_ptr().as_raw_mut();
1957-
let work_space_ptr = work_space.as_device_ptr().as_raw_mut();
1958-
let reserve_space_ptr = reserve_space.as_device_ptr().as_raw_mut();
1956+
let dweight_space_ptr = dweight_space.as_device_ptr().as_mut_ptr();
1957+
let work_space_ptr = work_space.as_device_ptr().as_mut_ptr();
1958+
let reserve_space_ptr = reserve_space.as_device_ptr().as_mut_ptr();
19591959

19601960
unsafe {
19611961
sys::cudnnRNNBackwardWeights_v8(

0 commit comments

Comments
 (0)