Vlad Sobal vladisai

           Type  Time(%)      Time     Calls       Avg       Min       Max  Name
GPU activities:   15.18%  7.94888s      5447  1.4593ms  38.018us  36.894ms  maxwell_scudnn_winograd_128x128_ldg1_ldg4_tile228n_nt
                  11.07%  5.79327s     11278  513.68us  3.8720us  3.2777ms  void thrust::cuda_cub::core::_kernel_agent<thrust::cuda_cub::__merge_sort::MergeAgent<thrust::device_ptr<long>, thrust::device_ptr<float>, long, ThrustLTOp<long, bool=0>, thrust::detail::integral_constant<bool, bool=1>>, bool, thrust::device_ptr<long>, thrust::device_ptr<float>, long, long*, float*, ThrustLTOp<long, bool=0>, long*, long>(thrust::device_ptr<long>, float, thrust::device_ptr<float>, long, long, bool=0, ThrustLTOp<long, bool=0>, bool, bool=1)
                   9.47%  4.95518s      5800  854.34us  34.850us  34.208ms  maxwell_scudnn_winograd_128x128_ldg1_ldg4_tile148n_nt
                   6.00%  3.14027s      3200  981.34us  414.54us  2.8808ms  maxwell_scudnn_128x128_stridedB_splitK_interior_nn
                   4.81%  2.51612s     53720  46.837us     928ns  3.2044ms  _ZN2at6native18elem