/// <summary cref="MemoryBuffer{T, TIndex}.CopyFromView(
/// AcceleratorStream, ArrayView{T}, Index1)"/>
protected internal unsafe override void CopyFromView(
    AcceleratorStream stream,
    ArrayView<T> source,
    Index1 targetOffset)
{
    var binding = Accelerator.BindScoped();

    var sourceAddress = new IntPtr(source.LoadEffectiveAddress());
    var targetAddress = new IntPtr(ComputeEffectiveAddress(targetOffset));

    switch (source.AcceleratorType)
    {
        case AcceleratorType.CPU:
            CudaException.ThrowIfFailed(CudaAPI.Current.MemcpyHostToDevice(
                targetAddress,
                sourceAddress,
                new IntPtr(source.LengthInBytes),
                stream));
            break;
        case AcceleratorType.Cuda:
            CudaException.ThrowIfFailed(CudaAPI.Current.MemcpyDeviceToDevice(
                targetAddress,
                sourceAddress,
                new IntPtr(source.LengthInBytes),
                stream));
            break;
        default:
            throw new NotSupportedException(
                RuntimeErrorMessages.NotSupportedTargetAccelerator);
    }

    binding.Recover();
}

/// <inheritdoc/>
public override TimeSpan MeasureFrom(ProfilingMarker marker)
{
    using var binding = Accelerator.BindScoped();

    if (!(marker is CudaProfilingMarker startMarker))
    {
        throw new ArgumentException(
            string.Format(
                RuntimeErrorMessages.InvalidProfilingMarker,
                GetType().Name,
                marker.GetType().Name),
            nameof(marker));
    }

    // Wait for the markers to complete, then calculate the duration.
    startMarker.Synchronize();
    Synchronize();

    CudaException.ThrowIfFailed(
        CurrentAPI.ElapsedTime(
            out float milliseconds,
            startMarker.EventPtr,
            EventPtr));

    return TimeSpan.FromMilliseconds(milliseconds);
}

/// <summary cref="MemoryBuffer{T, TIndex}.CopyToView(
/// AcceleratorStream, ArrayView{T}, LongIndex1)"/>
protected internal unsafe override void CopyToView(
    AcceleratorStream stream,
    ArrayView<T> target,
    LongIndex1 sourceOffset)
{
    var binding = Accelerator.BindScoped();

    var targetBuffer = target.Source;
    var sourceAddress = new IntPtr(ComputeEffectiveAddress(sourceOffset));
    var targetAddress = new IntPtr(target.LoadEffectiveAddress());
    var lengthInBytes = new IntPtr(target.LengthInBytes);

    switch (targetBuffer.AcceleratorType)
    {
        case AcceleratorType.CPU:
        case AcceleratorType.Cuda:
            CudaException.ThrowIfFailed(
                CurrentAPI.MemcpyAsync(
                    targetAddress,
                    sourceAddress,
                    lengthInBytes,
                    stream));
            break;
        default:
            throw new NotSupportedException(
                RuntimeErrorMessages.NotSupportedTargetAccelerator);
    }

    binding.Recover();
}

/// <summary cref="AcceleratorStream.Synchronize"/>
public override void Synchronize()
{
    using (var binding = Accelerator.BindScoped())
    {
        CudaException.ThrowIfFailed(
            CudaAPI.Current.SynchronizeStream(streamPtr));
    }
}

/// <summary cref="MemoryBuffer.MemSetToZero(AcceleratorStream)"/>
public override void MemSetToZero(AcceleratorStream stream)
{
    var binding = Accelerator.BindScoped();

    // Check the returned status so a failed memset is not silently ignored.
    CudaException.ThrowIfFailed(
        CudaAPI.Current.Memset(
            NativePtr,
            0,
            new IntPtr(LengthInBytes),
            stream));

    binding.Recover();
}

/// <inheritdoc/>
public unsafe override void Synchronize()
{
    using var binding = Accelerator.BindScoped();

    ReadOnlySpan<IntPtr> events = stackalloc[] { EventPtr };
    CLException.ThrowIfFailed(
        CurrentAPI.WaitForEvents(events));
}

/// <summary cref="AcceleratorStream.Synchronize"/>
public override void Synchronize()
{
    var binding = Accelerator.BindScoped();

    CudaException.ThrowIfFailed(
        CurrentAPI.SynchronizeStream(streamPtr));

    binding.Recover();
}

/// <inheritdoc/>
protected override ProfilingMarker AddProfilingMarkerInternal()
{
    using var binding = Accelerator.BindScoped();

    var profilingMarker = new CudaProfilingMarker(Accelerator);
    CudaException.ThrowIfFailed(
        CurrentAPI.RecordEvent(profilingMarker.EventPtr, StreamPtr));

    return profilingMarker;
}

/// <summary cref="MemoryBuffer.MemSetToZero(AcceleratorStream)"/>
public override void MemSetToZero(AcceleratorStream stream)
{
    var binding = Accelerator.BindScoped();

    CudaException.ThrowIfFailed(
        CurrentAPI.Memset(
            NativePtr,
            0,
            new IntPtr(LengthInBytes),
            stream));

    binding.Recover();
}

/// <inheritdoc/>
public override TimeSpan MeasureFrom(ProfilingMarker marker)
{
    using var binding = Accelerator.BindScoped();

    return (marker is CPUProfilingMarker startMarker)
        ? Timestamp - startMarker.Timestamp
        : throw new ArgumentException(
            string.Format(
                RuntimeErrorMessages.InvalidProfilingMarker,
                GetType().Name,
                marker.GetType().Name),
            nameof(marker));
}

/// <inheritdoc/>
public override TimeSpan MeasureFrom(ProfilingMarker marker)
{
    using var binding = Accelerator.BindScoped();

    if (!(marker is CLProfilingMarker startMarker))
    {
        throw new ArgumentException(
            string.Format(
                RuntimeErrorMessages.InvalidProfilingMarker,
                GetType().Name,
                marker.GetType().Name),
            nameof(marker));
    }

    // Wait for the markers to complete, then calculate the duration.
    startMarker.Synchronize();
    Synchronize();

    CLException.ThrowIfFailed(
        CurrentAPI.GetEventProfilingInfo(
            EventPtr,
            CLProfilingInfo.CL_PROFILING_COMMAND_END,
            out var endNanoseconds));

    CLException.ThrowIfFailed(
        CurrentAPI.GetEventProfilingInfo(
            startMarker.EventPtr,
            CLProfilingInfo.CL_PROFILING_COMMAND_END,
            out var startNanoseconds));

    // TimeSpan tracks time in ticks, where a single tick represents one hundred
    // nanoseconds, so we need to convert our elapsed nanoseconds into ticks.
    //
    // NB: If the start time is later than the end time, reverse the calculation,
    // and then restore the correct signed result.
    bool swapped = false;
    if (endNanoseconds < startNanoseconds)
    {
        Utilities.Swap(ref startNanoseconds, ref endNanoseconds);
        swapped = true;
    }

    var elapsedNanoseconds = endNanoseconds - startNanoseconds;
    var ticks = (long)(elapsedNanoseconds / 100UL);
    if (swapped)
    {
        ticks = -ticks;
    }

    return new TimeSpan(ticks);
}

/// <inheritdoc/>
public override void Synchronize()
{
    using var binding = Accelerator.BindScoped();

    var errorStatus = CurrentAPI.QueryEvent(EventPtr);
    if (errorStatus == CudaError.CUDA_ERROR_NOT_READY)
    {
        CudaException.ThrowIfFailed(CurrentAPI.SynchronizeEvent(EventPtr));
    }
    else
    {
        CudaException.ThrowIfFailed(errorStatus);
    }
}

/// <inheritdoc/>
protected internal override unsafe void MemSetInternal(
    AcceleratorStream stream,
    byte value,
    long offsetInBytes,
    long lengthInBytes)
{
    var binding = Accelerator.BindScoped();

    CudaException.ThrowIfFailed(
        CurrentAPI.Memset(
            new IntPtr(NativePtr.ToInt64() + offsetInBytes),
            value,
            new IntPtr(lengthInBytes),
            stream));

    binding.Recover();
}

/// <inheritdoc/>
protected internal override unsafe void MemSetInternal(
    AcceleratorStream stream,
    byte value,
    long offsetInBytes,
    long lengthInBytes)
{
    var binding = Accelerator.BindScoped();

    CLException.ThrowIfFailed(
        CurrentAPI.FillBuffer(
            ((CLStream)stream).CommandQueue,
            NativePtr,
            value,
            new IntPtr(offsetInBytes),
            new IntPtr(lengthInBytes)));

    binding.Recover();
}

/// <inheritdoc/>
protected unsafe override ProfilingMarker AddProfilingMarkerInternal()
{
    using var binding = Accelerator.BindScoped();

    IntPtr* profilingEvent = stackalloc IntPtr[1];
    CLException.ThrowIfFailed(
        CurrentAPI.EnqueueBarrierWithWaitList(
            queuePtr,
            Array.Empty<IntPtr>(),
            profilingEvent));

    // WORKAROUND: The OpenCL event needs to be awaited now, otherwise
    // it does not contain the correct timing - it appears to have the timing
    // of whenever it gets awaited.
    var marker = new CLProfilingMarker(Accelerator, *profilingEvent);
    marker.Synchronize();
    return marker;
}

/// <summary cref="MemoryBuffer{T, TIndex}.CopyFromView(
/// AcceleratorStream, ArrayView{T}, LongIndex1)"/>
protected internal unsafe override void CopyFromView(
    AcceleratorStream stream,
    ArrayView<T> source,
    LongIndex1 targetOffset)
{
    var binding = Accelerator.BindScoped();
    var clStream = (CLStream)stream;

    switch (source.AcceleratorType)
    {
        case AcceleratorType.CPU:
            CLException.ThrowIfFailed(
                CurrentAPI.WriteBuffer(
                    clStream.CommandQueue,
                    NativePtr,
                    false,
                    new IntPtr(targetOffset * ElementSize),
                    new IntPtr(source.LengthInBytes),
                    new IntPtr(source.LoadEffectiveAddress())));
            break;
        case AcceleratorType.OpenCL:
            CLException.ThrowIfFailed(
                CurrentAPI.CopyBuffer(
                    clStream.CommandQueue,
                    source.Source.NativePtr,
                    NativePtr,
                    new IntPtr(source.Index * ElementSize),
                    new IntPtr(targetOffset * ElementSize),
                    new IntPtr(source.LengthInBytes)));
            break;
        default:
            throw new NotSupportedException(
                RuntimeErrorMessages.NotSupportedTargetAccelerator);
    }

    binding.Recover();
}

/// <summary cref="MemoryBuffer{T, TIndex}.CopyToView(
/// AcceleratorStream, ArrayView{T}, LongIndex1)"/>
protected internal unsafe override void CopyToView(
    AcceleratorStream stream,
    ArrayView<T> target,
    LongIndex1 sourceOffset)
{
    var binding = Accelerator.BindScoped();

    switch (target.AcceleratorType)
    {
        case AcceleratorType.CPU:
            CLException.ThrowIfFailed(
                CurrentAPI.ReadBuffer(
                    stream,
                    NativePtr,
                    false,
                    new IntPtr(sourceOffset * ElementSize),
                    new IntPtr(target.LengthInBytes),
                    new IntPtr(target.LoadEffectiveAddress())));
            break;
        case AcceleratorType.OpenCL:
            CLException.ThrowIfFailed(
                CurrentAPI.CopyBuffer(
                    stream,
                    NativePtr,
                    target.Source.NativePtr,
                    new IntPtr(sourceOffset * ElementSize),
                    new IntPtr(target.Index * ElementSize),
                    new IntPtr(target.LengthInBytes)));
            break;
        default:
            throw new NotSupportedException(
                RuntimeErrorMessages.NotSupportedTargetAccelerator);
    }

    binding.Recover();
}

/// <inheritdoc/>
protected unsafe override ProfilingMarker AddProfilingMarkerInternal()
{
    using var binding = Accelerator.BindScoped();
    return new CPUProfilingMarker(Accelerator);
}