Build and run Docker containers leveraging NVIDIA GPUs
Updated Sep 16, 2020 - Makefile
kaldi-asr/kaldi is the official location of the Kaldi project.
A flexible framework of neural networks for deep learning
Problem:
catboost version: 0.23.2
Operating System: all
Tutorial: https://github.com/catboost/tutorials/blob/master/custom_loss/custom_metric_tutorial.md
Impossible to use a custom metric (C++).
Code example
from catboost import CatBoost
train_data = [[1, 4, 5, 6],
Modern C++ Parallel Task Programming
The current implementation of join can be improved by performing the operation in a single call to the backend kernel instead of multiple calls.
This is a fairly easy kernel and may be a good issue for someone getting to know CUDA/ArrayFire internals. Ping me if you want additional info.
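The single-call idea can be sketched outside ArrayFire. The following is a minimal pure-Python illustration, not ArrayFire's kernel or API: `join_pairwise` and `join_single_pass` are hypothetical names, and flat lists stand in for device arrays. It contrasts joining inputs two at a time, which recopies the accumulated result on every step, with computing every input's output offset first and then writing each element exactly once.

```python
from itertools import accumulate

def join_pairwise(arrays):
    """Join inputs two at a time; each step recopies the accumulated result."""
    result = []
    copies = 0  # number of element writes performed
    for arr in arrays:
        result = result + list(arr)  # allocates and copies old + new elements
        copies += len(result)
    return result, copies

def join_single_pass(arrays):
    """Compute output offsets up front, then write every element once."""
    lengths = [len(a) for a in arrays]
    # Exclusive prefix sum: offsets[i] is where arrays[i] starts in the output.
    offsets = [0] + list(accumulate(lengths))[:-1]
    out = [None] * sum(lengths)
    copies = 0
    for arr, start in zip(arrays, offsets):
        for i, value in enumerate(arr):
            out[start + i] = value
            copies += 1
    return out, copies

inputs = [[1, 2], [3], [4, 5, 6]]
pairwise_result, pairwise_copies = join_pairwise(inputs)
single_result, single_copies = join_single_pass(inputs)
```

In a GPU setting the offsets would likely be passed to one kernel launch so that each thread can locate its source array, but that mapping is an assumption here and not something the issue specifies.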
Improve readability of thread id based branches by giving them more descriptive names.
if (!t) // is actually a t == 0
https://github.com/rapidsai/cudf/blob/57ef76927373d7260b6a0eda781e59a4c563d36e/cpp/src/io/statistics/column_stats.cu#L285
Is actually a lane_id == 0
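The difference between the two conditions can be illustrated with a small pure-Python model. This is a sketch, not cudf code: it assumes a warp size of 32 and a hypothetical one-dimensional block of 64 threads, and the names `is_block_leader` and `is_warp_leader` are illustrative only.

```python
WARP_SIZE = 32   # assumption: NVIDIA warp size
BLOCK_SIZE = 64  # hypothetical 1-D block size for this illustration

def is_block_leader(t):
    """Descriptive name for the `t == 0` branch: first thread in the block."""
    return t == 0

def is_warp_leader(t):
    """Descriptive name for the `lane_id == 0` branch: first lane of each warp."""
    lane_id = t % WARP_SIZE
    return lane_id == 0

# Which thread ids in the block take each branch.
block_leaders = [t for t in range(BLOCK_SIZE) if is_block_leader(t)]
warp_leaders = [t for t in range(BLOCK_SIZE) if is_warp_leader(t)]
```

Naming the predicate makes it visible at the call site whether one thread per block or one thread per warp is expected to take the branch.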
As demonstrated in rapidsai/cudf#6241 (comment), pr
HIP: C++ Heterogeneous-Compute Interface for Portability
We do not have documentation specifying the different treelite Operator values that FIL supports. (https://github.com/dmlc/treelite/blob/46c8390aed4491ea97a017d447f921efef9f03ef/include/treelite/base.h#L40)
Report needed documentation
https://github.com/rapidsai/cuml/blob/branch-0.15/cpp/test/sg/fil_test.cu
There are multiple places in the fil_test.cu file
I often use -v just to see that something is going on, but a progress bar (enabled by default) would serve the same purpose and be more concise.
We can just factor out the code from futhark bench for this.
A community-run, 5-day PyTorch Deep Learning Bootcamp
ThunderSVM: A Fast SVM Library on GPUs and CPUs
Thank you for this fantastic work!
Would it be possible for the fit_transform() method to return the KL divergence of the run?
Thx!
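For reference, the requested quantity can be written down independently of any library. The sketch below is not taken from the project under discussion; `kl_divergence` is a hypothetical helper that computes KL(P || Q) = sum_i p_i * log(p_i / q_i) for two discrete distributions given as plain lists.

```python
import math

def kl_divergence(p, q):
    """Return KL(P || Q) in nats for two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # 0 * log(0 / q) is taken as 0 by convention
        if q_i == 0.0:
            return math.inf  # P puts mass where Q has none
        total += p_i * math.log(p_i / q_i)
    return total

same = kl_divergence([0.5, 0.5], [0.5, 0.5])
different = kl_divergence([0.9, 0.1], [0.5, 0.5])
```

Identical distributions give zero, and the value grows as the two distributions diverge, which is why a single scalar from fit_transform() would be a convenient summary of a run.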
CUDA integration for Python, plus shiny features
an implementation of 3D Ken Burns Effect from a Single Image using PyTorch
Performance-optimized wheels for TensorFlow (SSE, AVX, FMA, XLA, MPI)
Fast Clojure Matrix Library
CUDA Templates for Linear Algebra Subroutines
an implementation of Video Frame Interpolation via Adaptive Separable Convolution using PyTorch
Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors
Samples for CUDA developers that demonstrate features in the CUDA Toolkit