Photography

Getting Answers Faster: NVIDIA and Open-Source Ecosystem Come Together to Accelerate Data Science

No matter the industry, data science has become a universal toolkit for businesses. Data analytics and machine learning give organizations insights and answers that shape their day-to-day actions and future plans. Being data-driven has become essential to lead any industry.

While the world’s data doubles each year, CPU computing has hit a brick wall with the end of Moore’s law. For this reason, scientific computing and deep learning have turned to NVIDIA GPU acceleration. Data analytics and machine learning haven’t yet tapped into the GPU as systematically. That’s changing.

RAPIDS, launched today at GTC Europe, gives data scientists for the first time a robust platform for GPU-accelerated data science: analytics, machine learning and, soon, data visualization. And what’s more, the libraries are open-source, built with the support of open-source contributors and available immediately at www.RAPIDS.ai.

Initial benchmarks show game-changing 50x speedups with RAPIDS running on the NVIDIA DGX-2 AI supercomputer, compared with CPU-only systems, reducing experiment iteration from hours to minutes.

By the Community, for the Community

With a suite of CUDA-integrated software tools, RAPIDS gives developers new plumbing under the foundation of their data science workflows.

To make this happen, NVIDIA engineers and open-source Python contributors collaborated for two years. Building on key open-source projects including Apache Arrow, Pandas and scikit-learn, RAPIDS connects the data science ecosystem by bringing together popular capabilities from multiple libraries and adding the power of GPU acceleration.

RAPIDS will also integrate with Apache Spark, the leading open-source data science framework for data centers, used by more than 1,000 organizations.

A data science workshop following the GTC Europe keynote will feature a panel with luminaries of the open-source community — Travis Oliphant and Peter Wang, co-founders of Anaconda, as well as Wes McKinney, founder and creator of Apache Arrow and the Pandas software library, and a contributor to RAPIDS.

These pioneers will discuss the potential for RAPIDS for GPU-accelerated data science before an audience of developers, researchers and business leaders. At the workshop, Databricks, a company founded by the creators of Spark, will present on unifying data management and machine learning tools using GPUs.

It was a natural step for NVIDIA, as the creator of CUDA, to develop the first complete solution that integrates Python data science libraries with CUDA at the kernel level. By keeping it open source, we welcome further growth and contributions from other developers in the ecosystem.

This community is vast — tens of millions of downloads occur annually of the core data science libraries via the package manager Conda. Open-source development makes it easier for data scientists to rapidly adopt RAPIDS and maintain the flexibility to modify and customize tools for their applications.

NVIDIA in recent years has made diverse contributions to the AI open-source community with libraries like the Material Definition Language SDK, the NCCL software module for communication between GPUs and the NVIDIA DIGITS deep learning application.

There are 120 repositories on our GitHub page, including research algorithms, the CUTLASS library for matrix multiplication in CUDA and NVcaffe, our fork of the Caffe deep learning framework. And we’ll continue to contribute to RAPIDS alongside the open-source community, supporting data scientists as they conduct efficient, granular analysis.

Delivering Rapid Answers to Data Science Questions

Data scientists, and the insights they extract, are in high demand. But when relying on CPU systems, there’s always been a limit on how fast they can crunch data.

Depending on the size of their datasets, scientists may have a long wait for results from their machine learning models. And some may aggregate or simplify their data, sacrificing granularity for faster results.

With the adoption of RAPIDS and GPUs, data scientists can ramp up iteration and testing, providing more accurate predictions that improve business outcomes. Typical training times can shrink from days to hours, or from hours to minutes.

In the retail industry, it could allow grocery chains to estimate the optimal amount of fresh fruit to stock in each store location. For banks, GPU-accelerated insights could alert lenders about which homeowners are at risk of defaulting on a mortgage.

Access to the RAPIDS open-source suite of libraries is immediately available at www.RAPIDS.ai, where the code is being released under the Apache license. Containerized versions of RAPIDS will be available this week on the NVIDIA GPU Cloud container registry.

For more updates about RAPIDS, follow @rapidsai.

The post Getting Answers Faster: NVIDIA and Open-Source Ecosystem Come Together to Accelerate Data Science appeared first on The Official NVIDIA Blog.


by Clement Farabet via The Official NVIDIA Blog
Getting Answers Faster: NVIDIA and Open-Source Ecosystem Come Together to Accelerate Data Science Getting Answers Faster: NVIDIA and Open-Source Ecosystem Come Together to Accelerate Data Science Reviewed by Ossama Hashim on October 10, 2018 Rating: 5

14 comments:

  1. Going out the school jobless was agonizing and to see myself no place was unfortunate.data science course in pune

    ReplyDelete
  2. Well, The information which you posted here is very helpful & it is very useful for the needy like me.., Wonderful information you posted here. Thank you so much for helping me out to find the Data analytics course in MumbaiOrganisations and introducing reputed stalwarts in the industry dealing with data analyzing & assorting it in a structured and precise manner. Keep up the good work. Looking forward to view more from you.

    ReplyDelete
  3. This knowledge.Excellently written article, if only all bloggers offered the same level of content as you, the internet would be a much better place. Please keep it up.

    best data science courses

    ReplyDelete
  4. Actually I read it yesterday but I had some thoughts about it and today I wanted to read it again because it is very well written.
    Data training

    ReplyDelete
  5. Attend The Data Science Training in Bangalore From ExcelR. Practical Data Science Training in Bangalore Sessions With Assured Placement Support From Experienced Faculty. ExcelR Offers The Data Science Courses in Bangalore.
    Data Science training in Bangalore

    ReplyDelete
  6. Such a very useful article. I have learn some new information.thanks for sharing.
    Data Science Course in Mumbai

    ReplyDelete
  7. Well, the most on top staying topic is Data Analytics. Data Analytics is one of the most promising technique in the growing world. I would like to add Data Analytics training to the preference list. Out of all, DData Analytics Course in Mumbai is making a huge difference all across the country. Thank you so much for showing your work and thank you so much for this wonderful article.

    ReplyDelete

  8. Such a very useful article. Very interesting to read this article.I would like to thank you for the efforts you had made for writing this awesome article. I would like to state about something which creates curiosity in knowing more about it. It is a part of our daily routine life which we usually don`t notice in all the things which turns the dreams in to real experiences. Back from the ages, we have been growing and world is evolving at a pace lying on the shoulder of technology."
    best data science institute in hyderabad "
    will be a great piece added to the term technology. Cheer for more ideas & innovation which are part of evolution.

    ReplyDelete
  9. Just the way I have expected. Your website really is interesting.
    data scientist training in hyderabad

    ReplyDelete
  10. Such a very useful article. Very interesting to read this article. I would like to thank you for the efforts you had made for writing this awesome article. Data Science course in Hyderabad

    ReplyDelete
  11. Really impressed! Everything is very open and very clear clarification of issues. It contains truly facts. Your website is very valuable. Thanks for sharing.
    ExcelR data analytics courses

    ReplyDelete
  12. It's late finding this act. At least, it's a thing to be familiar with that there are such events exist. I agree with your Blog and I will be back to inspect it more in the future so please keep up your act.
    data analytics courses

    ReplyDelete
  13. Search and Apply to Jobs from leading placement consultants online at Eswar Solutions.
    HR Management Company in Bangalore

    ReplyDelete

Powered by Blogger.