This event has passed.
High-Performance Data Science in Python (2) DataFrame Game
May 12, 2021 @ 10:00 am - 12:00 pm
The workshop will be held on Zoom, with times given in PST. Please REGISTER in advance for this lecture.
Although Python has been the most popular programming language since 2019, data scientists often complain about its slow speed and its limited ability to handle big data scenarios. In this workshop series, we will discuss in depth how to improve Python's performance in data science by looking under the hood of the language and its libraries, and by using technologies that make Python a practical solution for high-performance big data analytics.
In the second session, we will focus on how to load and process very large datasets in Python on a single machine, and we will compare the dataframe implementations of Pandas, Modin, Pandarallel, Dask, Vaex, and others. There are no specific prerequisites for attending the talk, but programming experience with Python's numpy and Pandas packages will help you fully understand the lecture content.
After registering, you will receive a confirmation email containing information about joining the meeting.
If you have any further questions about the workshop, please contact the instructor, Qiyang Hu.