HPC and Big Data Analytics using Comet

April 4, 2017 @ 9:00 am - 4:30 pm


UCLA-IDRE and SDSC are organizing the workshop “HPC and Big Data Analytics using Comet” to introduce the Comet supercomputer and its usage to researchers at UCLA.

Comet, a petascale supercomputer at the San Diego Supercomputer Center (SDSC), is one of the key resources within the NSF’s XSEDE (Extreme Science and Engineering Discovery Environment) program. It provides “free” computer time via the XSEDE portal to researchers across the USA. Participants will get hands-on experience on Comet during the workshop’s sessions.

The RSVP link and the agenda for the workshop are as follows:



9:00 AM – 9:10 AM: Introduction & Welcome

9:10 AM – 10:00 AM: Comet – SDSC’s 2 PetaFLOPS HPC Resource

  • Architecture, queue/partition info, software stack
  • Examples for compute, shared, gpu, and gpu-shared partitions
  • Hands-on session on Comet to prepare for the later sessions, which use Comet

10:00 AM – 10:30 AM: Science Gateways

10:30 AM – 10:40 AM: Short break

10:40 AM – 12:00 PM: Introduction to Hadoop on Comet

  • Overview of running Hadoop within scheduler frameworks (using myHadoop)
  • Demonstration and hands-on session: Hadoop cluster spin-up and interactive usage
  • New technologies and approaches such as RDMA-Hadoop, with a hands-on RDMA-Hadoop exercise

12:00 PM – 1:00 PM: Lunch (provided by IDRE)

1:00 PM – 2:00 PM: Data Analytics and Data Mining 

  • R and parallel execution of R
  • Data mining/machine learning

2:00 PM – 3:00 PM: Python for Scientific Computing

  • How to run a Jupyter notebook on Comet
  • Using IPython Parallel for distributed computation
  • Easy multithreading and distributed computing with dask

3:00 PM – 3:10 PM: Short break

3:05 PM – 4:30 PM: Spark for Scientific Computing

  • Overview of the capabilities of Spark and how they can be leveraged to solve problems in Scientific Computing
  • Hands-on introduction to Spark, from batch and interactive usage on Comet to running a sample map/reduce example in Python
  • Two key libraries in the Spark ecosystem: Spark SQL, a general-purpose query engine that can interface with SQL databases or JSON files, and Spark MLlib, a scalable machine learning library

4:30 PM: Wrap up




5628 Math Science Building, UCLA


T V Singh

