Data Driven: Gearing up Python for High-Speed HDF Adventures
08-25, 13:40–14:20 (Asia/Kuala_Lumpur), JC 1

Data-intensive applications constantly challenge engineers to manage vast and complex datasets effectively. This talk focuses on tuning Hierarchical Data Format (HDF) operations to the underlying hardware so that both dense and sparse datasets can be processed faster and more efficiently.

Participants will be introduced to the HDF implementation strategy specifically designed for Python. We will delve into a variety of approaches essential for applications that generate massive data streams. By harnessing the power of multiprocessing, advanced hardware techniques, and optimized data schemes, we demonstrate substantial improvements in data handling efficiency.
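As a rough illustration of the multiprocessing idea, the sketch below splits a dense dataset into row ranges and lets each worker process open the file read-only and reduce its own range. It assumes the third-party h5py library, and the function names (`plan_slices`, `block_mean`, `parallel_block_means`), the per-block mean, and the default worker count are hypothetical choices made for this example, not the speaker's implementation.

```python
from multiprocessing import Pool


def plan_slices(n_rows, chunk_rows):
    """Split [0, n_rows) into consecutive (start, stop) ranges of chunk_rows rows."""
    if chunk_rows <= 0:
        raise ValueError("chunk_rows must be positive")
    return [(start, min(start + chunk_rows, n_rows))
            for start in range(0, n_rows, chunk_rows)]


def block_mean(task):
    """Worker: open the file read-only and reduce one row range to its mean."""
    # Third-party dependency (assumed installed); imported lazily so that
    # plan_slices can be used without it.
    import h5py

    path, dataset_name, start, stop = task
    # Each worker opens its own read-only handle instead of sharing one.
    with h5py.File(path, "r") as f:
        block = f[dataset_name][start:stop]  # reads only the requested rows
    return float(block.mean())


def parallel_block_means(path, dataset_name, n_rows, chunk_rows, workers=4):
    """Compute one mean per row range using a pool of worker processes."""
    tasks = [(path, dataset_name, start, stop)
             for start, stop in plan_slices(n_rows, chunk_rows)]
    with Pool(workers) as pool:
        return pool.map(block_mean, tasks)
```

Choosing `chunk_rows` to match the dataset's on-disk chunk size is one plausible way to keep each worker's read aligned with the storage layout; whether that pays off depends on the data and the hardware.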

The session will outline strategies for implementing HDF in two distinct case studies: Distributed Acoustic Sensing (DAS), which produces dense data streams, and corrosion detection in 3D point clouds, which illustrates the challenges of managing sparse data. These strategies not only accelerate processing but also improve scalability and responsiveness, both of which are key to effective data analysis and visualization in Python.
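To make the dense versus sparse distinction concrete: HDF5 has no built-in sparse array type, so sparse data is commonly stored as a few dense arrays, such as a list of occupied coordinates plus a matching list of values. The sketch below shows that coordinate-list (COO) layout in plain Python for a small voxel grid. The function names and the nested `[z][y][x]` grid are assumptions for illustration only; the talk's actual storage scheme for point clouds may differ.

```python
def voxels_to_coo(grid):
    """Flatten a nested [z][y][x] grid into parallel lists of
    (x, y, z) coordinates and values, keeping only nonzero cells."""
    coords, values = [], []
    for z, plane in enumerate(grid):
        for y, line in enumerate(plane):
            for x, value in enumerate(line):
                if value != 0:
                    coords.append((x, y, z))
                    values.append(value)
    return coords, values


def coo_lookup(coords, values, point, default=0):
    """Return the value stored at point, or default if the cell is empty."""
    for coord, value in zip(coords, values):
        if coord == point:
            return value
    return default


# A 2 x 2 x 2 grid with only two occupied cells.
grid = [
    [[0, 0], [0, 5]],  # z = 0
    [[7, 0], [0, 0]],  # z = 1
]
coords, values = voxels_to_coo(grid)
```

Written to an HDF file, `coords` and `values` would become two ordinary datasets, so the eight-cell grid above needs only two stored entries.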


Participants will leave with actionable insights into:
- When to use HDF for data storage.
- Efficient HDF implementations that cater to both dense and sparse data scenarios.
- Techniques to harness multiprocessing and exploit hardware for HDF in Python.
- Practical examples from the two case studies.

Khairulmizam is a computer engineering researcher at Universiti Putra Malaysia, a hardware tinkerer, a Python and C programmer, and a Linux user.