Saving a sinking ship
08-24, 13:45–14:25 (Asia/Kuala_Lumpur), JC 1

After joining a new company, Ivan was assigned to a legacy project: a Python task server providing raw data from various sources to multiple upstreams. Things were broken and unstable. How did it end up that way?

Technical debt, lack of visibility, out-of-memory issues, spaghetti code, inconsistent logging and error handling, and more: this is what Ivan was facing. Hear how he and a team of Data DevOps and Data Engineers dealt with these issues.


Ivan worked on a project built around Celery with multiple upstream channels. Many things were unstable, and he and the team spent much of their time firefighting issues caused by technical debt and unreliable upstreams. Some of the issues they faced:

  • multi-step, manual deployments to production
  • no visibility into task runs
  • out-of-memory issues
  • processing data larger than memory
  • spaghetti code from append-only copy-pasting
  • inconsistent logging and error handling
  • handling high-traffic UDP output
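
One common way to tame inconsistent logging and error handling in a task server like this is to centralise both in a single wrapper that every task goes through. The sketch below is illustrative only, not taken from the talk; `logged_task` and `parse_feed` are hypothetical names:

```python
import functools
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("tasks")

def logged_task(func):
    """Wrap a task so every run logs its start, success, and failure uniformly."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("task %s started", func.__name__)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # exception() logs the full traceback, then we re-raise so the
            # task framework still sees the failure
            logger.exception("task %s failed", func.__name__)
            raise
        logger.info("task %s finished", func.__name__)
        return result
    return wrapper

@logged_task
def parse_feed(payload):
    # placeholder for real parsing work on an upstream payload
    return payload.upper()
```

With a wrapper like this, every task emits the same start/finish/failure lines, so log aggregation and alerting no longer depend on each task author remembering to log.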

How did a team of Data DevOps and Data Engineers go about fixing all of this? Saving the sinking ship!

Ivan likes volunteering in communities such as Rust Malaysia and PyCon MY. He used to be a Helix maintainer.