Working with NVIDIA BlueField-2 DPU
I have recently been trying to set up a BlueField-2 DPU and found that not only is documentation scattered and sometimes incomplete, the…
Aug 23
Data Pipeline Backfill on Airflow
I recently worked on backfilling a large amount of historical data (~a few years’ worth) and ran into some Airflow gotchas. Airflow is an…
Mar 3, 2023
Data Pipelining Concepts
I’ve recently been thinking about data pipeline designs. Here are some key concepts or “gotchas” that are frequently not handled…
May 8, 2022
Trying out OpenAI Beta with Poetry Writing App
I signed up for the OpenAI API Beta when GPT-3 was all the rage last year. Having finally gotten out of the waitlist a month back, I…
Jun 19, 2021
Kubernetes pods terminating with error after job is done
I was debugging a behaviour that was not present before we migrated Spark from a custom-provisioned cluster to a Kubernetes cluster. After…
Apr 20, 2021
Using AWS temporary credentials with Hadoop S3 Connector
A few days ago I was trying to get a Spark app to access another AWS account by calling the AWS STS AssumeRole API. Let’s say there are…
Mar 17, 2021