DATA

In Spark, the query execution plan is the entry point to understanding how a Spark query is executed. This matters most when debugging or investigating heavy workloads, or when a job takes a long time to run. Understanding the query plan is the first step towards optimizing Spark code.

If we look at the query execution page, we see terms like Tasks, Stages, and Jobs. Let’s try to understand these terms before we move further.

A Task is a single operation applied to a single partition. Each task is executed…


SCALA

A guide to dealing with Dependency Hell in Sbt

Dependency hell is a situation that occurs when a software application cannot access the additional programs it requires in order to work. In software development, these additional programs are called dependencies. Sometimes known as JAR hell or classpath hell, dependency hell commonly results in software behaving abnormally, bugs, error messages when trying to run or install software, or the software ceasing to function.

How did we get here?

Say your application uses two libraries, lib-a and lib-b. Both these libraries use a shared library base-lib.

Initially, everything works smoothly.


AGILE

The Daily Scrum is an essential event for inspection and adaptation, run daily to ensure that the Scrum Team is on track to achieve the Sprint Goal. It helps create transparency, thus enabling inspection of the team's progress.

Typically, a good Scrum Team won’t need more than 10 to 15 minutes to inspect its progress towards the Sprint Goal. Given this short period, it is interesting to observe that a lot of strange personas unknowingly obstruct the smooth running of this event. Let us discuss a few of those personas and see how we can tackle them.

The Late Guy / The No-Show Guy


AGILE

The retrospective is a ceremony held at the end of each Sprint where team members collectively analyze how things went in order to improve the process for the next Sprint.

The purpose of the Sprint Retrospective is to:

  • Inspect how the last Sprint went with regard to people, relationships, process, and tools
  • Identify and order the major items that went well and potential improvements
  • Create a plan for implementing improvements to the way the Scrum Team does its work.

It provides a formal opportunity to inspect and adapt the way your Scrum Team works. …


CODING

Problem:

Suppose an array sorted in ascending order is rotated at some pivot unknown to you beforehand.

(i.e., [0,1,2,4,5,6,7] might become [4,5,6,7,0,1,2]).

Find the minimum element.

You may assume no duplicate exists in the array.

Example 1:

Input: [3,4,5,1,2] 
Output: 1

Example 2:

Input: [4,5,6,7,0,1,2]
Output: 0

My Solution:

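from typing import List

def findMin(nums: List[int]) -> int:
    left, right = 0, len(nums) - 1
    # Loop only while the window [left, right] still contains the rotation point.
    while nums[left] > nums[right]:
        middle = (left + right) // 2
        if nums[middle] < nums[right]:
            # The right part is sorted, so the minimum is at middle or before it.
            right = middle
        else:
            # middle lies in the larger left part, so the minimum is after it.
            left = middle + 1
    return nums[left]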

Explanation:

We use a modified version of binary search to find the “Inflection Point”.

The…
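To make the narrowing of the search window concrete, here is a minimal sketch that records each (left, middle, right) step of the same binary search. The function name find_min_with_trace and the steps list are introduced only for illustration and are not part of the original solution.

```python
from typing import List, Tuple


def find_min_with_trace(nums: List[int]) -> Tuple[int, List[Tuple[int, int, int]]]:
    """Return the minimum of a rotated sorted list and the windows examined."""
    left, right = 0, len(nums) - 1
    steps: List[Tuple[int, int, int]] = []  # hypothetical trace, for illustration
    # The loop runs only while the window still straddles the inflection point.
    while nums[left] > nums[right]:
        middle = (left + right) // 2
        steps.append((left, middle, right))
        if nums[middle] < nums[right]:
            # Right part is sorted: the minimum is at middle or to its left.
            right = middle
        else:
            # middle is in the larger left part: the minimum is to its right.
            left = middle + 1
    return nums[left], steps


minimum, steps = find_min_with_trace([5, 1, 2, 3, 4])
print(minimum)  # 1
print(steps)    # [(0, 2, 4), (0, 1, 2), (0, 0, 1)]
```

Each step roughly halves the window, so the search needs O(log n) comparisons. On an input that is not rotated at all, the loop body never runs and the first element is returned.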


CODING

Problem:

A conveyor belt has packages that must be shipped from one port to another within D days.

The i-th package on the conveyor belt has a weight of weights[i]. Each day, we load the ship with packages on the conveyor belt (in the order given by weights). We may not load more weight than the maximum weight capacity of the ship.

Return the least weight capacity of the ship that will result in all the packages on the conveyor belt being shipped within D days.

Example 1:

Input: weights = [1,2,3,4,5,6,7,8,9,10], D = 5
Output: 15
Explanation: A ship…
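One common way to approach this problem is a binary search over candidate capacities. The sketch below is an assumption about how it could be solved, not necessarily the solution this post goes on to give; the names ship_within_days and days_needed are made up for illustration.

```python
from typing import List


def ship_within_days(weights: List[int], days: int) -> int:
    """Smallest capacity that ships all packages, in order, within `days` days."""

    def days_needed(capacity: int) -> int:
        # Greedily fill each day until the next package would not fit.
        used, load = 1, 0
        for w in weights:
            if load + w > capacity:
                used += 1
                load = 0
            load += w
        return used

    # The capacity must fit the heaviest package; the total weight always suffices.
    low, high = max(weights), sum(weights)
    while low < high:
        mid = (low + high) // 2
        if days_needed(mid) <= days:
            high = mid      # mid works, try something smaller
        else:
            low = mid + 1   # mid is too small
    return low


print(ship_within_days([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 5))  # 15
```

The search works because feasibility is monotonic: if a capacity is enough to finish in D days, every larger capacity is enough as well.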


CODING

Problem:

Design a stack that supports push, pop, top, and retrieving the minimum element in constant time.

  • push(x) — Push element x onto stack.
  • pop() — Removes the element on top of the stack.
  • top() — Get the top element.
  • getMin() — Retrieve the minimum element in the stack.

Example 1:

Input
["MinStack","push","push","push","getMin","pop","top","getMin"]
[[],[-2],[0],[-3],[],[],[],[]]
Output
[null,null,null,null,-3,null,0,-2]
Explanation
MinStack minStack = new MinStack();
minStack.push(-2);
minStack.push(0);
minStack.push(-3);
minStack.getMin(); // return -3
minStack.pop();
minStack.top(); // return 0
minStack.getMin(); // return -2

Constraints:

  • The pop, top, and getMin operations will always be called on non-empty stacks.

My Solution:

class MinStack:
    def __init__(self):
        self.topNode …
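The listing above is cut off in the source, so the following is an independent sketch rather than a reconstruction of it. It shows one common way to get a constant-time getMin: store, next to every value, the minimum of the stack up to that point. The class name MinStackSketch and the _items list are assumptions made for illustration.

```python
class MinStackSketch:
    """Stack where each entry remembers the minimum of everything below it."""

    def __init__(self) -> None:
        self._items = []  # list of (value, minimum so far) pairs

    def push(self, x: int) -> None:
        current_min = x if not self._items else min(x, self._items[-1][1])
        self._items.append((x, current_min))

    def pop(self) -> None:
        # Matches the problem statement: removes the top element, returns nothing.
        self._items.pop()

    def top(self) -> int:
        return self._items[-1][0]

    def getMin(self) -> int:
        return self._items[-1][1]


stack = MinStackSketch()
stack.push(-2)
stack.push(0)
stack.push(-3)
print(stack.getMin())  # -3
stack.pop()
print(stack.top())     # 0
print(stack.getMin())  # -2
```

The trade-off is one extra integer per element in exchange for never having to rescan the stack after a pop.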


CODING

Problem:

You are given two integer arrays nums1 and nums2 sorted in ascending order and an integer k.

Define a pair (u,v) which consists of one element from the first array and one element from the second array.

Find the k pairs (u1,v1), (u2,v2), …, (uk,vk) with the smallest sums.

Example 1:

Input: nums1 = [1,7,11], nums2 = [2,4,6], k = 3
Output: [[1,2],[1,4],[1,6]]
Explanation: The first 3 pairs are returned from the sequence:
[1,2],[1,4],[1,6],[7,2],[7,4],[11,2],[7,6],[11,4],[11,6]

Example 2:

Input: nums1 = [1,1,2], nums2 = [1,2,3], k = 2
Output: [[1,1],[1,1]]
Explanation: The first 2 pairs are returned from the sequence:
[1,1],[1,1],[1,2],[2,1],[1,2],[2,2],[1,3],[1,3],[2,3]
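The two examples above can be reproduced with a min-heap that explores pairs in order of increasing sum. This is a sketch of one well-known approach, offered as an assumption rather than as the solution the post itself arrives at; the function name k_smallest_pairs is made up for illustration.

```python
import heapq
from typing import List


def k_smallest_pairs(nums1: List[int], nums2: List[int], k: int) -> List[List[int]]:
    """Return up to k pairs [u, v] with the smallest sums, u from nums1, v from nums2."""
    if not nums1 or not nums2 or k <= 0:
        return []
    # Seed the heap with each of the first k elements of nums1 paired with nums2[0].
    heap = [(nums1[i] + nums2[0], i, 0) for i in range(min(k, len(nums1)))]
    heapq.heapify(heap)
    result: List[List[int]] = []
    while heap and len(result) < k:
        _, i, j = heapq.heappop(heap)
        result.append([nums1[i], nums2[j]])
        # The next candidate in the same row is nums1[i] with the next nums2 value.
        if j + 1 < len(nums2):
            heapq.heappush(heap, (nums1[i] + nums2[j + 1], i, j + 1))
    return result


print(k_smallest_pairs([1, 7, 11], [2, 4, 6], 3))  # [[1, 2], [1, 4], [1, 6]]
print(k_smallest_pairs([1, 1, 2], [1, 2, 3], 2))   # [[1, 1], [1, 1]]
```

Because both arrays are sorted, the heap never holds more than min(k, len(nums1)) entries, which avoids generating all pairs up front. When several pairs share the same sum, their relative order may differ from the sequences listed in the examples.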

Example 3:


DATA

So, yesterday I tried to set up Airflow for a pet project.

My project basically needed Airflow to run a job every 5 minutes to pull data from various sources, possibly transform it, and write it to an Elasticsearch index.

I wanted to dockerize it so that I could deploy the entire setup easily on any machine. This way I can share my project with anyone, and they can set it up on their machine and get started.

That was the goal.

Before I start, let me brief you on some key concepts and terminology in Airflow.

Terminology

A DAG is…


CODING

Problem:

Given an array nums containing n + 1 integers where each integer is between 1 and n (inclusive), prove that at least one duplicate number must exist. Assuming that there is only one duplicate number, find it.

Example 1:

Input: [1,3,4,2,2]
Output: 2

Example 2:

Input: [3,1,3,4,2]
Output: 3

Note:

  1. You must not modify the array (assume the array is read only).
  2. You must use only constant, O(1) extra space.
  3. Your runtime complexity should be less than O(n^2).
  4. There is only one duplicate number in the array, but it could be repeated more than once.
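One technique that fits all four notes is Floyd's cycle detection. If every value is read as a pointer to the index it names, the array forms a linked structure, and the duplicate value is the entry point of its cycle. The sketch below is an assumption about a suitable approach, since the author's own solution is not shown in this excerpt; the function name find_duplicate is made up for illustration.

```python
from typing import List


def find_duplicate(nums: List[int]) -> int:
    """Find the repeated value without modifying nums, using O(1) extra space."""
    # Phase 1: advance a slow and a fast pointer until they meet inside the cycle.
    slow = fast = nums[0]
    while True:
        slow = nums[slow]
        fast = nums[nums[fast]]
        if slow == fast:
            break
    # Phase 2: restart one pointer; moving both one step at a time,
    # they meet at the cycle entrance, which is the duplicate value.
    slow = nums[0]
    while slow != fast:
        slow = nums[slow]
        fast = nums[fast]
    return slow


print(find_duplicate([1, 3, 4, 2, 2]))  # 2
print(find_duplicate([3, 1, 3, 4, 2]))  # 3
```

A duplicate must exist by the pigeonhole principle: n + 1 integers are drawn from only n distinct values. The walk never leaves the array because every value lies between 1 and n, and it runs in O(n) time while only reading the input.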

My Solution:

Anjana Sudhir

Senior Software Engineer & Scrum Master @ Agoda (Booking Holdings Inc.)
