Data science requires an interdisciplinary set of skills, from handling databases and running statistical models to setting up business cases and programming itself. More often than not, technical interviews for data scientists assess knowledge of specific data manipulation APIs such as pandas, sklearn or Spark rather than a programming way of thinking.
While I think that knowledge of the more “applied” APIs should be tested when hiring data scientists, so should knowledge of more traditional programming.
String reversal questions can provide some information about how comfortable a candidate is with handling text in Python and with basic operations.
Question: Reverse the string “the fox jumps over the lazy dog”.
Answer:
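One possible answer is sketched below. It assumes that "reverse" means reversing the characters of the string; since the question could also be read as reversing the word order, a second helper covers that reading. The function names `reverse_string` and `reverse_words` are illustrative.

```python
def reverse_string(text: str) -> str:
    """Return the characters of text in reverse order using slicing."""
    return text[::-1]


def reverse_words(text: str) -> str:
    """Return the words of text in reverse order (alternative reading)."""
    return " ".join(reversed(text.split()))


sentence = "the fox jumps over the lazy dog"
print(reverse_string(sentence))  # god yzal eht revo spmuj xof eht
print(reverse_words(sentence))   # dog lazy the over jumps fox the
```

Slicing with a step of -1 is the idiomatic choice here; `"".join(reversed(text))` would give the same result.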
Assessment:
Question: Identify all the words that are palindromes in the following sentence: “Lol, this is a gag, I didn’t laugh so much in a long time”.
Answer:
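A minimal sketch of one way to answer, assuming that the comparison should ignore case and surrounding punctuation. Whether single-letter words such as "a" and "I" count as palindromes is a judgment call, so the hypothetical `min_length` parameter makes that choice explicit.

```python
import string


def find_palindromes(sentence: str, min_length: int = 1) -> list:
    """Return the palindromic words of sentence, in order of appearance."""
    palindromes = []
    for raw_word in sentence.split():
        # Normalise: drop punctuation at the edges and ignore case.
        word = raw_word.strip(string.punctuation).lower()
        if len(word) >= min_length and word == word[::-1]:
            palindromes.append(word)
    return palindromes


sentence = "Lol, this is a gag, I didn't laugh so much in a long time"
print(find_palindromes(sentence))                # ['lol', 'a', 'gag', 'i', 'a']
print(find_palindromes(sentence, min_length=2))  # ['lol', 'gag']
```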
Assessment:
FizzBuzz is a traditional programming screening question that tests whether a candidate can think through a problem that is more than a simple if-else statement. The approach they take can also shed some light on their understanding of the language.
Question: Write a program that prints the numbers from 1 to 50. For multiples of 2 print fizz instead of the number, for multiples of 3 print buzz, and for numbers that are multiples of both 2 and 3 print fizzbuzz.
Answer:
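One possible answer, sketched with the divisors 2 and 3 used in the question (the classic version uses 3 and 5). The helper name `fizzbuzz_label` is illustrative. The combined case has to be checked first, otherwise multiples of 6 would be caught by the fizz branch.

```python
def fizzbuzz_label(n: int) -> str:
    """Return the text to print for n under the 2/3 variant of FizzBuzz."""
    if n % 2 == 0 and n % 3 == 0:
        return "fizzbuzz"
    if n % 2 == 0:
        return "fizz"
    if n % 3 == 0:
        return "buzz"
    return str(n)


for number in range(1, 51):
    print(fizzbuzz_label(number))
```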
Assessment:
Finding the first duplicate word shows whether candidates know the basics of text processing in Python and can handle some basic data structures.
Question: Given a string, find the first duplicate word. Example string: “this is just a wonder, wonder why do I have this in mind”.
Answer:
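A minimal sketch of one approach, assuming that "first duplicate" means the first word whose repetition is encountered when reading left to right, and that case and surrounding punctuation should be ignored. A set keeps track of the words already seen.

```python
import string


def first_duplicate_word(text: str):
    """Return the first word seen for a second time, or None if there is none."""
    seen = set()
    for raw_word in text.split():
        word = raw_word.strip(string.punctuation).lower()
        if not word:
            continue  # token was punctuation only
        if word in seen:
            return word
        seen.add(word)
    return None


example = "this is just a wonder, wonder why do I have this in mind"
print(first_duplicate_word(example))  # wonder
```

Membership tests on a set take constant time on average, so the whole scan stays linear in the number of words.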
Assessment:
Question: What if we wanted to find the first word with more than 2 duplicates in a string?
Answer:
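One way to extend the previous idea is to count occurrences instead of only remembering which words were seen. The sketch below assumes that "more than 2 duplicates" means a word occurring more than twice; the hypothetical `threshold` parameter makes that reading adjustable.

```python
import string
from collections import Counter


def first_word_repeated_more_than(text: str, threshold: int = 2):
    """Return the first word whose running count exceeds threshold, else None."""
    counts = Counter()
    for raw_word in text.split():
        word = raw_word.strip(string.punctuation).lower()
        if not word:
            continue  # token was punctuation only
        counts[word] += 1
        if counts[word] > threshold:
            return word
    return None


print(first_word_repeated_more_than("a b a b a b"))  # a
```

With `threshold=1` the function reduces to the plain first-duplicate search.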
Assessment:
Some quick-fire questions can also be asked to test general knowledge of the Python language.
Question: Replicate the sum for any number of variables, e.g. sum(1,2,3,4,5..)
Answer:
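A possible answer relies on variadic positional arguments (`*args`). The name `my_sum` is illustrative and is chosen to avoid shadowing the built-in `sum`, which takes a single iterable rather than separate arguments.

```python
def my_sum(*numbers):
    """Add up any number of positional arguments; returns 0 when none are given."""
    total = 0
    for number in numbers:
        total += number
    return total


print(my_sum(1, 2, 3, 4, 5))  # 15
```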
Assessment:
Questions around the Fibonacci series are a classic of programming interviews, and candidates should in general be at least familiar with them. They make it possible to test recursive thinking.
Question: Fibonacci sequences are defined as follows: F(0) = 0, F(1) = 1, and F(n) = F(n-1) + F(n-2) for n > 1.
Write a function that gives the sum of all Fibonacci numbers from 0 to n.
Answer:
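A minimal sketch, assuming the usual convention F(0) = 0 and F(1) = 1, and reading the question as asking for F(0) + F(1) + ... + F(n) for a non-negative n. The recursive version mirrors the definition and is memoised with `functools.lru_cache` so that it does not recompute the same terms; an iterative version is included for comparison.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n: int) -> int:
    """Return the n-th Fibonacci number, with fib(0) = 0 and fib(1) = 1."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


def fib_sum(n: int) -> int:
    """Return fib(0) + fib(1) + ... + fib(n)."""
    return sum(fib(i) for i in range(n + 1))


def fib_sum_iterative(n: int) -> int:
    """Same sum, computed without recursion."""
    total, current, following = 0, 0, 1
    for _ in range(n + 1):
        total += current
        current, following = following, current + following
    return total


print(fib_sum(10))            # 143
print(fib_sum_iterative(10))  # 143
```

Without memoisation the recursive `fib` takes exponential time, which is a useful follow-up point to raise with a candidate.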
Assessment:
These questions are just meant to be a first screener for data scientists and should be combined with statistical and data manipulation questions. They are meant to give a quick glimpse of whether a candidate has the minimum knowledge needed to go through a full round of interviews.
More advanced programming questions for Python would tend to cover the use of generators, decorators, Cython or the efficient use of libraries such as pandas/NumPy.