Welcome to my personal page

My name is Oleksii Kuchaiev.

I currently work as Director of Applied Research on the AI Applications team at NVIDIA. My main research interests are model alignment and parameter-efficient customization of foundation models.

I received my Ph.D. (2010) and M.Sc. (2009) degrees in Information and Computer Science from the University of California, Irvine. My advisor was Prof. Natasa Przulj. I also got M.Sc. (2007) and B.Sc. (2005) degrees in Applied Mathematics from National Taras Shevchenko University of Kiev, Ukraine.

I enjoy almost all types of outdoor activities, especially: surfing, riding motorcycles, sailing, and hiking.

Posts:

Mar 1, 2020 - How to get a machine learning internship

I am frequently asked this question - “How to get a machine learning / deep learning internship?” Because the answer I give is somewhat standard and rather long, I’ve decided to write this post. I hope that it will help someone I don’t know to get an internship in this exciting field.

Why you might find my advice helpful?

I have been working in machine learning related roles for several years at Microsoft, Apple and, currently, NVIDIA. At Apple and NVIDIA combined, I have interviewed hundreds of candidates and have mentored several interns. Some students I know even got ML internships by following the advice below :)

Why you might find my advice not so useful?

As I mentioned above, my experience is at large tech companies. If you are looking for startup experience, I guess some things will be different. However, I’m pretty sure that the fundamentals remain the same. Also, I am writing this in 2020 - so if you are reading this several years later, some things have likely changed.

Fundamentals

The fundamental qualifications you will need are:

  1. Being a strong engineer/programmer
  2. Understanding of machine learning / deep learning

Point number (1) above is super important and must not be overlooked. You must have strong computer science fundamentals and hands-on experience with Python and, preferably, C++ as well.

Formally, internships (not just ML/DL ones) typically require that you be enrolled as a university student both before and after the internship.

How to prepare

You should make yourself a stronger programmer and improve your understanding of deep learning.

Start by making the best use of the courses and resources offered at your school/university. For example, does your program offer the following fundamentals: linear algebra, calculus, optimization, algorithms/data structures, probability, statistics, mathematical logic, programming languages, etc.? If yes - do take those courses and make sure you are learning a lot from them. If you are luckier and your program has deep learning courses, take them as well. On our team (applied research), for example, we look for people who took at least two deep learning specific university courses (typically fundamentals plus some more focused courses, such as NLP).

MOOCs are an incredibly useful resource these days. When I look at someone’s resume, I actually don’t care where they took their deep learning courses: at Stanford/CMU/MIT or on Coursera/Udacity. A lot of my colleagues share this opinion.

Here is a tiny sample list of MOOCs which will make you a stronger candidate for an ML/DL internship:

You have to take those courses/specializations “for real” - do all the homework and, ideally, earn their certifications.

There are a lot of great blog posts by various people and teams which can be very useful while preparing for the interview. For example, have a look at Sebastian Ruder’s blog posts on Optimization and NLP’s ImageNet Moment.

As I mentioned above, you also have to be a good programmer. If you don’t have a strong CS background - MOOCs can help you there as well! There are plenty of courses on algorithms, data structures and programming languages (take Python). However, the most important thing you should do is practice solving programming interview questions on sites like GeeksforGeeks and LeetCode. You should understand what time and space complexity are, and when and how to use particular data structures. You will almost certainly get a coding question during your interview, so make sure you are prepared. I think nowadays “puzzle” or “trick” questions are asked less often, and the question you’ll get will most likely test whether you understand how and when to use certain data structures and/or algorithmic techniques (greedy method, sorting, dynamic programming). A former colleague of mine, Chip Huyen, is working on a book about machine learning interviews - check it out.
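To make the point about data structures and complexity concrete, here is a minimal sketch in Python. The problem (find two numbers in a list that add up to a target) and the function names are only an illustration chosen for this post, not a question from any particular company. Both functions return a valid pair of indices, but the second one trades extra memory for speed by using a dictionary.

```python
from typing import Dict, List, Optional, Tuple


def two_sum_brute_force(nums: List[int], target: int) -> Optional[Tuple[int, int]]:
    """Check every pair of positions: O(n^2) time, O(1) extra space."""
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return i, j
    return None


def two_sum_hash_map(nums: List[int], target: int) -> Optional[Tuple[int, int]]:
    """Remember values seen so far in a dict: O(n) time, O(n) extra space."""
    seen: Dict[int, int] = {}  # value -> index where it was first seen
    for j, value in enumerate(nums):
        complement = target - value
        if complement in seen:
            return seen[complement], j
        # Keep the earliest index for each value.
        seen.setdefault(value, j)
    return None


if __name__ == "__main__":
    print(two_sum_brute_force([2, 7, 11, 15], 9))
    print(two_sum_hash_map([2, 7, 11, 15], 9))
```

In an interview, being able to explain why the second version is faster and what it costs in memory usually matters as much as writing it. When several pairs match, the two functions may return different (but equally valid) pairs.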

It is very important to practice. Pick a deep learning framework (I recommend PyTorch, but TensorFlow 2.x might be a good option as well) and start implementing and training models. To train anything more serious than a tiny toy model, you will need access to specialized hardware (consumer CPUs aren’t great at deep learning in general).

  • Google Colaboratory is an excellent resource. It lets you play with GPUs and TPUs right from your browser! It is free and they recently started offering a paid version which gives you more reliable access to GPUs.
  • The most flexible option for development is to have an NVIDIA GPU in your desktop (disclaimer: I work at NVIDIA). A good gaming card (Pascal or, better, Turing-based) will get you pretty far.
  • AWS or Azure with GPU instances is another option. They are not very convenient for hacking around, but they are great choices for training an already debugged model and for production.

How to apply

The three best routes (by far) are:

  1. Get recommended by someone who already works there.
  2. Have your advisor recommend you to an industry lab with which they have some kind of collaboration.
  3. Apply at the company’s booth during one of the top deep learning conferences.

If you are lucky enough to be a student at one of the top US universities, another good route is to attend and apply during a career fair at your school (this is how I got into Microsoft). A lot of companies attend career fairs of top universities every year, if not every quarter.

Another option is to simply find a job posting on the company’s website and apply there. Obviously, this is the route with the lowest probability of getting an interview. It is still realistic, but do have a look at the “How to stand out from the crowd” section below.

How to stand out from the crowd

There are several great ways to stand out from the crowd:

  1. Share your projects on GitHub. Something as simple as re-implementing common models/algorithms or class projects you did is already good. A personal hobby project is better. Try contributing to bigger open-source projects as well. We had interns whom we contacted first after seeing their PRs to one of our projects. When I see someone’s resume, I always look for a GitHub profile link and visit it.

  2. Participate in programming and data science competitions. University is a great time to try your hand at programming, math or machine learning contests: ICPC, TopCoder Open, Kaggle, Imagine Cup, etc. There might be some specific to your school or field of research - do check them out! You don’t have to be the best in the world in any of them. Any achievement is worth mentioning on your resume. Olympiads in Math and Physics are also a good way to stand out. Also, as a former Imagine Cup competitor, I’d like to add that the preparation and the experience of competing are very much worth the effort in themselves.

  3. Publications. This is more of a requirement if you are a Ph.D. student. Having published a deep learning related paper is certainly a plus, especially at top venues such as NeurIPS, ICLR, ICML, CVPR, etc. Even a workshop poster helps, particularly if it is related to the area of research of the lab you are applying to.

The process

The process for an internship is pretty simple (with small variations from company to company). After you apply, a recruiter may contact you for an “exploratory” call. After that, there are typically two technical phone screens during which you will code and answer machine learning related questions. The process may vary a little if you’ve applied at a conference or career fair, in which case an engineer might do a technical screen with you sooner and/or in person. If you are a Ph.D. student, you might be asked to come in and do a presentation about your research.

Some myths about getting ML/DL internships

One has to be a Ph.D. student to get a deep learning internship

In many cases being a graduate student is preferable and, for some roles (especially at research labs), being a Ph.D. student can be a requirement. However, this is not a requirement across the field and there are many (even research) teams who will be happy to welcome a strong undergrad for an internship. On our team we had undergrad interns who during their internship produced and published research at top conferences.

You have to be a student at a top CS university

This certainly helps, but it is absolutely not a requirement. Many (myself included) will accept MOOC-based deep learning courses in lieu of similar courses at a top school.

You have to be in the US to apply to a US-based internship

This is a very common misconception. You typically can apply for internships at top companies like NVIDIA, Apple, Google, Microsoft, etc. from anywhere in the world, even if the job description doesn’t explicitly mention that international candidates can apply. Also, keep in mind that big companies might have offices with interesting internship positions outside the US as well.

Your interviewer wants you to fail

This is a weird one. On the contrary - your interviewer wants you to succeed. And it is quite disappointing when the candidate is unable to answer questions.

TL;DR

Take Deep Learning and Machine Learning courses at your school and MOOCs such as Coursera. Have a GitHub presence and contribute to (or create) open-source projects. Participate in programming and machine learning competitions. Practice solving coding problems (nothing too fancy). Apply!

Feb 21, 2020 - My interview on mentorstudents.org

There is a very interesting project - “mentorstudents.org”. It tries to help students get started on their career paths by connecting them with, and interviewing, people who are already in the industry and are willing to share some advice.

I’m happy to have been interviewed by them as well and here is a link to my interview. Btw, the results of automatic speech recognition (ASR) on my interview are hilarious :). I especially like that “my major was a brightness magics”. This is just a reminder that ASR still has a long way to go, especially for heavily accented speech (like mine).

Dec 13, 2019 - NeMo talk at MLSys@NeurIPS Workshop

Recently, I gave a short talk about NeMo at the MLSys workshop at NeurIPS 2019 in Vancouver, Canada. NeMo is the project I actively work on at NVIDIA. It is open-source and we welcome external contributors.