FAQ: Data Science Bootcamps

For the past two years, I’ve been mentoring aspiring data scientists through Springboard’s Data Science Career Track Prep course. I myself graduated from Springboard’s Career Track program back in 2017, so I know firsthand how intimidating it can be to try to rebrand yourself as a data scientist without a traditional degree.

I’ve noticed my mentees often have the same questions as their peers concerning the data science industry and the bootcamp track they’ve chosen. I’m compiling a list of these FAQs for all aspiring data scientists who are considering a bootcamp or just want my take on breaking into this field.

What advice do you have about searching for my first job?

I highly recommend that your first job as a data scientist (or data analyst) be at an organization large enough to already have a data team in place. You’ll need a mentor (either informally attached or formally assigned) in those first few years, and that sexy itty-bitty start-up you found on AngelList isn’t going to provide one.

I also advise that you read the job descriptions very closely. The title of “data scientist” can carry some prestige so companies will slap it on roles that aren’t actually responsible for any real data science.

Even if you think the job description outlines the kind of role you’re interested in, ask lots of probing questions in the interview around the day-to-day responsibilities of the job and the structure of the team. You need to understand what you’re agreeing to.

Some suggested questions:

    • What do you expect someone in this role to achieve in the first 30 days? The first 90?

    • What attributes do you think are necessary for someone to be successful in this role?

    • How do you measure this role’s impact on the organization?

    • How large is the team and what is the seniority level of the team members?

    • Is your codebase in R or Python?

Don’t be afraid to really dig deep in these conversations. Companies appreciate candidates who ask thoughtful questions about the role, and demonstrating that you’re a results-oriented go-getter will separate you from the pack.

What’s the work/life balance like for data scientists?

I think I can speak for most data scientists here when I say work/life balance is pretty good. Of course, situations can vary.

But I believe data scientists occupy a sweet spot in the tech industry for two reasons:

    1. We don’t usually own production code. This means that, unlike software engineers, we aren’t on call to fix an issue at 2 am.

    2. Our projects tend to be long-term and more research-oriented so stakeholders typically don’t expect results within a few days. There are definitely exceptions to this but at the very least, working on a tight deadline is not the norm.

That being said, occasionally I do work on the weekends due to a long-running data munging job or model training. But those decisions are entirely my own, and overall I think the data scientist role is pretty cushy.

Besides Python, what skills should I focus on acquiring?

Contrary to what a lot of the internet would have you believe, I really don’t think there’s a standard skillset for data scientists outside of basic programming and statistics.

In reality, data scientists exist along a spectrum.

On one end, you’ve got the Builders. These kinds of data scientists have strong coding skills and add value to an organization by creating or standardizing data pipelines, building in-house analytical tools, and/or ensuring reproducibility of projects through good software engineering principles. I associate with this end of the spectrum.

On the other end, we have the Analysts. These data scientists have a firm grasp of advanced statistical methods or other mathematical subjects. They spend most of their time exploring complex feature creation techniques such as signal processing, analyzing the results of A/B experiments, and applying the latest cutting-edge algorithms to a company’s data. They usually have an advanced degree.

It is very rare to find someone who truly excels at both ends of the spectrum. Early on in your career, you might be a generalist without especially strong skills on either end. But as you progress, I’d recommend specializing on one end or the other. This kind of differentiating factor is important to build your personal brand as a data scientist.

Don’t I need a PhD to become a data scientist?

It depends.

If your dream is to devise algorithms that guide self-driving cars, then yes, you’ll need a PhD. On the other hand, if you’re just excited about data in general and impacting organizations through predictive analytics, then an advanced degree is not necessary.

Sure, a quick scroll on LinkedIn will show a lot of data scientist job postings that claim to require at least a master’s degree. But there are still plenty of companies that are open to candidates with only a bachelor’s degree. Just keep searching. Or better yet, mine your network for a referral. This is hands-down the best way to land a job.

Another route is to get your foot in the door through a data analyst position. These roles rarely require an advanced degree but you’ll often work closely with data scientists and gain valuable experience on a data team.

Many companies will also assign smaller data science tasks to analysts as a way to free up data scientists’ time to work on long-term projects. Leverage your time as an analyst to then apply for a data scientist role.

I’m considering applying for a master’s program instead of going through the bootcamp program. What do you suggest?

I’m hesitant about data science master’s programs.

When you commit to a master’s degree, you’re delaying your entry into the job market by two years. The field of data science is changing so rapidly that two years is a long time. The skills hiring managers are looking for and the tools data teams use may have shifted significantly in that time.

My personal opinion is that the best way to optimize your learning curve is to gather as much real-world experience as possible. That means getting a job in the data field ASAP.

Additionally, the democratization of education over the past decade now means that high-quality classes are available online for a tiny fraction of the price traditional universities charge their students. I’m a huge fan of platforms like Coursera and MIT OCW. My suggested courses from these sites are listed in the self-study guide below.

At the end of the day, companies are just relying on these fancy degrees as a proxy for your competence to do the job. If you can show that competence in other ways, through job experience or a project portfolio, shelling out tens of thousands of dollars and multiple years of your life is not necessary.

I’m planning on working full-time while enrolled in the bootcamp program. Do you think that’s doable?

Yes.

But it will require discipline and long unbroken stretches of time. You won’t be able to make any meaningful progress if you’re only carving out time to work on the bootcamp from 5-6 pm every weeknight. Your time will be much better spent if you can set aside an entire afternoon (or better yet, an entire weekend) to truly engage with the material.

Committing to the bootcamp requires a re-prioritization of your life. As a saying often attributed to Henry Ford goes, “If you always do what you’ve always done, you’ll always get what you’ve always got.”

What advice do you have about the capstone projects?

Find data first.

I recommend Google’s dataset search engine or AWS’s open data registry. Municipal governments also do a surprisingly good job of uploading and updating data on topics ranging from employee salaries to pothole complaints.

Once you’ve found an interesting dataset, then you can start to formulate a project idea. For example, if you have salary data for municipal employees, you could analyze how those salaries vary by a city’s political affiliation. Are employees in Democratic-controlled areas paid more than those in Republican-controlled areas, or vice versa? What other variables could affect this relationship?

Once you understand those interactions, you could train a machine learning model to predict salary based on a variety of inputs, from years of experience to population density.
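To make that workflow concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the records are made up, the column layout and the names `mean_salary_by_group`, `fit_line`, and `predict_salary` are hypothetical, and a one-variable least-squares line stands in for whatever model you would actually train on a real dataset.

```python
from collections import defaultdict
from statistics import mean

# Made-up illustrative records: (city_control, years_experience, salary_usd).
# A real project would load these from a municipal open-data export instead.
records = [
    ("D", 2, 52000),
    ("D", 5, 61000),
    ("D", 10, 76000),
    ("R", 3, 50000),
    ("R", 6, 59000),
    ("R", 12, 77000),
]


def mean_salary_by_group(rows):
    """Step 1: compare average salary across political control."""
    groups = defaultdict(list)
    for control, _, salary in rows:
        groups[control].append(salary)
    return {control: mean(salaries) for control, salaries in groups.items()}


def fit_line(xs, ys):
    """Step 2: ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


group_means = mean_salary_by_group(records)

years = [row[1] for row in records]
salaries = [row[2] for row in records]
slope, intercept = fit_line(years, salaries)


def predict_salary(years_experience):
    """Predict salary from years of experience using the fitted line."""
    return slope * years_experience + intercept


print(group_means)
print(round(predict_salary(8)))
```

With real data you would add more inputs (population density, job title, and so on) and hold out a test set before trusting any prediction.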

It is much harder to start with an idea and then scour the internet trying to find the perfect publicly available dataset. Make your life easier, and find the data first.

Closing

All that being said, the most important piece of advice I can give is to have fun.

You’re making this career change for a reason, and if you’re not enjoying the learning process, then you might be on the wrong path. Becoming a data scientist is not about the money or the prestige—it’s about the delight of solving puzzles, the joy of discovering patterns, the gratification of making a measurable impact. I sincerely hope that you find your career in data as satisfying as I’ve found mine so far.

So buckle up and enjoy the ride!

A guide for data science self-study

My path to becoming a data scientist has been untraditional.

After receiving a B.S. in chemical engineering, I worked as a process engineer for a wide range of industries, designing manufacturing facilities for products as varied as polyurethanes, pesticides, and Grey Poupon mustard.

Tired of long days on my feet starting up production lines and longing for an intellectual challenge, I discovered data science in 2017 and decided to pivot my career.

I participated in Springboard’s part-time online bootcamp and managed to land a job as a junior data scientist shortly thereafter. But my journey to learning data science was really only just beginning.

Besides the bootcamp, all my data science skills are self-taught. Fortunately, today’s era of education democratization has made that kind of path possible. For those who are interested in pursuing their own course of self-study, I’m including my recommended classes/resources below.

Python

MIT OCW’s Introduction to Computer Science and Programming

I enjoy lecture-style classes with corresponding problem sets, and I thought this class catapulted my Python skills farther and faster than a lot of the online interactive courses like DataCamp.

Additionally, this course covers more advanced programming topics like recursion that, at the time, I hadn’t thought were necessary for data scientists. Fast-forward six months, and I was asked a question on recursion during the interview for my first data science job! I was so grateful to this course for providing a really solid education in Python and general coding practices.

Algorithms and Data Structures

Coursera’s Data Structures and Algorithms Specialization

I only completed the first two courses (Algorithmic Toolbox and Data Structures) of the specialization but I don’t believe the more advanced topics are necessary for your average data scientist.

I can’t recommend these courses highly enough. I had originally enrolled just hoping to become more conversant with common algorithms like breadth-first search, but I found myself using these concepts and ways of thinking at my job.
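For readers who haven’t met breadth-first search yet, here is a minimal sketch of the idea. It is not taken from the course; the function name `bfs_order` and the toy graph are hypothetical and only meant to show the pattern of visiting nodes level by level with a queue.

```python
from collections import deque


def bfs_order(graph, start):
    """Return nodes in the order breadth-first search visits them."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()  # take the oldest discovered node first
        order.append(node)
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)  # mark on discovery to avoid repeats
                queue.append(neighbor)
    return order


# Toy adjacency list, made up for illustration.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(bfs_order(graph, "A"))
```

The queue is what makes the search breadth-first: swapping it for a stack would turn the same loop into a depth-first traversal.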

The professors who designed these online classes have done a fantastic job of incorporating games to improve your intuition about a strategy and designing problem sets that force you to truly understand the material. There’s no fill-in-the-blank here—you’re given a problem and you must code up a solution.

I also recommend starting with the Introduction to Discrete Mathematics for Computer Science specialization, even if you already have a technical background. You’ll want to make sure you have a solid foundation in those concepts before undertaking the DS&A specialization.

Linear Algebra

MIT OCW’s Linear Algebra

I needed a refresher on linear algebra after barely touching a matrix in the ten years since my university days. And MIT’s videotaped lectures from 2010, with associated homework and quizzes, were a great way to cover the basics.

The quality of this course is entirely thanks to Professor Gilbert Strang, who is passionate about linear algebra and passionate about teaching (a rare combination). He covers this subject at an approachable level that doesn’t require much complicated math.

I did supplement this course with 3Blue1Brown’s YouTube series on Linear Algebra. These short videos can really help visualize some of these concepts and build intuition.

Machine Learning

Andrew Ng’s Machine Learning

Taking this course is almost a rite of passage for anyone choosing to learn data science on their own. Professor Andrew Ng manages to convey the mathematics behind the most common machine learning algorithms without intimidating his audience. It’s a wonderful introduction to the ML toolbox.

My one gripe with this course is that I didn’t feel like the homework really added to my understanding of the algorithms. Most of the assignments required me to fill in small pieces of code, which I was able to do without fully comprehending the big picture. I took the course in 2017, however, so it’s possible this aspect of the class has improved.

Deep Learning

fastai’s Practical Deep Learning for Coders

Andrew Ng’s Deep Learning course is just about as popular as the Machine Learning course I recommended above. But after completing his DL class, I only had a vague understanding of how neural networks are constructed without any idea of how to train one myself.

The folks at fastai take the opposite approach. They give you all the tools to build a neural network in the first few lessons and then spend the remaining chapters opening up the black box and discussing how to improve performance. This is a much more natural way of learning and leads to better retention upon course completion.

There are videotaped lectures discussing these concepts but I would recommend just reading the book because the lectures don’t add any new material. The book is actually a series of Jupyter notebooks, allowing you to run and edit the code.

Closing

I will warn that this path is not for everyone. There were many times when I wished I could work through a problem with a classmate or dig deeper into a concept with the professor. Online discussion forums for these kinds of classes are not the same as real-time feedback. Perseverance, self-reliance, and a lot of Googling are all necessary to get the most out of a self-study program.

The variety of skills and knowledge data scientists are supposed to have can be overwhelming to newcomers in this field. But just remember—no one knows it all! Simply embrace your identity as a lifetime learner, and enjoy the journey.