AMA #1 Questions and Answers.

Your most burning questions from AMA #1.

When do I use [] or ()?

Generally speaking, [] are for creating lists, while () are for tuples.

When applied to an existing object, [] are for subsetting data (indexing or slicing), while () are for calling a function and passing it arguments.
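A minimal Python sketch of both uses of each bracket type. The variable names and values are made up for illustration.

```python
# Square brackets build a list; parentheses build a tuple.
scores = [72, 85, 91]      # list: can be changed after creation
point = (3, 4)             # tuple: cannot be changed after creation

# On an existing object, square brackets subset it.
first_score = scores[0]    # indexing picks one element: 72
top_two = scores[1:]       # slicing picks a range: [85, 91]

# Parentheses call a function and pass its arguments.
highest = max(scores)      # 91
scores.append(60)          # method call that adds 60 to the end of the list
```

Note that the same object can appear with both: `scores[0]` subsets the list, while `max(scores)` passes the whole list to a function.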

You can refer to these handy cheat sheets on DataCamp.

Are Time Series Analyses like ARIMA models or deep learning more widely used in the industry?

It depends on the complexity of the problem and the amount of resources you have (time and computational power). The last thing we want is to use a saw to cut a slab of butter.

Although deep learning methods are becoming more popular for tackling time series problems that ARIMA models cannot solve efficiently, ARIMA models are still widely used in the industry.

Is doing simulations common in data scientist jobs?

It depends on the specific job scope, but it’s pretty common to run simulations for the purpose of generating additional data to train your deep learning networks.
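As a rough illustration of generating extra training data by simulation, here is a minimal sketch using only the standard library. It assumes, purely for the example, that the underlying process is a sine curve with Gaussian measurement noise. The function name `simulate_samples` and its parameters are hypothetical.

```python
import math
import random

def simulate_samples(n_samples, noise_std=0.1, seed=0):
    """Return (x, y) pairs where y = sin(x) plus Gaussian noise.

    The sine curve is an assumed stand-in for whatever process
    you are actually modelling.
    """
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    samples = []
    for _ in range(n_samples):
        x = rng.uniform(0.0, 2.0 * math.pi)
        y = math.sin(x) + rng.gauss(0.0, noise_std)
        samples.append((x, y))
    return samples

# Generate 1000 synthetic points to add to a training set.
synthetic = simulate_samples(1000)
```

In practice the simulator would encode what you know about your own problem, and the simulated data is only as useful as that model is realistic.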

What exactly is Data Science?

Check out this blog article for an insider’s view on what data science is, and how it may be applied to one company.

Can deep learning be used to resolve the multiplication of two features?

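Yes. The fundamental unit in deep learning is the neuron, and a network with enough neurons and non-linear activations can, in theory, approximate any continuous function over a bounded range of inputs to arbitrary accuracy. The product of two features is one such function.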
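To make this concrete, here is a hand-built sketch rather than a trained model. It assumes a squaring activation, which is not a common choice in practice, because it lets two hidden neurons reproduce the product exactly through the identity x1 * x2 = ((x1 + x2)^2 - (x1 - x2)^2) / 4. The function names are hypothetical.

```python
def square(z):
    """Assumed activation function for this illustration."""
    return z * z

def tiny_network(x1, x2):
    # Hidden layer: two neurons, each a weighted sum followed by the activation.
    h1 = square(1.0 * x1 + 1.0 * x2)
    h2 = square(1.0 * x1 - 1.0 * x2)
    # Output neuron: a weighted sum of the hidden activations.
    return 0.25 * h1 - 0.25 * h2

print(tiny_network(3.0, 4.0))  # 12.0
```

With standard activations such as ReLU, a trained network would typically only approximate the product, and only over the range of values it saw during training.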

What are the other essential computing language(s) needed to have a career in AI/data science?

Besides Python or R, it would be good to know bash scripting. Focus on one language and master it first instead of learning the basics of several. If you have to pick one, we recommend Python.

How different in terms of capability are the more widely-used plotting tools like R’s ggplot2, Python’s matplotlib, and non-scripting tools like Tableau?

matplotlib and ggplot2 are usually used to get quick insight into your data while you work. Seaborn is a great library for statistical visualisation; it is built on top of matplotlib, so the two work very well together. Non-scripting tools like Tableau are more often used to present data formally to your audience (although matplotlib can also serve that function).

What benefits do pandas dataframes have over using class objects in organising data?

pandas dataframes come with many built-in functions for slicing, modifying, and analysing a data set. In addition, many libraries work directly with pandas (e.g. scikit-learn for machine learning). If you have ever used a spreadsheet, think of pandas as the equivalent of Excel for Python!
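A short sketch of what those built-in operations look like. The data is made up for illustration, and pandas needs to be installed.

```python
import pandas as pd

# Made-up sales figures, purely for illustration.
df = pd.DataFrame({
    "city": ["Singapore", "Jakarta", "Manila", "Singapore"],
    "sales": [120, 95, 80, 150],
})

# Slicing: keep only the rows where sales exceed 100.
high_sales = df[df["sales"] > 100]

# Modifying: add a derived column in one line.
df["sales_k"] = df["sales"] / 1000

# Analysing: total sales per city.
totals = df.groupby("city")["sales"].sum()
```

Doing the same with a custom class would mean writing the filtering, column, and grouping logic yourself.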

Are there any good resources for the math behind machine learning?

Andrew Ng’s courses are often touted as a favourite.

Any tips/cheat sheets?

Check out this link on DataCamp for nifty cheat sheets.

Watch AMA #1 here for a quick recap. 
Then tune in on Wednesday, 20 November at 1PM for AMA #2!
