Statistical Thinking in Python (Part 1)
Do you have any questions relating to Statistical Thinking in Python (Part 1)? Leave them here!
From the "Distribution of no-hitters and cycles" exercise, I realized that the peak of the random variable t1 + t2 (both drawn from np.random.exponential) occurs where the individual t1 and t2 distributions intersect. Why is this so? tau1 and tau2 were chosen to be different (1000 and 500) to demonstrate the effect.
Image here: https://img.frl/4hods
Also, if tau1 = tau2, the peak occurs at a total waiting time of tau1 (or tau2). Why is this so?
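A minimal sketch of the simulation behind the question (variable names follow the exercise; the seed and bin count are my own choices). Setting the two exponential PDFs equal gives the same t as setting the derivative of the sum's PDF to zero, which is why the intersection point and the peak coincide:

```python
import numpy as np

np.random.seed(42)
tau1, tau2 = 1000, 500

# Simulate the total waiting time for both events, as in the exercise
t1 = np.random.exponential(tau1, size=100_000)
t2 = np.random.exponential(tau2, size=100_000)
total = t1 + t2

# Locate the histogram peak. Analytically, the mode of the sum is
# ln(tau1/tau2) / (1/tau2 - 1/tau1) ~= 693, which is exactly where
# the two individual exponential PDFs intersect.
counts, edges = np.histogram(total, bins=100)
i = np.argmax(counts)
peak_x = 0.5 * (edges[i] + edges[i + 1])
```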
This lesson also teaches us that we can simulate any story to get its probability density function.
If I wanted not just to observe natural processes but to design a density function with a target shape in mind (for example, an artificial game world), could I compose different basic component distributions, combining the mathematical expressions of PDFs with known values of their parameters, to achieve that? The lesson makes it seem like we won't know what the combined shape looks like until we simulate it.
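On the composition question: for independent components, the PDF of the sum is the convolution of the component PDFs, so the combined shape can be computed numerically (and sometimes in closed form) without simulating any draws. A sketch using the lesson's tau values (the grid spacing and range are my own choices):

```python
import numpy as np

tau1, tau2 = 1000, 500
dt = 1.0
t = np.arange(0, 10_000, dt)

# Component exponential PDFs evaluated on a common grid
pdf1 = np.exp(-t / tau1) / tau1
pdf2 = np.exp(-t / tau2) / tau2

# PDF of the sum of independent variables = convolution of their PDFs
pdf_sum = np.convolve(pdf1, pdf2)[: len(t)] * dt

# The combined shape is now known exactly (up to grid resolution),
# with no random sampling involved
peak = t[np.argmax(pdf_sum)]  # close to the analytic mode of ~693
```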
I don't understand the purpose of simulation. What do we do with it afterwards? If it is to calculate probabilities, don't the mathematical expressions of the PDFs/CDFs provide that directly, and give an even more precise answer than in the lesson, where he extends lines from the CDF plot and reads off rough values from the axes?
Assuming a distribution designer wants a peak at a certain x-value using the t1 + t2 model from the lesson, does simulation let the designer tweak the values of tau1 and tau2 until the peak moves to the desired x-value? (So it acts as a trial-and-error tool.)
Can I say that if this designer knows how to manipulate the math, he has no need for simulation at all when designing his function? For example, consider the distribution of the sum of two dice rolls: we don't need to simulate to know that it is symmetrical, triangular in shape, and ranges from 2 to 12.
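On the dice example: the exact PMF can indeed be obtained by enumeration rather than simulation, since all 36 outcomes are equally likely. A quick sketch:

```python
from collections import Counter
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair dice
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {total: n / 36 for total, n in sorted(counts.items())}

# Symmetric and triangular: range 2..12, peak at 7 with probability 6/36
```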
From "4. Thinking probabilistically -- Continuous variables", in the "Introduction to the Normal distribution" video, the tutor mentioned:
"To draw samples using np.random.normal, we need to provide parameters, the mean and std, to parameterise the (theoretical) normal distribution we are sampling from; the mean and std computed from the data are good estimates."
How did he conclude that np.mean(michelson_speed_of_light) and np.std(michelson_speed_of_light) are good estimates to use as inputs to np.random.normal?
Just from seeing that the michelson_speed_of_light histogram looks normal compared to the normal PDF in the slide "Comparing data to a Normal PDF"?
Or because there is otherwise no other source of information from which to calculate a mean/std for use in np.random.normal?
So what would be a 'bad estimate' here?
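One way to frame "good estimate": the sample mean and standard deviation are the maximum-likelihood estimates of a normal distribution's parameters, so absent any other information they are the natural inputs to np.random.normal. A hypothetical sketch (the synthetic array here merely stands in for the course's michelson_speed_of_light data, whose actual values are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the course's michelson_speed_of_light array
michelson_speed_of_light = rng.normal(299_850, 80, size=100)

# Sample mean/std: the maximum-likelihood parameter estimates for a normal model
mu = np.mean(michelson_speed_of_light)
sigma = np.std(michelson_speed_of_light)

# Draw from the fitted theoretical distribution; overlaying its histogram or
# ECDF on the data's is how the lesson checks the normal model visually
theoretical_samples = np.random.normal(mu, sigma, size=10_000)
```

A "bad estimate" would be any mean/std not taken from the data, e.g. an arbitrary guess, which would shift or stretch the theoretical curve away from the data's histogram.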