Lists and Stats

I apologize for being out of town to give a talk (unfortunately it will happen again). Luckily, we have Marc.

This assignment explores the final data type we will cover in class: lists. You all have done an amazing job learning how to program in such a short time. Hopefully, this assignment will help you solidify that knowledge. I think other students and professors in the department will be impressed with what you will accomplish by the end of this course.

You will have until next Thursday to complete this assignment. You are expected to come to class on Tuesday and work on this assignment unless you have already turned it in to Marc and he has OKed it (B+ or higher). Email your completed assignment, named yourlastnameList.py, to marctomlinson@mail.utexas.edu.

1. Write a function that takes a list as an argument and asks the user to enter the name of the person they would like to add to the list. The function should return the new list with the added name.

2. Write a function that takes a list as an argument and asks the user to enter the name of the person they would like to delete from the list. If the user types in a name that is not on the list, ask them to try again. The function should return the new list without the deleted name.
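
One possible shape for Questions 1 and 2 is sketched below, assuming Python 3. The names add_name and delete_name and the optional ask parameter are made up for this illustration; ask defaults to the built-in input and is only there so the functions can be tried without typing at the keyboard.

```python
def add_name(roster, ask=input):
    """Ask for a name and return a new list with that name added at the end."""
    name = ask("Name to add: ").strip()
    return roster + [name]  # builds a new list; the caller's list is unchanged


def delete_name(roster, ask=input):
    """Keep asking until the name is on the list, then return a copy without it."""
    name = ask("Name to delete: ").strip()
    while name not in roster:
        name = ask(name + " is not on the list. Try again: ").strip()
    updated = list(roster)  # copy so the original list is left alone
    updated.remove(name)    # removes the first matching entry only
    return updated
```

Calling add_name(roster) or delete_name(roster) with no second argument prompts at the keyboard as usual. Note that this sketch would ask forever if the list were empty, so a menu program may want to check for that before offering the delete option.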

3. Use the functions you wrote in Questions 1 and 2 to write a program that maintains a class roster. Present the user with a menu that asks whether the user would like to add a name to the roster or delete a name from it. Give a third option to finish. When finished, print out the final roster.

4. Write a function that takes two lists of numbers of the same length, subtracts each number in the second list from the number at the same position in the first list, and returns a third list of the differences. For example, question4([4,3,7],[5,2,5]) should return [-1,1,2].

5. Write a function that squares every number in a list and returns the new list. For example, question5([-1,1,2]) should return [1,1,4].

6. Write a function that adds up all the numbers in a list and returns the sum. For example, question6([1,1,4]) should return 6.

7. Use the functions from Questions 4-6 to write a function that calculates the sample variance of a list of numbers (the sum of the squared deviations from the mean, divided by n-1). For example, question7([1,2,6]) should return 7.
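
To show how Questions 4-6 can fit together inside Question 7, here is a minimal sketch, assuming Python 3 (where / always gives a decimal result). It is one way to do it, not the only one. The trick assumed here is to subtract a list that repeats the mean n times, so that question4 produces the deviations from the mean.

```python
def question4(first, second):
    """Subtract the second list from the first, position by position."""
    return [a - b for a, b in zip(first, second)]


def question5(numbers):
    """Square every number in the list."""
    return [x * x for x in numbers]


def question6(numbers):
    """Add up all the numbers in the list."""
    total = 0
    for x in numbers:
        total += x
    return total


def question7(numbers):
    """Sample variance: sum of squared deviations divided by n - 1."""
    n = len(numbers)
    mean = question6(numbers) / n
    deviations = question4(numbers, [mean] * n)  # each number minus the mean
    return question6(question5(deviations)) / (n - 1)


print(question7([1, 2, 6]))  # 7.0
```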

8. Write a function that takes a list of numbers, performs a one-sample t-test against a mean of zero, and returns the t value (the sample mean divided by its standard error). For example, question8([1,3,6]) should return approximately 2.29.
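
The arithmetic behind the example can be sketched as follows, assuming Python 3 and a test against a mean of zero. The helper sample_variance is a stand-in for your own Question 7 function and uses the built-in sum only to keep the sketch short.

```python
import math


def sample_variance(numbers):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(numbers)
    mean = sum(numbers) / n
    return sum((x - mean) ** 2 for x in numbers) / (n - 1)


def question8(numbers):
    """One-sample t value against zero: mean divided by its standard error."""
    n = len(numbers)
    mean = sum(numbers) / n
    standard_error = math.sqrt(sample_variance(numbers) / n)
    return mean / standard_error


print(round(question8([1, 3, 6]), 2))  # 2.29
```

For [1,3,6] the mean is about 3.33, the sample variance is about 6.33, and the standard error is the square root of 6.33/3, about 1.45, which gives a t value of about 2.29.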

9. Statistical tests do not give definitive answers. Sometimes they find a significant difference (a probability less than the alpha level) when there is no true difference, and other times they fail to find a difference when there is a true one. We are going to focus on the former case. Using an alpha level of .05 means that, when there is in fact no true difference, the test will still come out significant 5% of the time.

You are going to use your t-test function to verify what alpha means by simulation. You will test lists of numbers that are randomly sampled from a distribution that has mean 0. Therefore, the t-test should not show a significant difference from zero. We will consider random numbers drawn from three distributions. You will need to import the random module. To get one random number distributed normally with mean zero, use random.gauss(0,1). For an exponential distribution, use random.expovariate(1)-1. For a uniform distribution, use random.uniform(-1, 1).
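
As a small illustration of the three calls above, the sketch below builds a list of draws from one distribution, assuming Python 3. The function name draw_sample and the labels "normal", "exponential", and "uniform" are made up for this example.

```python
import random


def draw_sample(kind, size):
    """Return a list of `size` random numbers from a mean-zero distribution."""
    if kind == "normal":
        return [random.gauss(0, 1) for _ in range(size)]
    if kind == "exponential":
        # expovariate(1) has mean 1, so subtracting 1 shifts the mean to zero
        return [random.expovariate(1) - 1 for _ in range(size)]
    if kind == "uniform":
        return [random.uniform(-1, 1) for _ in range(size)]
    raise ValueError("unknown distribution: " + kind)


print(draw_sample("normal", 4))  # four numbers, different on every run
```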

When you sample from one of these distributions (say, draw four numbers), the mean of those four numbers will not be exactly zero. The t-test decides whether the sample mean is significantly different from zero. With an alpha level of .05, it won't always get it right.

For each distribution, draw a sample of four numbers and store it in a list. Send the list to your t-test function. If the t value is greater than 3.18 or less than -3.18, you got a significant result. Using a loop, do this 1000 times for each distribution and calculate the percentage of times that a significant result occurs. How do these percentages compare to the alpha level of .05? This might take a while to run, so first make sure your code works by running it just 10 times.
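
The loop described above might be organized like the sketch below, assuming Python 3. The names t_value, significant_fraction, and draws are made up for this example, and t_value is only a stand-in for your own Question 8 function (it leans on the statistics module, whose stdev divides by n-1).

```python
import math
import random
import statistics


def t_value(sample):
    """Stand-in for question8: sample mean divided by its standard error."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return statistics.mean(sample) / standard_error


def significant_fraction(draw, sample_size, critical_value, trials):
    """Fraction of trials whose t value falls beyond +/- critical_value."""
    hits = 0
    for _ in range(trials):
        sample = [draw() for _ in range(sample_size)]
        if abs(t_value(sample)) > critical_value:
            hits += 1
    return hits / trials


# One zero-argument function per distribution, each returning a single draw.
draws = {
    "normal": lambda: random.gauss(0, 1),
    "exponential": lambda: random.expovariate(1) - 1,
    "uniform": lambda: random.uniform(-1, 1),
}

for name, draw in draws.items():
    fraction = significant_fraction(draw, sample_size=4,
                                    critical_value=3.18, trials=1000)
    print(name, round(100 * fraction, 1), "percent significant")
```

Passing the sample size, critical value, and number of trials as arguments means a quick check with trials=10 needs no other changes to the code.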

Now, repeat the process with samples of size 20 instead of four. The critical value is now plus or minus 2.09. What are the results? How do they compare to sample size four? What do you think is going on?