Email-essay scenario: Impressed with your work from last week, your boss has tasked you with leading the data science team's effort on this project. Your first task is to make a plan for data collection. Your boss wants you to determine what data is necessary to answer the business questions and achieve the project’s objective, and has asked you to send a proposed plan as an email covering the *first 2 weeks* of data collection, including:
What datasets will be needed
Why these datasets? How does the information that they contain inform the decision or answer business questions?
Which datasets exist internally?
If any datasets don’t already exist, specify how they will be collected.
If the data collection you’re imagining for the project would take more than 2 weeks, restrict the email to what can be done in 2 weeks.
Use your knowledge of the cases and of how businesses work to imagine what likely already exists internally at Salesforce and Netflix. This week’s video “Delivering High Quality Analytics at Netflix” will give you a sense of what sorts of data exist at Netflix, and help you imagine what data may exist at Salesforce.
Specify the data that will be collected and not necessarily what will be calculated from it. For example, for Salesforce, don’t specify that the dataset is ratios of men to women; instead, specify that you want the HR dataset that contains all employee IDs/names and their gender.
Minimum 300 words
Minimum 1 reference, with in-line citations as appropriate