Data Analysis and Decision Modelling

    Assignment Requirements

     

    So basically I have the Excel chart and the example assignment from my friend. What I need you to do is take the Excel chart data and apply it to my friend's assignment. Don't copy it; use your own words and make it a new piece of work, but use my friend's assignment as a reference. And probably add more definitions to it.

    So to sum up: all the numbers/data you need should come from the Excel chart, and for the writing style I want you to follow my friend's assignment as a reference.

    I have only 12 hours, so please let me know now if there is anything you don't understand, because I'll be asleep in 2 hours (3am); it's 1am now.

    I will upload all the document for you.

    Introduction

    This report examines the relationship between a set of explanatory variables and the future change in Apple's share price, and uses that relationship in a regression model to predict future changes in the share market. The report proceeds by collecting the data, identifying the variables, and building a multiple regression model, which is then assessed using diagnostics such as the P-value, the variance inflation factor (VIF) and the F test.

     

    Original Data:

    Date Gold_x_AAPL_acc1 Silver_vel10_x_Baltic_dry_acc2 Oil_vel5_x_AAPL_vel7 Baltic_dry_vel5_x_AAPL_vel11 5year_vel4_x_SP500_vel12 Future_change
    12/11/2013 0.021007 0.039063 0.182188 0.103333 0.291806 0.2625
    11/11/2013 0.027378 0.064445 0.235156 0.253993 0.301476 0.316667
    8/11/2013 0.08724 0.015625 0.146979 0.234896 0.372431 0.395833
    7/11/2013 0.034236 0.012708 0.110677 0.360972 0.045833 0.658333
    6/11/2013 0.076563 0.197777 0.095 0.475694 0.331459 0.320833
    5/11/2013 0.054861 0.235989 0.011458 0.375 0.335313 0.325
    4/11/2013 0.125417 0.393021 0.014236 0.155694 0.473611 0.254167
    1/11/2013 0.079132 0.5875 0.038889 0.055087 0.516076 0.541667
    31/10/2013 0.076823 0.60783 0.201562 0.040399 0.539688 0.554167
    30/10/2013 0.209792 0.092083 0.196875 0.014844 0.437708 0.445833
    29/10/2013 0.068177 0.395416 0.382673 0.022083 0.455381 0.745833
    11/02/2013 0.416666 0.129167 0.444618 0.455381 0.119401 0.408333
    8/02/2013 0.628334 0.551597 0.189496 0.367362 0.189844 0.6
    7/02/2013 0.847084 0.477969 0.148541 0.015625 0.053437 0.916667
    6/02/2013 0.320833 0.364653 0.183768 0.012413 0.13125 0.929167
    5/02/2013 0.824913 0.322917 0.193958 0.004375 0.305278 0.3
    4/02/2013 0.075347 0.362465 0.232916 0.005208 0.195972 0.641667
    1/02/2013 0.496875 0.44625 0.008594 0.006094 0.385486 0.475
    31/01/2013 0.419966 0.30586 0.019201 0.022656 0.407057 0.820833

     

    The data used in this report were collected from Yahoo Finance (Yahoo finance, 2013), which publishes a wide range of statistics as an open resource. Five independent variables were selected from the Yahoo Finance database to analyse fluctuations in the share price and to predict its future change.

     

     

     

     

    Identification of Variables

    Y variable

    Variable: future change

    X variable

    Variable 1: Gold_x_AAPL_acc1

    This variable captures the relationship between gold and Apple.

     

    Variable 2: Silver_vel10_x_Baltic_dry_acc2

    This variable reflects the view that raw materials and the shipping of goods affect Apple's business operations.

     

    Variable 3: Oil_vel5_x_AAPL_vel7

    This variable reflects the link between oil prices and the cost of shipping goods worldwide.

     

    Variable 4: Baltic_dry_vel5_x_AAPL_vel11

    This variable helps in understanding the cost of shipping goods worldwide, such as delivering finished goods from factories to markets.

     

    Variable 5: 5year_vel4_x_SP500_vel12

    This variable compares the short-term share price with the S&P 500 to show how Apple is performing in the market.

     

    Check inputs for collinearity

    Collinearity is the undesirable situation in which the independent variables are strongly correlated with one another. Collinearity inflates the standard errors of the coefficients (Evans, 2013), so some variables may be found not to be significantly different from 0 when they would otherwise be significant.

    First, we check whether any of the inputs are collinear by using the PHStat Multiple Regression function to calculate the VIF. The variance inflation factor (VIF) measures how strongly one independent variable is explained by the other independent variables. When collinearity is present, the independent variables carry overlapping explanations, and the researcher cannot separate how much each one contributes to Y. The smaller the VIF, the less of a problem collinearity is. If the VIF for a variable is around or greater than 5, there is collinearity associated with that variable, and one of the variables involved must be removed from the regression model (Evans, 2013). In this case the VIFs of all five independent variables are less than 5, so collinearity is not a problem in this report.

    The output is shown below:

     

     

    Regression Analysis

    5year_vel4_x_SP500_vel12 and all other X
    Regression Statistics
    Multiple R 0.2601
    R Square 0.0677
    Adjusted R Square 0.0484
    Standard Error 0.2053
    Observations 199
    VIF 1.0726
    Regression Analysis
    Baltic_dry_vel5_x_AAPL_vel11 and all other X
    Regression Statistics
    Multiple R 0.4759
    R Square 0.2265
    Adjusted R Square 0.2105
    Standard Error 0.1770
    Observations 199
    VIF 1.2928

     

    Regression Analysis
    Silver_vel10_x_Baltic_dry_acc2 and all other X
    Regression Statistics
    Multiple R 0.3629
    R Square 0.1317
    Adjusted R Square 0.1138
    Standard Error 0.2020
    Observations 199
    VIF 1.1517

     

    Regression Analysis
    Oil_vel5_x_AAPL_vel7 and all other X
    Regression Statistics
    Multiple R 0.5462
    R Square 0.2984
    Adjusted R Square 0.2839
    Standard Error 0.1985
    Observations 199
    VIF 1.4253

     

    Regression Analysis
    Gold_x_AAPL_acc1 and all other X
    Regression Statistics
    Multiple R 0.1711
    R Square 0.0293
    Adjusted R Square 0.0093
    Standard Error 0.1758
    Observations 199
    VIF 1.0301

     

    The output tables show that every VIF is less than 5.
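    As a cross-check on the PHStat figures, the VIF of each input can be recomputed from the R Square of its auxiliary regression using the standard identity VIF = 1 / (1 - R Square). The sketch below is only an illustration: the R Square values are copied from the output above (rounded to four decimals, so the last digit of a VIF may differ slightly), and the names vif, aux_r_squared and VIF_THRESHOLD are made up for this example.

```python
# R Square values copied from the five auxiliary regressions reported above
# (each input regressed on the other four inputs).
aux_r_squared = {
    "5year_vel4_x_SP500_vel12": 0.0677,
    "Baltic_dry_vel5_x_AAPL_vel11": 0.2265,
    "Silver_vel10_x_Baltic_dry_acc2": 0.1317,
    "Oil_vel5_x_AAPL_vel7": 0.2984,
    "Gold_x_AAPL_acc1": 0.0293,
}

VIF_THRESHOLD = 5.0  # rule of thumb quoted in the text (Evans, 2013)


def vif(r_squared: float) -> float:
    """Variance inflation factor from one auxiliary-regression R Square."""
    return 1.0 / (1.0 - r_squared)


vifs = {name: vif(r2) for name, r2 in aux_r_squared.items()}
flagged = [name for name, value in vifs.items() if value >= VIF_THRESHOLD]

for name, value in vifs.items():
    print(f"{name}: VIF = {value:.4f}")
print("Inputs at or above the threshold:", flagged)
```

    With these inputs the recomputed values agree with the reported VIFs to about three decimal places, and no input is flagged.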

     

     

    Check residuals are normally distributed

    The next step is to check whether the residuals are normally distributed, which is a separate assumption of the regression model from the collinearity check. Using the Excel Data Analysis function we can produce the following graphs.

    In the last graph the points form a straight (or almost straight) line, which indicates that the residuals are approximately normally distributed, so our hypothesis tests can be relied upon.

     

    Check which inputs are helping.

    In this step, we check two important figures: the adjusted R square and the P-value.

    R square

    In a regression model, R square is the proportion of the variation in Y that is explained by the independent variables, so it indicates how accurately the Xs predict Y. Its value is the ratio of the regression sum of squares (SSR) to the total sum of squares (SST). Adjusted R square modifies R square to account for the number of independent variables and the sample size, so that it rises only when an added input genuinely improves the model.

    In the regression output the adjusted R square is 0.158405, which means the model explains only about 16% of the variation in Y, so the relationship between the Xs and Y is weak. Most of the variation in Y is therefore driven by factors outside the model. However, the relationship is still statistically significant.

    Regression Statistics
    Multiple R 0.42386
    R Square 0.179658
    Adjusted R Square 0.158405
    Standard Error 0.249028
    Observations 199

     

                                      Coefficients   Standard Error   t Stat      P-value      Lower 95%      Upper 95%
    Intercept                         0.497323086    0.041719076      11.92076    6.581E-25    0.415039232    0.579606941
    Gold_x_AAPL_acc1                  0.18629918     0.101725034      1.8314      0.06858258   -0.014336327   0.386934688
    Silver_vel10_x_Baltic_dry_acc2    0.377818335    0.088513082      4.268503    3.083E-05    0.203241179    0.552395491
    Oil_vel5_x_AAPL_vel7              0.108244958    0.090080908      1.201641    0.23097488   -0.069424471   0.285914387
    Baltic_dry_vel5_x_AAPL_vel11      -0.256465655   0.100999         -2.539289   0.01189634   -0.455669182   -0.05726213
    5year_vel4_x_SP500_vel12          -0.361791706   0.087098442      -4.153825   4.9072E-05   -0.533578722   -0.19000469

     

    In the output sheet, the adjusted R square is 0.158405. Next we check the P-value of each input. If an input has a P-value greater than 0.05, we should consider deleting that whole column from the original data. The table above shows two inputs with P-values greater than 0.05 (Gold_x_AAPL_acc1 and Oil_vel5_x_AAPL_vel7), so we go back to the original data, delete those two columns and see how the adjusted R square changes.
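    The screening rule described above can be written out as a short sketch. It uses only the coefficients, standard errors and P-values reported in the full-model output; the names ALPHA, full_model, t_stats and candidates are illustrative and not part of PHStat. It also recomputes each t Stat as coefficient divided by standard error, which is how that column is defined.

```python
ALPHA = 0.05  # significance level used in the text

# (coefficient, standard error, reported P-value) from the full-model output
full_model = {
    "Gold_x_AAPL_acc1": (0.18629918, 0.101725034, 0.06858258),
    "Silver_vel10_x_Baltic_dry_acc2": (0.377818335, 0.088513082, 3.083e-05),
    "Oil_vel5_x_AAPL_vel7": (0.108244958, 0.090080908, 0.23097488),
    "Baltic_dry_vel5_x_AAPL_vel11": (-0.256465655, 0.100999, 0.01189634),
    "5year_vel4_x_SP500_vel12": (-0.361791706, 0.087098442, 4.9072e-05),
}

# t Stat is the coefficient divided by its standard error.
t_stats = {name: coef / se for name, (coef, se, _) in full_model.items()}

# Inputs whose P-value exceeds ALPHA, listed from highest P-value down,
# i.e. in the order in which they would be considered for removal.
candidates = sorted(
    (name for name, (_, _, p) in full_model.items() if p > ALPHA),
    key=lambda name: full_model[name][2],
    reverse=True,
)

for name in full_model:
    print(f"{name}: t = {t_stats[name]:.4f}, p = {full_model[name][2]:.6g}")
print("Candidates for removal:", candidates)
```

    Sorting the candidates by P-value makes it easy to remove one input at a time, starting with the weakest, rather than dropping every flagged column at once.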

     

    Date Silver_vel10_x_Baltic_dry_acc2 Baltic_dry_vel5_x_AAPL_vel11 5year_vel4_x_SP500_vel12 Future_change
    12/11/2013 0.039063 0.103333 0.291806 0.2625
    11/11/2013 0.064445 0.253993 0.301476 0.316667
    8/11/2013 0.015625 0.234896 0.372431 0.395833
    7/11/2013 0.012708 0.360972 0.045833 0.658333
    6/11/2013 0.197777 0.475694 0.331459 0.320833
    5/11/2013 0.235989 0.375 0.335313 0.325
    4/11/2013 0.393021 0.155694 0.473611 0.254167
    1/11/2013 0.5875 0.055087 0.516076 0.541667
    31/10/2013 0.60783 0.040399 0.539688 0.554167
    30/10/2013 0.092083 0.014844 0.437708 0.445833
    29/10/2013 0.395416 0.022083 0.455381 0.745833
    11/02/2013 0.129167 0.455381 0.119401 0.408333
    8/02/2013 0.551597 0.367362 0.189844 0.6
    7/02/2013 0.477969 0.015625 0.053437 0.916667
    6/02/2013 0.364653 0.012413 0.13125 0.929167
    5/02/2013 0.322917 0.004375 0.305278 0.3
    4/02/2013 0.362465 0.005208 0.195972 0.641667
    1/02/2013 0.44625 0.006094 0.385486 0.475
    31/01/2013 0.30586 0.022656 0.407057 0.820833

     

    After deleting those two columns, the output is as follows:

    Regression Statistics
    Multiple R 0.399072
    R Square 0.159258
    Adjusted R Square 0.146324
    Standard Error 0.250809
    Observations 199

                                      Coefficients   Standard Error   t Stat         P-value       Lower 95%     Upper 95%
    Intercept                         0.53142673     0.038261011      13.8895108     6.05374E-31   0.455968204   0.606885246
    Silver_vel10_x_Baltic_dry_acc2    0.42743504     0.085345172      5.008309548    1.22576E-06   0.259116945   0.595753133
    Baltic_dry_vel5_x_AAPL_vel11      -0.206687      0.09149364       -2.2590309     0.024988049   -0.38713109   -0.02624283
    5year_vel4_x_SP500_vel12          -0.3276067     0.085414441      -3.83549547    0.000169117   -0.49606141   -0.159152

     

    After deleting the two columns, the adjusted R square falls from 0.158405 to 0.146324. Because the adjusted R square has become smaller, we restore the last column we deleted. Instead of deleting two columns from the original data, we delete only the column with the higher P-value (Oil_vel5_x_AAPL_vel7). The data then look as follows.

    Date Gold_x_AAPL_acc1 Silver_vel10_x_Baltic_dry_acc2 Baltic_dry_vel5_x_AAPL_vel11 5year_vel4_x_SP500_vel12 Future_change
    12/11/2013 0.021007 0.039063 0.103333 0.291806 0.2625
    11/11/2013 0.027378 0.064445 0.253993 0.301476 0.316667
    8/11/2013 0.08724 0.015625 0.234896 0.372431 0.395833
    7/11/2013 0.034236 0.012708 0.360972 0.045833 0.658333
    6/11/2013 0.076563 0.197777 0.475694 0.331459 0.320833
    5/11/2013 0.054861 0.235989 0.375 0.335313 0.325
    4/11/2013 0.125417 0.393021 0.155694 0.473611 0.254167
    1/11/2013 0.079132 0.5875 0.055087 0.516076 0.541667
    31/10/2013 0.076823 0.60783 0.040399 0.539688 0.554167
    30/10/2013 0.209792 0.092083 0.014844 0.437708 0.445833
    29/10/2013 0.068177 0.395416 0.022083 0.455381 0.745833
    11/02/2013 0.416666 0.129167 0.455381 0.119401 0.408333
    8/02/2013 0.628334 0.551597 0.367362 0.189844 0.6
    7/02/2013 0.847084 0.477969 0.015625 0.053437 0.916667
    6/02/2013 0.320833 0.364653 0.012413 0.13125 0.929167
    5/02/2013 0.824913 0.322917 0.004375 0.305278 0.3
    4/02/2013 0.075347 0.362465 0.005208 0.195972 0.641667
    1/02/2013 0.496875 0.44625 0.006094 0.385486 0.475
    31/01/2013 0.419966 0.30586 0.022656 0.407057 0.820833

     

    Regression Statistics
    Multiple R 0.416557415
    R Square 0.17352008
    Adjusted R Square 0.156479257
    Standard Error 0.249313166
    Observations 199

    Comparing the three adjusted R square values, deleting either two columns (0.146324) or one column (0.156479) gives a smaller result than the original model (0.158405). We therefore decide not to delete any columns and keep all five inputs.
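    The three adjusted R square values can be reproduced from the reported R Square figures with the usual formula, adjusted R square = 1 - (1 - R square)(n - 1)/(n - k - 1), where n is the number of observations and k the number of inputs. The following is a minimal sketch under that assumption; the function name adjusted_r_squared and the model labels are chosen for this example only.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_inputs: int) -> float:
    """Adjust R square for the number of inputs and the sample size."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_inputs - 1)


N_OBS = 199  # "Observations" in every regression output above

# label: (reported R Square, number of inputs kept in the model)
models = {
    "all five inputs": (0.179658, 5),
    "Oil removed": (0.17352008, 4),
    "Oil and Gold removed": (0.159258, 3),
}

adjusted = {
    label: adjusted_r_squared(r2, N_OBS, k) for label, (r2, k) in models.items()
}
best = max(adjusted, key=adjusted.get)

for label, value in adjusted.items():
    print(f"{label}: adjusted R square = {value:.6f}")
print("Highest adjusted R square:", best)
```

    The recomputed values match the Excel output to about six decimal places, and the model with all five inputs has the highest adjusted R square, which is consistent with the decision to keep every column.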

     

    Make a prediction with a confidence interval

    To predict the future change, we use PHStat's Confidence Interval Estimate & Prediction function with a 95% confidence level. The row of original data shown below is used to fill in the given values.

    Date Gold_x_AAPL_acc1 Silver_vel10_x_Baltic_dry_acc2 Oil_vel5_x_AAPL_vel7 Baltic_dry_vel5_x_AAPL_vel11 5year_vel4_x_SP500_vel12 Future_change
    11/11/2013 0.027378 0.064445 0.235156 0.253993 0.301476 0.316667

     

    Data
    Confidence Level 95%
    1
    Gold_x_AAPL_acc1 given value 0.027378
    Silver_vel10_x_Baltic_dry_acc2 given value 0.064445
    Oil_vel5_x_AAPL_vel7 given value 0.235156
    Baltic_dry_vel5_x_AAPL_vel11 given value 0.253993
    5year_vel4_x_SP500_vel12 given value 0.301476

     

    For Average Predicted Y (YHat)
    Interval Half Width 0.05934
    Confidence Interval Lower Limit 0.318675
    Confidence Interval Upper Limit 0.437354
       
    For Individual Response Y
    Interval Half Width 0.494738
    Prediction Interval Lower Limit -0.11672
    Prediction Interval Upper Limit 0.872753

    As the output above shows, the interval half width for an individual response Y is 0.49, almost 0.5, so the 95% prediction interval runs from -0.11672 to 0.872753. The interval is centred on a predicted future change of about 0.378, and the actual value for this date (0.316667) lies inside it. The prediction therefore gives predictors and investors a reference, but an interval this wide is only a rough guide and does not by itself encourage investment, because there are still risks that can affect the share price. The prediction may also not be accurate enough because there is a lack of related samples to analyse. Moreover, the share market carries plenty of uncertainties, so users of this report should not rely on data analysis alone and should think twice before investing.
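    The centre of both intervals is the point prediction from the fitted regression equation, that is, the intercept plus each coefficient multiplied by its given value. The sketch below recomputes it from the full-model coefficients and the given values reported above and compares it with the midpoint of the reported confidence interval. All variable names are illustrative; only the numbers come from the report.

```python
intercept = 0.497323086

# Full-model coefficients reported in the regression output.
coefficients = {
    "Gold_x_AAPL_acc1": 0.18629918,
    "Silver_vel10_x_Baltic_dry_acc2": 0.377818335,
    "Oil_vel5_x_AAPL_vel7": 0.108244958,
    "Baltic_dry_vel5_x_AAPL_vel11": -0.256465655,
    "5year_vel4_x_SP500_vel12": -0.361791706,
}

# Given values for 11/11/2013, as entered in the PHStat sheet.
given = {
    "Gold_x_AAPL_acc1": 0.027378,
    "Silver_vel10_x_Baltic_dry_acc2": 0.064445,
    "Oil_vel5_x_AAPL_vel7": 0.235156,
    "Baltic_dry_vel5_x_AAPL_vel11": 0.253993,
    "5year_vel4_x_SP500_vel12": 0.301476,
}

# Point prediction (YHat) from the regression equation.
y_hat = intercept + sum(coefficients[name] * given[name] for name in coefficients)

confidence_interval = (0.318675, 0.437354)   # for the average predicted Y
prediction_interval = (-0.11672, 0.872753)   # for an individual response Y
actual_future_change = 0.316667              # observed value for 11/11/2013

ci_midpoint = sum(confidence_interval) / 2
inside_prediction_interval = (
    prediction_interval[0] <= actual_future_change <= prediction_interval[1]
)

print(f"YHat = {y_hat:.6f}, confidence interval midpoint = {ci_midpoint:.6f}")
print("Actual value inside prediction interval:", inside_prediction_interval)
```

    The confidence interval describes the average future change for days with these input values, while the much wider prediction interval applies to a single day, which is why the latter is the relevant one for an individual forecast.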

     

    The Durbin-Watson test

    Durbin-Watson Calculations
    Sum of Squared Difference of Residuals 24.72287316
    Sum of Squared Residuals 11.96892238
    Durbin-Watson Statistic 2.065588896

     

    The Durbin-Watson test is used to check for serial correlation in the residuals. The Durbin-Watson statistic ranges from 0 to 4, and the acceptable range is from 1.50 to 2.50 (Evans, 2013). The output above shows that our Durbin-Watson statistic is 2.065588896, so the result is acceptable. If the Durbin-Watson statistic had not been in the acceptable range, we would add a caution to the findings about a violation of the regression assumptions.
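    The Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below shows that calculation as a small function and then reproduces the reported statistic from the two sums in the output above. The function name durbin_watson and the range constants are written for this illustration; the individual residuals are not listed in the report, so only the two reported sums are used.

```python
from typing import Sequence

ACCEPTABLE_RANGE = (1.5, 2.5)  # range quoted in the text (Evans, 2013)


def durbin_watson(residuals: Sequence[float]) -> float:
    """Durbin-Watson statistic for residuals listed in time order."""
    sum_sq_diff = sum(
        (curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:])
    )
    sum_sq_resid = sum(e ** 2 for e in residuals)
    return sum_sq_diff / sum_sq_resid


# Reproduce the reported statistic from the two sums in the output.
sum_sq_diff_reported = 24.72287316
sum_sq_resid_reported = 11.96892238
dw_reported = sum_sq_diff_reported / sum_sq_resid_reported

is_acceptable = ACCEPTABLE_RANGE[0] <= dw_reported <= ACCEPTABLE_RANGE[1]

print(f"Durbin-Watson statistic = {dw_reported:.9f}")
print("Within the acceptable range:", is_acceptable)
```

    A value close to 2, as here, suggests little serial correlation; values towards 0 point to positive and values towards 4 to negative serial correlation.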

     

     

    Conclusion

    To sum up, the regression analysis shows that the five independent variables have only a limited influence on the future change. We initially assumed that gold and silver are both important raw materials for Apple's products and also play a financial role in helping the company balance its revenue and make up deficits arising from international exchange rates. However, these two materials are not as closely related to the share price as expected. This may be because neither raw material is a key element of the products, and because Apple's finance department may focus on investments other than the gold and silver markets to reduce risk. Oil and the Baltic Dry Index offer another point of view for forecasting Apple's future change. Unfortunately, these two factors have no strong relationship with Apple's share price either, possibly because of different shipping and investment strategies adopted to avoid risk. In addition, we compared the S&P 500 with Apple's business performance and the predicted future change of its share price. The results show that Apple's share price has no strong relationship with the S&P 500 as well. As a result, Apple's business performance appears to be decoupled from the S&P 500. In other words, Apple's business operates more independently.

    References

     

    Evans, J. R. (2013). Statistics, Data Analysis and Decision Modeling (5th ed.). Harlow, England: Pearson Education Limited.

    Yahoo Finance. (2014). Apple Profile. Retrieved January 04, from http://finance.yahoo.com/q/pr?s=aapl

     
