2/20/2017

Depth shifting and finding poor fitting results

Today, I plotted the depth-shifted logging data and picked out the depths with poor fitting results.

Summary:

First, I searched online for an explanation of how R2 can be negative.
R2 compares the fit of the chosen model with that of a horizontal straight line through the mean of the data (the null hypothesis). If the chosen model fits worse than that horizontal line, R2 is negative. Note that R2 is not always the square of anything, so it can be negative without violating any rules of math. R2 is negative only when the chosen model does not follow the trend of the data and therefore fits worse than a horizontal line.
A negative R2 is not a mathematical impossibility or the sign of a computer bug. It simply means that the chosen model (with its constraints) fits the data really poorly.
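To make this concrete, here is a minimal sketch, assuming the usual definition R2 = 1 - SS_res / SS_tot (the post does not say which formula my fitting code uses). The data values are made up for illustration only.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: misfit of the chosen model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: misfit of a horizontal line at the mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up data with a rising trend (not real logging data).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
good_fit = [1.1, 1.9, 3.2, 3.8, 5.0]      # follows the trend
horizontal = [3.0] * 5                     # horizontal line at the mean
wrong_trend = [5.0, 4.0, 3.0, 2.0, 1.0]    # goes against the trend

r2_good = r_squared(observed, good_fit)
r2_flat = r_squared(observed, horizontal)
r2_bad = r_squared(observed, wrong_trend)
print(r2_good, r2_flat, r2_bad)
```

The horizontal line gives an R2 of exactly 0, and the model that runs against the trend gives a negative value because its residuals are larger than those of the horizontal line.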

Second, I made sure that all the logging data I use are depth shifted.

Third, I found the depths whose fitting R2 is lower than 0.7 and lower than 0.
There are 154 depths whose R2 is lower than 0.7 and 26 depths whose R2 is lower than 0.
The first plot shows the first 16 depths with R2 lower than 0.7; the second and third plots show the depths with R2 lower than 0.
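The selection step could be sketched roughly as below. This is only an illustration: the dictionary `r2_by_depth`, the helper `depths_below`, and all the numbers in it are hypothetical, not my real data or code.

```python
def depths_below(r2_by_depth, threshold):
    """Return the depths whose R2 is strictly below the threshold, sorted by depth."""
    return sorted(d for d, r2 in r2_by_depth.items() if r2 < threshold)


# Hypothetical depth -> R2 mapping (illustrative values only).
r2_by_depth = {
    1000.0: 0.95,
    1000.5: 0.62,
    1001.0: -0.40,
    1001.5: 0.71,
    1002.0: -0.05,
}

below_07 = depths_below(r2_by_depth, 0.7)   # poor fits
below_0 = depths_below(r2_by_depth, 0.0)    # fits worse than a horizontal line
worst_depth = min(r2_by_depth, key=r2_by_depth.get)  # depth with the smallest R2

print(len(below_07), len(below_0), worst_depth)
```

Every depth with R2 below 0 is also counted among the depths below 0.7, so the second list is always a subset of the first.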

Although the R2 thresholds of 0.7 and 0 are very different, I think the fitting plots look similar. In both groups, the fit cannot follow the declining part at the start of the curve or the ascending part at the end.

In the fifth-from-last small picture, the fit does not match any of the peaks, which pointed to bugs in my code. After fixing the bugs, there are 151 depths whose R2 is lower than 0.7 and 25 depths whose R2 is lower than 0.
The first plot shows the first 25 depths with R2 lower than 0.7, and the second shows the depths with R2 lower than 0.

The depth with the smallest R2 (about -0.4) is now removed automatically.

After several rounds of adjustment, the percentages of each category are as follows (3 means 3 or more).

Tomorrow, I will start working with these data in R.

8 comments:

  1. For cases with a sharp increase after bin 55, can you write an algorithm that calculates R2 only for bins 1 to 55 and ignores the bins above 55? This will improve your R2 and reduce the effect of these artifacts on the measurements.

  2. For cases with a sharp increase before bin 10, can you write an algorithm that calculates R2 only for bins 11 to 64 and ignores bins 10 and below? This will improve your R2 and reduce the effect of these artifacts on the measurements.

    Replies
    1. OK, I will consider the first suggestion and this one at the same time.

  3. Ensure these new changes to your algorithm do not affect your previous results for the good cases.

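The suggestions in the comments (calculate R2 only over a window of bins, and check that good cases are unchanged) could be sketched as follows. This is a rough illustration under assumptions: it uses the usual definition R2 = 1 - SS_res / SS_tot, the function names `r_squared` and `windowed_r_squared` are hypothetical, and the 64-bin arrays are made-up data, not real measurements.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def windowed_r_squared(observed, predicted, first_bin=1, last_bin=None):
    """R2 over bins first_bin..last_bin only (1-based, inclusive)."""
    if last_bin is None:
        last_bin = len(observed)
    window = slice(first_bin - 1, last_bin)  # convert to 0-based slicing
    return r_squared(observed[window], predicted[window])


# Made-up 64-bin model curve.
predicted = [float(b) for b in range(1, 65)]

# Case 1: artificial sharp increase after bin 55.
late_spike = [float(b) for b in range(1, 56)] + [200.0] * 9
r2_full_late = r_squared(late_spike, predicted)
r2_bins_1_55 = windowed_r_squared(late_spike, predicted, 1, 55)

# Case 2: artificial sharp increase in bins 1 to 10.
early_spike = [100.0] * 10 + [float(b) for b in range(11, 65)]
r2_full_early = r_squared(early_spike, predicted)
r2_bins_11_64 = windowed_r_squared(early_spike, predicted, 11, 64)

# Good case: with the default window, the result equals the full R2.
r2_default_window = windowed_r_squared(late_spike, predicted)

print(r2_full_late, r2_bins_1_55, r2_full_early, r2_bins_11_64)
```

Leaving the default window as the full range is one way to keep previous results for good cases unchanged, since the window is applied only where an artifact has been identified.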