5/08/2018

finish the NMR paper

Today, I finished the NMR paper.

Summary:

I sent it to you by email.

I ran three input-selection tests today:
1. Deleted the 10 inversion logs: R2 dropped to 0.7521.
2. Deleted the 12 conventional logs: R2 dropped to 0.7889.
3. Deleted the AT90 and VPVS logs: R2 dropped to 0.8423.
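
A minimal sketch of how such a deletion test can be scripted is given here. It is not the code used for the paper: a linear least-squares fit stands in for the ANN, the data are synthetic, and the names `fit_predict`, `ablation_r2`, and the column indices are hypothetical. It only shows the procedure: drop a set of input columns, retrain on the 85% training part, and compare R2 on the 15% testing part.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: R2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_predict(X_train, y_train, X_test):
    """Stand-in model (linear least squares with intercept), NOT the ANN."""
    A_train = np.column_stack([X_train, np.ones(len(X_train))])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    A_test = np.column_stack([X_test, np.ones(len(X_test))])
    return A_test @ coef

def ablation_r2(X, y, drop_cols, n_train):
    """Retrain without the columns in drop_cols and return the test R2."""
    dropped = set(drop_cols)
    keep = [j for j in range(X.shape[1]) if j not in dropped]
    X_kept = X[:, keep]
    y_pred = fit_predict(X_kept[:n_train], y[:n_train], X_kept[n_train:])
    return r2_score(y[n_train:], y_pred)

# Synthetic data (assumption): columns 0-2 are informative, 3-5 are not.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 1.0 * X[:, 2] + 0.1 * rng.normal(size=400)
n_train = int(0.85 * len(X))  # fixed 85% / 15% split

r2_full = ablation_r2(X, y, [], n_train)
r2_no_strong = ablation_r2(X, y, [0, 1], n_train)  # delete important inputs
r2_no_weak = ablation_r2(X, y, [5], n_train)       # delete an unimportant input
print(r2_full, r2_no_strong, r2_no_weak)
```

In this toy setting, deleting the important columns lowers R2 clearly, while deleting an unimportant column leaves it almost unchanged, which is the same pattern as in the three tests above.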

In the first two cases the accuracy drops noticeably, so I recommend not deleting either group. In the third case the two least important inputs are deleted and the accuracy stays close to the original. However, since we have already analyzed the importance of the inputs in the paper, I think there is no need to delete them in this paper. I remember that we discussed this before. First, there is a big accuracy drop without either the conventional logs or the inversion logs. Second, since we have all 22 of these logs (excluding flags), it is better to use them together to predict the NMR T2 distribution. There is no need to delete 2 or 3 of them just to keep similar accuracy.
I did not use k-fold cross-validation because the data had already been split into 85% training data and 15% testing data before the models were trained and tested. Instead, I used parallel computing for dividing the data and training the models. The accuracy was almost the same, so I did not change the results and figures in the paper.
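
For comparison with the fixed 85%/15% split, the index bookkeeping behind k-fold cross-validation can be sketched as below. This is only an illustration under assumptions: `k_fold_indices` is a hypothetical helper, and the sample count and `k=5` are arbitrary. Each sample is used for testing exactly once and for training in the other k-1 folds.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Return k (train_idx, test_idx) pairs covering all samples."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Slicing with step k gives k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits

splits = k_fold_indices(n_samples=100, k=5)
for fold_no, (train_idx, test_idx) in enumerate(splits):
    # A model would be trained on train_idx and scored on test_idx here.
    print(fold_no, len(train_idx), len(test_idx))
```

The k test scores would then be averaged to get one accuracy estimate, instead of relying on a single testing set.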

The revised paper is in the email.


NMR sequential prediction


  1. Column d is the prediction of every group separately, each in one ANN model.
  2. Column e is the sequential prediction.
Column a is the accuracy of every group when the 64 bins are predicted together, with normalization using the maximum and minimum values of the 64 bins. Column b is the ranking result. Column c is the number of bins in every group.
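
One possible way to compute a per-group score with a normalization shared by all 64 bins is sketched below. The metric is not named above, so this sketch assumes a normalized RMSE (RMSE divided by the max-min range over all 64 bins). The function name `group_nrmse`, the equal group sizes, and the synthetic data are all assumptions for illustration only.

```python
import numpy as np

def group_nrmse(y_true, y_pred, group_sizes):
    """Per-group RMSE divided by the global (max - min) of all bins."""
    if sum(group_sizes) != y_true.shape[1]:
        raise ValueError("group sizes must add up to the number of bins")
    # One normalization constant for every group, taken over all bins.
    value_range = y_true.max() - y_true.min()
    scores = []
    start = 0
    for size in group_sizes:
        stop = start + size
        err = y_true[:, start:stop] - y_pred[:, start:stop]
        scores.append(np.sqrt(np.mean(err ** 2)) / value_range)
        start = stop
    return scores

# Synthetic example (assumption): 50 depths, 64 bins, 8 groups of 8 bins.
rng = np.random.default_rng(0)
y_true = rng.random((50, 64))
y_pred = y_true.copy()
y_pred[:, :8] += 0.1          # put an error in the first group only
group_sizes = [8] * 8         # hypothetical; the real groups may be unequal
scores = group_nrmse(y_true, y_pred, group_sizes)
print(scores)
```

Using one range for all groups keeps the scores of different groups on the same scale, so they can be ranked against each other.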
From the results, we can see that the two methods we discussed yesterday do not perform better than predicting the 64 bins together. Possible reasons:
  1. The 64 bins together represent the pore size distribution.
  2. A single group represents only some pore sizes, which cannot be closely correlated to the input logs. The concentration of pore sizes differs from depth to depth.
  3. What matters is the dominant pore size at each depth, which differs from depth to depth.
I think sequential prediction works differently for NMR and DD logs. The 8 DD logs are largely independent of each other at different frequencies, each representing the conductivity or permittivity of the reservoir. However, the 64 bins and 8 groups at every depth together represent the pore size distribution, which cannot be divided. I think that is why the two methods cannot predict the NMR T2 distribution separately with high accuracy.
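
To make the idea of sequential prediction concrete, here is a rough sketch under one assumed reading: the groups are predicted one after another, and each group's prediction is appended to the inputs for the next group. The exact scheme is not spelled out above, so this is an assumption. A linear least-squares fit again stands in for the ANN, and all names, sizes, and data are hypothetical.

```python
import numpy as np

def fit_linear(X, Y):
    """Least-squares fit with intercept (stand-in for an ANN)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return coef

def predict_linear(coef, X):
    A = np.column_stack([X, np.ones(len(X))])
    return A @ coef

def sequential_fit_predict(X_train, Y_train, X_test, group_sizes):
    """Predict groups in order, feeding earlier predictions forward."""
    train_feats, test_feats = X_train, X_test
    preds = []
    start = 0
    for size in group_sizes:
        stop = start + size
        coef = fit_linear(train_feats, Y_train[:, start:stop])
        train_pred = predict_linear(coef, train_feats)
        test_pred = predict_linear(coef, test_feats)
        preds.append(test_pred)
        # Earlier groups' predictions become extra inputs for later groups.
        train_feats = np.column_stack([train_feats, train_pred])
        test_feats = np.column_stack([test_feats, test_pred])
        start = stop
    return np.column_stack(preds)

# Synthetic data (assumption): 22 inputs, 64 bins, 8 groups of 8 bins.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 22))
Y = X @ rng.normal(size=(22, 64)) + 0.05 * rng.normal(size=(200, 64))
Y_pred = sequential_fit_predict(X[:170], Y[:170], X[170:], [8] * 8)
print(Y_pred.shape)
```

In such a chain, an error in an early group is passed on to every later group, which is one more reason a joint prediction of all 64 bins can do better when the bins are not independent.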

Tomorrow, I will select inputs and do k-fold cross-validation.