How good is the BankScope database? A cross-validation exercise with correction factors for market concentration measures

BIS Working Papers | No 133 | 01 September 2003
The paper examines the quality of the BankScope database by comparing results based on it with those obtained from the population-level data for India disseminated by the Reserve Bank of India. Although coverage is good and reporting errors in the individual reported units are minor, strong evidence of selectivity bias in the BankScope data for India is found. A major source of this bias is the almost total omission of Regional Rural Banks and Foreign Banks. It is shown that the selectivity bias affects estimates of all summary statistical measures and could lead users of the data to conclude that the Indian banking market is unimodal when, in reality, it is segmented and has a bimodal pattern. Kolmogorov-Smirnov tests reveal that neither the distribution of the log of total assets nor that of market shares based on the BankScope data can be treated as the same as the corresponding population distribution for India. Despite these limitations, the paper shows that a few popularly used market concentration measures can be estimated accurately from BankScope data, provided the coverage ratio with respect to the size variable is known from alternative sources and is adequate. Coverage of about 90% with respect to the size variable is found to be sufficient for approximating the population Herfindahl-Hirschman Index (HHI). For k-bank concentration measures, accurate estimates can be obtained if, in addition, the top k banks in the population are also present in the sample. In contrast, for entropy measures, the results indicate that adequate coverage with respect to both the size variable and the number of financial entities would be required.
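To make the role of the coverage ratio concrete, the following is a minimal sketch in Python of the three families of concentration measures named above, applied to a population and to a sample that omits many small entities. It is an illustration only, not the paper's own correction factors: the bank sizes are invented, the function names are hypothetical, and the HHI adjustment shown is simple share arithmetic (sample shares rescaled by the coverage ratio, with the omitted mass bounding the error), assumed here for exposition.

```python
import math


def shares(sizes):
    """Market shares of each entity relative to the total of `sizes`."""
    total = sum(sizes)
    return [x / total for x in sizes]


def hhi(sizes):
    """Herfindahl-Hirschman Index: sum of squared market shares."""
    return sum(s * s for s in shares(sizes))


def concentration_ratio(sizes, k):
    """k-bank concentration ratio: combined share of the k largest entities."""
    return sum(sorted(shares(sizes), reverse=True)[:k])


def entropy(sizes):
    """Entropy measure: -sum(s * ln s) over positive market shares."""
    return -sum(s * math.log(s) for s in shares(sizes) if s > 0)


def hhi_bounds_from_sample(sample_sizes, coverage):
    """Bounds on the population HHI given sample data and the coverage ratio.

    Rescaling sample shares by `coverage` gives each sampled bank's share of
    the population. The omitted banks hold a combined share of 1 - coverage,
    so they add between 0 and (1 - coverage)**2 to the index.
    """
    scaled = coverage ** 2 * hhi(sample_sizes)
    return scaled, scaled + (1.0 - coverage) ** 2


def cr_k_from_sample(sample_sizes, k, coverage):
    """Population k-bank ratio, valid only if the sample holds the top k banks."""
    return coverage * concentration_ratio(sample_sizes, k)


# Hypothetical data: four large banks plus twenty small ones.
population = [400, 250, 150, 100] + [5] * 20
# Hypothetical sample that omits every small bank (selectivity bias).
sample = [400, 250, 150, 100]
# Coverage ratio with respect to the size variable (assumed known externally).
coverage = sum(sample) / sum(population)

lower, upper = hhi_bounds_from_sample(sample, coverage)
print("coverage ratio:", coverage)
print("population HHI:", hhi(population), "bounds from sample:", (lower, upper))
print("population CR3:", concentration_ratio(population, 3),
      "from sample:", cr_k_from_sample(sample, 3, coverage))
print("population entropy:", entropy(population),
      "sample entropy:", entropy(sample))
```

In this toy setting the sample covers 90% of total size, the width of the HHI bounds is (1 - 0.9)^2 = 0.01, and the rescaled k-bank ratio matches the population value because the top three banks are in the sample. The entropy figures differ markedly because the measure also depends on how many entities are observed, which is consistent with the abstract's point that entropy needs adequate coverage of the number of entities as well as of size.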