Scientific journal

ISSN 1814-2400

INFORMATION SCIENCE AND CONTROL SYSTEMS

Grigor’ev Yu. A., Ermakov E. Y., Proletarskaya V. A.

EXPERIMENTAL VERIFICATION OF THE EFFICIENCY OF A METHOD FOR ACCESSING STORED DATA ON THE SPARK PLATFORM USING A CASCADING BLOOM FILTER

Two methods of accessing stored data were compared using the TPC-H Q3 query: the developed method, which uses a cascading Bloom filter, and a baseline method that uses no Bloom filter. To this end, full-scale experiments were conducted on an 8-node cluster running the Apache Spark parallel computing platform. The experimental results confirmed the advantages of the developed method with the cascading Bloom filter.

Keywords: Spark SQL, Bloom filter, TPC-H, Q3 query, comparison of methods
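
To illustrate the general idea behind filtering a Q3-style join with Bloom filters, here is a minimal single-machine sketch in plain Python. It is not the authors' Spark implementation. It assumes one plausible reading of "cascading": a filter built from the qualifying customer keys prunes the orders, and a second filter built from the surviving order keys prunes the line items before the exact join. The names `BloomFilter` and `q3_revenue`, the filter sizes, and the toy rows are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Simple Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Sizes are illustrative assumptions, not tuned values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def q3_revenue(customers, orders, lineitems, use_filters=True,
               segment="BUILDING", cutoff="1995-03-15"):
    """Q3-style revenue per order key, plus the number of line items
    that reach the exact join (a rough proxy for shuffled data volume).

    customers: (custkey, mktsegment)
    orders:    (orderkey, custkey, orderdate)   dates as ISO strings
    lineitems: (orderkey, extendedprice, discount, shipdate)
    """
    # Stage 1: Bloom filter over customers in the requested segment.
    cust_keys = {ck for ck, seg in customers if seg == segment}
    cust_bf = BloomFilter()
    for ck in cust_keys:
        cust_bf.add(ck)

    # Stage 2: prune orders with the customer filter, then build the
    # next filter in the cascade from the surviving order keys.
    kept_orders = [(ok, ck) for ok, ck, odate in orders
                   if odate < cutoff
                   and (not use_filters or cust_bf.might_contain(ck))]
    order_bf = BloomFilter()
    for ok, _ in kept_orders:
        order_bf.add(ok)

    # Stage 3: prune line items with the order filter.
    kept_items = [(ok, price, disc) for ok, price, disc, sdate in lineitems
                  if sdate > cutoff
                  and (not use_filters or order_bf.might_contain(ok))]

    # Exact join on the pruned inputs removes any Bloom false positives.
    valid_orders = {ok for ok, ck in kept_orders if ck in cust_keys}
    revenue = {}
    for ok, price, disc in kept_items:
        if ok in valid_orders:
            revenue[ok] = revenue.get(ok, 0.0) + price * (1 - disc)
    return revenue, len(kept_items)


# Toy rows (hypothetical, not TPC-H data).
customers = [(1, "BUILDING"), (2, "AUTOMOBILE"), (3, "BUILDING")]
orders = [(10, 1, "1995-03-01"), (11, 2, "1995-03-02"),
          (12, 3, "1995-04-01"), (13, 3, "1995-02-10")]
lineitems = [(10, 100.0, 0.5, "1995-03-20"), (10, 200.0, 0.0, "1995-03-10"),
             (11, 50.0, 0.0, "1995-03-25"), (13, 80.0, 0.25, "1995-03-16"),
             (12, 10.0, 0.0, "1995-05-01")]

rev_bf, n_bf = q3_revenue(customers, orders, lineitems, use_filters=True)
rev_plain, n_plain = q3_revenue(customers, orders, lineitems, use_filters=False)
```

Because a Bloom filter never produces false negatives, both calls return the same revenue; the filtered run can only pass the same number of line items to the final join or fewer. In a distributed setting the expected benefit of such pruning is less data moved between nodes before the join, which is the kind of effect the experiments described above measure.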