A Systematic Benchmarking on the Performance of Spark-SQL for Processing Vast RDF Datasets
This project is maintained by DataSystemsGroupUT
The following figures compare the partitioning techniques (horizontal, subject-based, and predicate-based) on the 100M, 250M, and 500M triple datasets, respectively (excluding Hive).
#### 500M Triples Partitioning Techniques Ranking Scores