Benchmarking of Databases for Big Data Exploration in the Social Graph Analysis

Authors

Krzysztof Węcel Poznań University of Economics and Business, Poland
Bartosz Perkowski Poznań University of Economics and Business, Poland
Agata Filipowska Poznań University of Economics and Business, Poland
Dawid Węckowski Agora S.A.
Piotr Zwolenkiewicz Valeant Pharmaceuticals

DOI:

https://doi.org/10.18559/SOEP.2017.12.1

Keywords:

Benchmarking, LBG model, Data analysis, Databases, Big Data, High-performance systems

Abstract

In this article we present the results of benchmarking various database systems that are potentially useful for analyzing large social networks. Using data on communication events between 7 million users, we compared the following solutions: MySQL, Neo4J, Titan, and Virtuoso. We proposed sample queries corresponding to real scenarios derived from an analysis of the data owner's requirements. Response times were measured in three groups of queries: basic, aggregation, and network queries. MySQL performed best on queries requiring computation, while Virtuoso dominated on network queries.

Published

31-12-2017

Issue

Vol. 5 No. 12 (2017)

Section

Articles

License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license.

How to cite

Węcel, Krzysztof, Bartosz Perkowski, Agata Filipowska, Dawid Węckowski, and Piotr Zwolenkiewicz. 2017. “Benchmarking of Databases for Big Data Exploration in the Social Graph Analysis”. Czasopismo DEMO 5 (12): 5-21. https://doi.org/10.18559/SOEP.2017.12.1.
