University of Belgrade, Faculty of Organizational Sciences

Department for e-business

Big data infrastructure and services

Study program: Information systems and management
Study group: E-business
Course status: Elective
Teachers: Božidar Lj. Radenković, Marijana S. Despotović-Zrakić, Zorica Bogdanović, Dušan Barać, Aleksandra Labus, Živko Bojović

Course Content

Methodology of scientific research in the field of big data infrastructure and services. Role of big data infrastructure and services in modern e-business. Big data infrastructure as a part of cloud infrastructure. Hadoop ecosystem. Data storage in cloud infrastructure. Analysis of distributed file systems. HDFS. Data structuring using HBase. NoSQL databases. Comparative analysis of big data databases: Google Bigtable, Amazon Dynamo, MongoDB, Cassandra. Distributed processing of data in cloud infrastructure. MapReduce. Parallel programming and MapReduce. Distributed execution of complex MapReduce jobs. Prioritizing and tracking execution of MapReduce jobs in real time. Abstraction of Hadoop MapReduce jobs using Pig. Resource management within big data infrastructure. Load balancing. Techniques and algorithms for big data analysis in e-business. Handling data manipulation problems in e-business: parallel sorting, Internet search, social network analysis, e-mail analysis. Integration of heterogeneous log files using Flume. Importing and exporting relational data using Sqoop. Ad hoc queries using Hive. Hive file formats. Executing queries using HiveQL. Real-time queries using Impala. Analysis of unstructured data in e-business applications: pattern detection, online market segmentation, customer behaviour analysis, social network analysis, implementation of real-time recommender systems, personalized online advertising. Apache Storm framework for handling data in real time. Advanced data analysis using Mahout. Data visualization. Design and implementation of big data solutions. Deployment of solutions on production clusters. Performance optimization. Managing data and computing nodes. Analysis of research on the connection between cloud computing, Internet of Things and big data technologies. Analysis of the latest research results in the field of big data, with a review of the most significant references.

Aims

The aim of the course is to enable students to conduct independent scientific research, model new solutions, and solve current problems in implementing big data infrastructure and services in e-business.

Literature

  1. E-resources from www.elab.rs
  2. M. Despotović-Zrakić, V. Milutinović, A. Belić (Eds.), High performance and cloud computing in scientific research and education, IGI Global, 2014.
  3. B. Radenković, M. Despotović-Zrakić, Z. Bogdanović, D. Barać, A. Labus, Elektronsko poslovanje, FON, 2015.
  4. T. White, Hadoop: The Definitive Guide, O’Reilly Media, 2009.
  5. J. Dean, S. Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, OSDI’04: Sixth Symposium on Operating Systems Design and Implementation, San Francisco, CA, December 2004. http://research.google.com/archive/mapreduce.html
  6. M. Minelli, M. Chambers, A. Dhiraj, Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today’s Businesses, Wiley, 2013.