UE19CS322 - Big Data

ml
systems
cs
sem5
Author

Vibha Masti

Published

December 1, 2021

Course instructor: Prof. K. V. Subramaniam


Syllabus and Class Notes

| Unit | Syllabus | Vibha’s Notes |
| --- | --- | --- |
| Unit 1 | Big data definition, MapReduce, Storage (HDFS) | BD Unit 1.pdf |
| Unit 2 | Compute and Storage, Hadoop ecosystem, PageRank, matrix multiplication, Hive | BD Unit 2.pdf |
| Unit 3 | In-memory computation, Scala/PySpark programming model, Transformations and Actions, SQL, Spark architecture, RDD, DF, wide and narrow dependencies, complexity of BD algorithms | BD Unit 3.pdf |
| Unit 4 | Streaming analysis, Kafka, Bloom filters | BD Unit 4.pdf |
| Unit 5 | Advanced analytics on big data, clustering algorithms, collaborative filtering, scaling NNs for big data | BD Unit 5.pdf |