Posts

Showing posts from March, 2022

MongoDB and employees & departments

  I know that a lot of you took your first Oracle steps with the emp/dept example tables, and that a lot of DBAs and developers often came back to these examples to test more complex queries. Even if you never played with the Oracle emp/dept examples, I keep these examples simple enough to follow. (Anyway, you can find the examples to play with here: https://jastrebicdragutin.wordpress.com/2020/12/16/emp-and-dept-table-queries/ )

  So let’s take the same steps with MongoDB and JSON documents. Since MongoDB does not require anything about schemas up front, it will implicitly create one when you ask it to use it.

    use scott

  MongoDB has its collections, which correspond to RDBMS tables, and its documents, which correspond to rows. Not all documents in a collection need to have the same structure, as we will see later. And you don’t need to explicitly create a collection; it is implicitly created when you insert the first document into it.

    db.emp.insert({empno: 1, ename: "Bob", sal: 100000})
    WriteResult({

Spark Optimize

  You can refer to the advanced topics here: Optimization Techniques, Joins, Internal Working, Delta Lake, Spark 3.0

  What is Spark?
  Apache Spark is a cluster computing platform designed to be fast and general-purpose. At its core, Spark is a “computational engine” that is responsible for scheduling, distributing, and monitoring applications consisting of many computational tasks across many worker machines, that is, a computing cluster.

  What is Spark Core?
  Spark Core contains the basic functionality of Spark, including components for task scheduling, memory management, fault recovery, interacting with storage systems, and more. Spark Core is also home to the API that defines resilient distributed datasets (RDDs), which are Spark’s main programming abstraction. An RDD represents a collection of items distributed across many compute nodes that can be manipulated in parallel. Spark Core provides many APIs for building and manipulating these collections.

  Key features of Spark - Spark can run over mul