Pinjari Akbar – Medium

Pinjari Akbar

MODULE 1 : BIG DATA — THE BIG PICTURE | Comparison Between Monolithic and Distributed Systems

Monolithic Systems

Jul 31, 2024

PySpark StructType and StructField

In PySpark, StructType and StructField are classes used for defining the schema of a DataFrame. They allow you to specify the…

Nov 11, 2023

Advanced SQL for Data Professionals

“Advanced SQL for Data Professionals” covers complex and performance-optimized SQL techniques, which are essential for handling large…

Mar 13

Future of IT: Predicting the Next Technological Boom by 2035

Predicting the IT boom after 10 years (by 2035) involves analyzing emerging trends and technologies that are expected to shape the…

Mar 13

Databricks Cost Optimization Methods

Databricks cost optimization involves managing resources effectively to reduce unnecessary costs while maximizing performance. Below are…

Dec 17, 2024

How to read an .xlsx file in PySpark?

To read an .xlsx file in PySpark, you can use libraries like pyspark-excel or the openpyxl library in combination with PySpark's DataFrame…

Dec 17, 2024

Creating PySpark and SQL tables dynamically without hardcoding

Creating SQL tables dynamically without hardcoding involves using scripts or templating mechanisms to generate the SQL code based on inputs…

Dec 15, 2024

Databricks Workflows in Real-Time Use with Spark

Databricks Workflows provide a robust orchestration tool to manage ETL pipelines, machine learning (ML) tasks, and data engineering…

Dec 15, 2024

What is Unity Catalog in Databricks?

Unity Catalog is a unified governance solution introduced by Databricks for data and AI on its Lakehouse Platform. It is designed to…

Dec 15, 2024

SQL Project Overview: Sales and Operations Data Warehouse

This project involves the design and implementation of a data warehouse for a retail company. The data warehouse is structured to support…

Aug 12, 2024

RANK vs DENSE_RANK in SQL

In SQL, both RANK() and DENSE_RANK() are window functions used to assign rankings to rows within a partition of data. However, they…

Aug 12, 2024

MODULE 1 : BIG DATA — THE BIG PICTURE | Database vs Data Warehouse vs Data Lake

Database vs Data Warehouse vs Data Lake

Aug 2, 2024

MODULE 1 : BIG DATA — THE BIG PICTURE | Introduction to Apache Spark

Introduction to Apache Spark

Aug 2, 2024

MODULE 1 : BIG DATA — THE BIG PICTURE | Types of Cloud Computing

Cloud computing can be classified into different types based on deployment models and service models. Understanding these classifications…

Aug 2, 2024

MODULE 1 : BIG DATA — THE BIG PICTURE | Advantages of Cloud Computing

Advantages of Cloud Computing

Aug 2, 2024

MODULE 1 : BIG DATA — THE BIG PICTURE | COMPARISON BETWEEN ON-PREMISE AND CLOUD

Comparison Between On-Premise and Cloud

Aug 2, 2024

MODULE 1 : BIG DATA — THE BIG PICTURE | Challenges with Hadoop

Challenges with Hadoop

Aug 2, 2024

MODULE 1 : BIG DATA — THE BIG PICTURE | Hadoop: Evolution, Overview, and Core Components

Evolution of Hadoop

Jul 31, 2024

MODULE 1 : BIG DATA — THE BIG PICTURE | INTRODUCTION TO BIG DATA

Introduction to Big Data

Jul 31, 2024

Slowly Changing Dimension (SCD) Type 2 operations on Delta tables in Spark

Implementing Slowly Changing Dimension (SCD) Type 2 operations with Delta tables in Apache Spark involves handling historical data changes…

Jul 27, 2024
